





  • LLM performance is approaching a plateau because easily available training data is running out. OpenAI is going around trying to license some data, but it won’t be enough.

    The company with the most touch points with users is best positioned to turn them into data probes. Msft has Windows, Apple has iOS and Google… Well, Google is fucked because the other two have OS-level access and can restrict what Google collects.

    Now that LLM foundation models are out, the game will be “who can get the most data” to retrain, optimise and ultimately monetise these models. And there’s a whole other “can of worms” around the legality of training models on unlicensed data collected through “system snapshots”. E.g., collecting NY Times data through Windows snapshots of users who visit the site.