Also, the notion that you can’t have ...

Jonathan LaCour

June 11, 2024

Also, the notion that you can’t get fantastic results from generative models trained only on content you have permission to use is ridiculous. OpenAI and Meta are bad actors, and they are being disingenuous. It’s *easier* to get good results by taking ethical shortcuts, but you can achieve amazing results without stealing.

Jun 11 2024 on cleverdevil.club

@cleverdevil This is 100% correct. Properly curated training data (which we have not yet seen) will yield dramatically better LLM results. Not more reliable, mind you, but it should avoid some of the creepy and dark stuff we have seen emerge. Curating requires humans and will be expensive.

fgtech, Jun 12 2024 on micro.blog

@fgtech One thing that I see becoming more common is large foundation models trained on open data sets that are mostly used to provide the fundamentals of written communication, combined with specialized models trained on smaller data sets for very specific use cases. This gives you the best of both worlds.

cleverdevil, Jun 12 2024 on micro.blog

@cleverdevil Sounds great! Ethically sourced, transparent data sources will be key to cleaning up these models.

fgtech, Jun 12 2024 on micro.blog

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.