At a certain point though, models become good enough for particular tasks. Once ...

londons_explore · on July 11, 2023

I think we're far from that point though. For the vast majority of use cases, I always wish that the answers could be more accurate.

Sure - they might be 'good enough' to build a business on. But if a competitor builds their business on top of a more accurate model, their product will work better, and they will win the market.

unshavedyak · on July 11, 2023

Yeah, but the bench being discussed here is FOSS, which for me, and many others, translates to: can I run something useful in my closet or on my phone? I've found LLaMA neat and yeah, some FOSS models are getting decent - but they're a far cry from GPT4. I pay for GPT4, use it almost daily, and that's my bench.

Yes, when I can run GPT4 in my closet, OpenAI will have GPT7 or w/e - but it doesn't change the fact that I'll have something useful running in my closed network, and that opens up all kinds of integrations with data that I'm unwilling to ship to OpenAI. On that day I'll probably still use GPT7, but I'll _also_ have GPT4 running in my closet and integrating with a ton of things on my local network.

mvkel · on July 12, 2023

My guess is you'll be running a GPT4 equivalent in your closet, but with a 4K context window.

Meanwhile, the big guys will have GPT-who-cares-what-version with a 100K context window.

Context size is as much of a big deal as newer generations of models imo.

mptest · on July 12, 2023

Am I right in my layman's understanding that scaling up context windows (mainly) requires much more compute at run time? Or do longer-context models require different/longer training?

kmstout · on July 11, 2023

> their product will work better, and they will win the market

Like Betamax?

thelittleone · on July 11, 2023

One important milestone is a model that is good enough to produce an acceptable quality of answer to x% of public users' questions without any data being sent to the megacorps.