In multimodal, yes, but Opus is definitely edging it out in Text/Reasoning and Agentic benchmarks.
I think the general skepticism is because they are late to the race, and they are releasing an Opus-4.6-equivalent model now, when Anthropic is teasing Mythos.
They actually need it because consumer demand is higher than expected. And since every big corporation is trying to capture that market too, they need a moat: the most compute and energy they can get.
Also, businesses are where the money is, not regular consumers (especially tech-savvy folk who run models locally).
For me it's the opposite. I am a bit surprised that inflation halved buying power since 2000.
In my mind those levels of growth usually come from the stock market or house appreciation, but I guess those are much faster (I seem to recall the stock market doubling every 8 years and housing being a bit slower).
It would be useful if you explained how you calculated it. I mean, if you just applied a decaying exponential function, anyone can do that on their calculator.
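For anyone who wants to compare those doubling times, here is a rough sketch of the arithmetic. It assumes steady compounding, that "halved buying power since 2000" means prices doubled over about 26 years (2000 to 2026), and it takes the 8-year stock market figure from memory as stated above. The function names are just for illustration.

```python
import math


def annual_rate_from_doubling(years_to_double: float) -> float:
    """Constant annual growth rate that doubles a value in the given number of years."""
    return 2 ** (1 / years_to_double) - 1


def doubling_time(annual_rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


# Assumed inputs: prices doubling over ~26 years, stocks doubling every ~8 years.
inflation_rate = annual_rate_from_doubling(26)
stock_rate = annual_rate_from_doubling(8)

print(f"Implied inflation: {inflation_rate:.2%} per year")
print(f"Implied stock growth: {stock_rate:.2%} per year")
```

Under those assumptions inflation works out to a bit under 3% a year and the stock market to roughly 9% a year, so the gap in doubling time is large even though the yearly rates look close.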
SimCity, Starcraft, etc. taught me the value of saving up. But after some decades in the real world, I think computer games should simulate inflation so youth can get some practice at this!
I think the more interesting thing to chew on is how this will look over the next 25 years. The numbers will be... huge. A 75k salary in 2000 is similar to a 150k salary in 2026; project that out to 2050...
Something feels like it'll give out, but I've felt that way for 8 years at this point and I haven't been correct.
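A back-of-the-envelope version of that projection, assuming the 2000 to 2026 rate simply continues unchanged to 2050 (a big assumption, and nothing more than extrapolation). The helper name is made up for this sketch.

```python
def project_salary(start_salary: float, end_salary: float,
                   start_year: int, end_year: int, target_year: int) -> float:
    """Extrapolate a salary to target_year using the constant annual
    growth rate implied by the two known data points."""
    years_known = end_year - start_year
    annual_growth = (end_salary / start_salary) ** (1 / years_known)
    return end_salary * annual_growth ** (target_year - end_year)


# Figures from the comment: 75k in 2000 is similar to 150k in 2026.
salary_2050 = project_salary(75_000, 150_000, 2000, 2026, 2050)
print(f"Equivalent salary in 2050: about {salary_2050:,.0f}")
```

With those inputs the equivalent 2050 salary lands somewhere in the 280k range, not quite another doubling because 2050 is only 24 years out instead of 26.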
Well the important thing is they have a lot more data of people actually using their models. They have read billions more lines of private repos and implemented millions of patches, all of which is feeding into the newer models.
More importantly, it understands what behaviour people tend to appreciate and what changes are more likely to get approved. This real-world usage data is invaluable.
Exactly. As Claude increases in popularity, their available training data also increases. I'd guess Anthropic has the most expansive SWE training data as of now, or is close to it. Considering how quickly Claude is spreading, I expect their lead to grow quickly.
I am not smart with stock legalese, but I'm pasting something I found in a different article here.
> To balance index integrity and investability, Nasdaq proposes a new approach for including and weighting low-float securities (those below 20% free float). Each low-float security’s weight will be adjusted to five times its free float percentage, capped at 100%. Securities with more than 20% free float will continue to be weighted at full, eligible listed market capitalization, while those below 20% free float will be weighted proportionally to preserve investability.
> The rule reportedly includes a 5x float multiplier for low-float stocks, which would require passive vehicles to treat SpaceX as if it had significantly more tradable shares than actually exist, essentially forcing funds to chase the price.
It sounds to me like a way to increase demand for low-float stocks by treating the float as higher than it actually is. Glad to hear explanations about this.
You are not supposed to worry about the mapping. You trust the website to help decode it. You just remember the sentence. It's a little like what3words for coordinates.
The rationale being that you are more likely to remember a grammatical, cogent sentence than a random string of alphanumeric characters. Although I will agree that the generated sentences don't seem easy to remember. So I doubt its utility.
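Here is how I read the first quoted passage, as a small sketch. This is only my interpretation of that wording (weight at five times free float, capped at 100%, full weight above 20%), not Nasdaq's actual methodology, and the function name and example numbers are invented for illustration.

```python
def index_weight_factor(free_float: float) -> float:
    """Fraction of listed market cap counted for index weighting.

    free_float is a fraction between 0 and 1 (0.05 means 5% free float).
    Interpretation of the quoted proposal: five times the free float,
    capped at 100%, which gives full weight from 20% free float upward.
    """
    if not 0.0 <= free_float <= 1.0:
        raise ValueError("free_float must be between 0 and 1")
    return min(1.0, 5 * free_float)


# Hypothetical examples: 5%, 10%, 20%, and 60% free float.
for ff in (0.05, 0.10, 0.20, 0.60):
    print(f"free float {ff:.0%} -> weighted at {index_weight_factor(ff):.0%} of market cap")
```

Read this way, a stock with a 5% float would be weighted as if 25% of its market cap were in the index, which is where the "chasing the price" concern in the second quote comes from.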
I played the demo, but it definitely took me a minute to grok the rules.
I don't know if this is how we want to measure AGI.
In general, I believe we should probably stop this pursuit of human-equivalent intelligence, which encourages people to think of these models as human replacements. LLMs are clearly good at a lot of things; let's focus on how we can augment and empower the existing workforce.
Also, let's see if we can get the power and compute requirements brought down. Having to spin up a gigawatt power plant to achieve the same intelligence we humans power with sandwiches is a futile approach, imho.
I don't like that I need to log in to my FB/Instagram account to access this.