For a machine that must run 24/7 or at least most of the day, the next best alternative to a separate computer is a cheap Linux VPS. Most people don't want to fiddle with such a setup, so they go for Mac Minis. Even the lower-spec ones are good enough, and they consume little power when idle.
That seems promising for applications that require raw speed. I wonder how much they can scale it up - a quantized 8B model is very usable but still quite small compared to even bottom-end cloud models.
The amount of "It's not X it's Y" type commentary suggests to me that A) nobody knows and B) there is a solid chance this ends up being either all true or all false.
Or put differently, we've managed to hype this to the moon, but somehow complete failure (see studies about zero impact on productivity) seems plausible. And similarly, "kills all jobs" seems plausible.
That's an insane amount of conflicting opinions being held in the air at the same time.
This reminds me of the early days of the Internet. Lots of hype around something that was clearly globally transformational, but most people weren't benefiting hugely from it in the first few years.
It might have replaced sending a letter with an email. But now people get their groceries from it, hail rides, and even track their dogs or luggage with it.
Too many companies have been too focused on acting like AI 'features' have made their products better, when most of them haven't yet. I'm looking at Microsoft and Office especially. But tools like Claude Code, Codex CLI, and GitHub Copilot CLI have shown that LLMs can do incredible things in the right applications.
It's possible we actually never had good metrics on software productivity. That seems very difficult to measure. I definitely use AI at my job to work less, not to produce more, and Claude Code is the only thing that has enabled me to have side projects (had never tried it before, I have no idea how there are people with a full-time coding job who also have coding side projects).
Tokens per second are similar across Sonnet 4.5, Opus 4.5, and Opus 4.6. More importantly, normalizing for speed isn't enough anyway because smarter models can compensate for being slower by having to output fewer tokens to get the same result. The use of 99.9p duration is a considered choice on their part to get a holistic view across model, harness, task choice, user experience level, user trust, etc.
The bigger gap isn't time vs tokens. It's that these metrics measure capability without measuring authorization scope. An agent that completes a 45-minute task by making unauthorized API calls isn't more autonomous, it's more dangerous. The useful measurement would be: given explicit permission boundaries, how much can the agent accomplish within those constraints? That ratio of capability-within-constraints is a better proxy for production-ready autonomy than raw task duration.
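To make that ratio concrete, here is a rough sketch of how it could be scored. Everything in it is an assumption for illustration: the `Step` record, the action names, and the choice to count only steps that advanced the task are all hypothetical, not something any existing benchmark defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One logged agent action (hypothetical log format)."""
    action: str          # e.g. "read_file", "call_api"
    advances_task: bool  # did this step make progress on the task?


def capability_within_constraints(steps, allowed_actions):
    """Fraction of progress-making steps that stayed inside the permission boundary."""
    useful = [s for s in steps if s.advances_task]
    if not useful:
        return 0.0
    authorized = [s for s in useful if s.action in allowed_actions]
    return len(authorized) / len(useful)


# Made-up run: four steps advanced the task, two of them via an unauthorized API call.
run = [
    Step("read_file", True),
    Step("call_api", True),
    Step("read_file", False),
    Step("write_file", True),
    Step("call_api", True),
]
allowed = {"read_file", "write_file"}

score = capability_within_constraints(run, allowed)
print(score)  # 0.5
```

An agent that finishes the task but scores 0.5 here got half of its progress from actions it was never permitted to take, which raw task duration would not show.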
I agree time is not what we are looking for; it is the maximum complexity the model can handle without failing the task, expressed in task length. Long tasks allow some slack - if you make an error, you have time to see the outcomes and recover.
Wish the spectrum was a bit more unified globally. Would love to move my IoT off 2.4 GHz.
433 MHz has restrictions in the UK, so 868 MHz is the alternative...but an ESP32 that supports that is quadruple the price of a 2.4 GHz one and has an unwieldy antenna.
You have to tweak the hyperparameters like they say, but I'm getting quality output, commensurate with maybe a 32B model, in exchange for a huge thinking lag.
Among other objectives, NASA's 1958 mission statement includes conducting aeronautical and space activities of the US for "the expansion of human knowledge of phenomena in the atmosphere and space".
So: atmospheric climate science directly falls under NASA's responsibilities.
Ahh, the elon bois. This form of internet brain damage gives me a good LOL almost every day 8-)
My old comment on the subject:
"If you want to know what it'll be like living on Mars, bury a cargo container in your back yard, and live in it for a year."
If you run out of something, you'll certainly be closer to the grocery store.
This form of 12-year-old pubescent boy SciFi fantasy, taking priority over the many actual problems that need to be solved in the world today, is a poster child for the phenomenon of "the idiot wealthy".
Not everyone participating in the phenomenon is part of the idiot wealthy; most of them aren't rich.
Almost everything? Most money for fundamental atmospheric research flows through NASA. People always forget that only half of NASA's budget is for rocketry and human space flight, and the other half is science.