Havoc · 2026-02-21T19:15:16 1771701316

Are people buying mac minis to run the models locally?

kylecazar · 2026-02-21T19:23:38 1771701818

They're buying Mac Minis to isolate the environment in which their agents operate. They consume little power and are good for long running tasks.

Most aren't running models locally. They're using Claude via OpenClaw.

It's part of the "personal agent running constantly" craze.

mystifyingpoi · 2026-02-21T19:28:31 1771702111

For a machine that must run 24/7 or at least most of the day, the next best alternative to a separate computer is a cheap Linux VPS. Most people don't want to fiddle with such a setup, so they go for Mac Minis. Even the lower spec ones are good enough, and they consume little power when idle.

botusaurus · 2026-02-21T21:01:55 1771707715

many websites block access from cloud ips - reason why openclaw creator recommended a local one

znnajdla · 2026-02-21T19:25:43 1771701943

No they’re buying them as a home server. You can’t message your claw if your laptop lid is closed.

Havoc · 2026-02-22T01:55:16 1771725316

A $100 minipc would do that just as well though? Mac minis are pricey if all you're doing is having it sit and process a couple of API calls now and again

Havoc · 2026-02-21T17:49:13 1771696153

Great moment to break into the market if you're willing to forfeit profits

Havoc · 2026-02-21T14:04:22 1771682662

And the US circus continues

Havoc · 2026-02-21T13:32:26 1771680746

It was pretty stacked by age even during the vote to leave.

Unfortunately the UK has a voting cohort that is both large and willing to screw over subsequent generations.

Havoc · 2026-02-20T11:15:07 1771586107

That seems promising for applications that require raw speed. Wonder how much they can scale it up - 8B model quantized is very usable but still quite small compared to even bottom end cloud models.

Havoc · 2026-02-19T23:49:12 1771544952

The amount of "It's not X it's Y" type commentary suggests to me that A) nobody knows and B) there is a solid chance this ends up being either all true or all false

Or put differently, we've managed to hype this to the moon, but somehow complete failure (see studies about zero impact on productivity) seems plausible. And similarly, "kills all jobs" seems plausible.

That's an insane amount of conflicting opinions being held in the air at the same time

pseudosavant · 2026-02-20T00:14:47 1771546487

This reminds me of the early days of the Internet. Lots of hype around something that was clearly globally transformational, but most people weren't benefiting hugely from it in the first few years.

It might have replaced sending a letter with an email. But now people get their groceries from it, hail rides, and even track their dogs or luggage with it.

Too many companies have been too focused on acting like AI 'features' have made their products better, when most of them haven't yet. I'm looking at Microsoft and Office especially. But tools like Claude Code, Codex CLI, and Github Copilot CLI have shown that LLMs can do incredible things in the right applications.

AntiDyatlov · 2026-02-20T19:49:41 1771616981

It's possible we actually never had good metrics on software productivity. That seems very difficult to measure. I definitely use AI at my job to work less, not to produce more, and Claude Code is the only thing that has enabled me to have side-projects (had never tried it before, I have no idea how there are people with a coding full time job that also have a coding side project(s)).

andrekandre · 2026-02-20T06:43:55 1771569835

  > zero impact on productivity

i'm sure someone somewhere will find the numbers (pull requests per week, closed tickets per sprint etc) to make it look otherwise...

cheema33 · 2026-02-20T00:13:02 1771546382

You appear to have said a lot. Without saying anything.

rester324 · 2026-02-20T10:46:16 1771584376

You appear to have written a lot. Without understanding anything.

Havoc · 2026-02-19T15:57:15 1771516635

I still can't believe anyone in the industry measures it like:

>from under 25 minutes to over 45 minutes.

If I get my raspberry pi to run an LLM task it'll run for over 6 hours. And groq will do it in 20 seconds.

It's a gibberish measurement in itself if you don't control for token speed (and quality of output).
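A minimal sketch of why wall-clock duration alone says little: the same amount of generated output can take hours or seconds depending on throughput. The throughput figures below are hypothetical, picked only so the two runs mirror the "6 hours" and "20 seconds" figures above; they are not measurements of any real device or provider.

```python
# Hypothetical throughput figures, chosen only to mirror the example above.
RUNS = {
    "raspberry_pi": {"duration_s": 6 * 3600, "tokens_per_s": 1.0},
    "fast_host": {"duration_s": 20, "tokens_per_s": 1080.0},
}


def tokens_of_work(duration_s: float, tokens_per_s: float) -> float:
    """Approximate tokens generated: wall-clock time times throughput."""
    return duration_s * tokens_per_s


def summarize(runs: dict) -> dict:
    """Map each run name to (duration in seconds, tokens of work)."""
    return {
        name: (r["duration_s"], tokens_of_work(r["duration_s"], r["tokens_per_s"]))
        for name, r in runs.items()
    }


if __name__ == "__main__":
    for name, (duration_s, tokens) in summarize(RUNS).items():
        print(f"{name}: {duration_s} s wall clock, {tokens:.0f} tokens")
```

Under these assumed numbers both runs produce the same token count, even though one lasts 1080 times longer. Token count still ignores output quality, so it is only a partial control.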

dcre · 2026-02-19T16:00:43 1771516843

Tokens per second are similar across Sonnet 4.5, Opus 4.5, and Opus 4.6. More importantly, normalizing for speed isn't enough anyway because smarter models can compensate for being slower by having to output fewer tokens to get the same result. The use of 99.9p duration is a considered choice on their part to get a holistic view across model, harness, task choice, user experience level, user trust, etc.
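For readers unfamiliar with a "99.9p" figure, here is a rough sketch of one way to compute it from a list of task durations. It uses the nearest-rank method, which is an assumption on my part; the thread does not say how the percentile was actually calculated. The function name and the per-mille parameter are illustrative.

```python
def nearest_rank(durations: list[float], permille: int) -> float:
    """Return the value at the given per-mille rank (999 means the 99.9th percentile).

    Nearest-rank method: sort the values and take the element at position
    ceil(n * permille / 1000), counting from 1. Integer arithmetic avoids
    floating-point rounding on the rank.
    """
    if not durations:
        raise ValueError("need at least one duration")
    if not 0 < permille <= 1000:
        raise ValueError("permille must be in 1..1000")
    ordered = sorted(durations)
    rank = -(-len(ordered) * permille // 1000)  # ceiling division
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic durations in minutes: 1, 2, ..., 1000.
    sample = [float(m) for m in range(1, 1001)]
    print(nearest_rank(sample, 999))
```

With small samples the 99.9th percentile is simply the maximum, so the statistic only becomes informative once there are well over a thousand runs.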

Havoc · 2026-02-19T23:41:32 1771544492

>Tokens per second are similar across Sonnet 4.5, Opus 4.5, and Opus 4.6.

This may come as a shock, but there are LLMs not authored by anthropic and when we do measurements we may want them to be comparable across providers

saezbaldo · 2026-02-19T17:45:39 1771523139

The bigger gap isn't time vs tokens. It's that these metrics measure capability without measuring authorization scope. An agent that completes a 45-minute task by making unauthorized API calls isn't more autonomous, it's more dangerous. The useful measurement would be: given explicit permission boundaries, how much can the agent accomplish within those constraints? That ratio of capability-within-constraints is a better proxy for production-ready autonomy than raw task duration.
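One way to make the capability-within-constraints ratio concrete, as a sketch only: count a task as a success only if the agent finished it and every action it took was inside an explicit allow-list. The `TaskRun` type, the action names, and the scoring rule are all hypothetical and not taken from any existing benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One attempted task (hypothetical record format)."""
    completed: bool
    actions: frozenset  # names of the actions the agent actually took


def within_scope_rate(runs: list, allowed: frozenset) -> float:
    """Fraction of tasks completed using only allowed actions.

    A run that finishes by stepping outside the allow-list counts as a
    failure, the same as a run that never finishes.
    """
    if not runs:
        return 0.0
    ok = sum(1 for r in runs if r.completed and r.actions <= allowed)
    return ok / len(runs)


if __name__ == "__main__":
    allowed = frozenset({"read_file", "run_tests"})
    runs = [
        TaskRun(True, frozenset({"read_file", "run_tests"})),    # in scope
        TaskRun(True, frozenset({"read_file", "call_billing"})),  # out of scope
        TaskRun(False, frozenset({"read_file"})),                 # did not finish
    ]
    print(within_scope_rate(runs, allowed))
```

Comparing this rate against the plain completion rate shows how much of an agent's apparent capability depends on exceeding its permissions.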

visarga · 2026-02-19T17:32:32 1771522352

I agree time is not what we are looking for, it is maximum complexity the model can handle without failing the task, expressed in task length. Long tasks allow some slack - if you make an error you have time to see the outcomes and recover.

Havoc · 2026-02-19T09:59:41 1771495181

Wish the spectrum was a bit more unified globally. Would love to move my IoT off 2.4ghz.

433mhz has restrictions in the UK, so 868mhz is the alternative...but an esp32 that supports that is quadruple the price of a 2.4ghz one and has an unwieldy antenna

Havoc · 2026-02-19T09:26:42 1771493202

>first model that has really broken into the anglosphere.

Do you know of a couple of interesting ones that haven't yet?

kristopolous · 2026-02-19T09:31:56 1771493516

doubao (bytedance) seed models are interesting

Keep your eye on Baidu's Ernie https://ernie.baidu.com/

Artificial analysis is generally on top of everything

https://artificialanalysis.ai/leaderboards/models

Those two are really the new players

Nanbeige which they haven't benchmarked just put out a shockingly good 3b model https://huggingface.co/Nanbeige - specifically https://huggingface.co/Nanbeige/Nanbeige4.1-3B

You have to tweak the hyperparameters like they say but I'm getting quality output, commensurate with maybe a 32b model, in exchange for a huge thinking lag

It's the new LFM 2.5

admiralrohan · 2026-02-19T12:38:25 1771504705

Never heard of Nanbeige, thanks for sharing. "Good" is subjective though, in which tasks can I use it and where to avoid?

kristopolous · 2026-02-19T12:40:46 1771504846

it's a 3b model. Fire it up. If you have ollama just do this:

    ollama create nanbeige-custom -f <(curl https://day50.dev/Nanbeige4.1-params.Modelfile)

That has the hyperparameters already in there. Then you can try it out

It's taking up like 2.5GB of ram.

my test query is always "compare rust and go with code samples". I'm telling you, the thinking token count is ... high...

Here's what I got https://day50.dev/rust_v_go.md

I just tried it on a 4gb raspberry pi and a 2012 era x230 with an i5-3210. Worked.

It'll take about 45 minutes on the pi which you know, isn't OOM...so there's that....

Havoc · 2026-02-19T13:05:08 1771506308

Thanks!

Havoc · 2026-02-18T20:56:56 1771448216

Can’t wait for trump and his gestapo to deport the entirety of nasa for telling the truth

declan_roberts · 2026-02-18T21:06:21 1771448781

Why does NASA even have to do this? Build some cool rockets and get us to mars.

Windchaser · 2026-02-18T21:49:59 1771451399

Among other objectives, NASA's 1958 mission statement includes conducting aeronautical and space activities of the US for "the expansion of human knowledge of phenomena in the atmosphere and space".

So: atmospheric climate science directly falls under NASA's responsibilities.

https://en.wikipedia.org/wiki/National_Aeronautics_and_Space...

retrac · 2026-02-18T21:20:03 1771449603

NASA launches and operates Earth-observing satellites for measuring the weather and climate.

SoftTalker · 2026-02-18T21:10:06 1771449006

Living on Mars long-term is a practical impossibility. Certainly much, much harder than living on even a climate-changed Earth.

charcircuit · 2026-02-18T21:58:05 1771451885

Humans have done a lot of things we once thought were impossible.

k33n · 2026-02-19T00:11:45 1771459905

I’ve been living on a climate-changed earth for my entire life and it’s not been too difficult.

johnea · 2026-02-19T23:28:33 1771543713

Ahh, the elon bois. This form of internet brain damage gives me a good LOL almost every day 8-)

My old comment on the subject:

"If you want to know what it'll be like living on Mars, bury a cargo container in your back yard, and live in it for a year."

If you run out of something, you'll certainly be closer to the grocery store.

This form of 12 year old pubescent boy SciFi fantasy, taking priority over the many actual problems that need to be solved in the world today, is a poster child for the phenomenon of "the idiot wealthy".

Not everyone participating in the phenomenon is part of the idiot wealthy; most of them aren't rich.

Fourier864 · 2026-02-19T02:30:37 1771468237

Almost everything? Most money for fundamental atmospheric research flows through NASA. People always forget that only half of NASA's budget is for rocketry and human space flight, and the other half is science.