androiddrew's comments

Not with the price of silicon being what it is.

Where are we at with the rat brain CPUs?

We keep losing people to the sewers... some in the organization are speculating they might be building a human-brain CPU to retaliate. Progress is slow.

s/people/cpus/

I love local-first. I'm finding that a 120B MoE hits the sweet spot for local hosting. Right now that takes a $2K Strix Halo, a $4K GB10 machine, or a $5K Mac Pro. Two years from now I think hardware will take us back to the ~$2K range with good performance.

I love my dual-GPU setup (2x AMD Radeon R9700, 64GB VRAM) but it costs 5x more in electricity than my GX10 (GB10 chip inside), and since layers are landing in system memory my TPS is half the GX10's.

Now, a dense model like Devstral 2 24B slaps on the dual-GPU setup. I just haven't gotten as much out of that as I have the 120B MoEs.


Get TurboQuant 4-bit implemented and this would be a game changer.
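For anyone unfamiliar with what a 4-bit scheme is refining: below is the simplest baseline, symmetric round-to-nearest quantization with a shared scale. This is NOT TurboQuant's algorithm, just a minimal sketch of the general idea that such methods improve on.

```python
def quantize_4bit(xs):
    """Symmetric round-to-nearest 4-bit quantization with a single scale.
    Returns (integer codes in [-8, 7], scale). Real schemes use
    per-group scales and smarter rounding to cut the error further."""
    scale = max(abs(x) for x in xs) / 7 or 1.0
    codes = [max(-8, min(7, round(x / scale))) for x in xs]
    return codes, scale

def dequantize(codes, scale):
    """Map the 4-bit codes back to approximate float weights."""
    return [c * scale for c in codes]

weights = [0.7, -0.35, 0.05, -0.7]
codes, scale = quantize_4bit(weights)
recovered = dequantize(codes, scale)
# Reconstruction error per weight is at most half a quantization step (scale / 2).
```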

Wish they had this for Zig.


Alternative headline: household spyware cash machine forced to pay $20 for being bad.

If you want to punish Meta then you have to punish the wonder boy who runs it. Not even shareholders can fight off the guy spending $80B on the metaverse.


You're not wrong, but the problem for Meta is that this fine, along with their other fine for mental harm, is setting a precedent.

This fine is somewhat larger, at $375 million, but the other one (https://www.msn.com/en-us/health/other/meta-and-youtube-fine...) basically opens the gates for millions of people to sue.

Sadly I don't think it's enough for Meta to change, because they have no business model if they're forced to be serious about online safety. That's probably also why they're pushing so hard for age verification: make safety someone else's problem.


The poster is probably hoping someone will post the archived version in the comments.


I heard a delightful term for building apps only for yourself: “houseplant programming.”


Could you share what you're using for inference and how you're running it? I have a 64GB VRAM / 128GB system RAM setup.


Most people are using something in the llama.cpp family for inference; llama-server is my go-to. The Unsloth guides describe how to configure inference for your model of choice.
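For anyone who hasn't run it before, a typical llama-server invocation looks something like the following. The model filename is a placeholder, and the flag values are illustrative; the Unsloth guides give per-model recommendations.

```shell
# Serve a GGUF model with llama-server (llama.cpp's OpenAI-compatible server).
# -ngl sets how many layers are offloaded to the GPU: on a 64GB VRAM box you
# lower it until the model fits, and the remaining layers land in system RAM.
llama-server -m ./my-120b-moe-Q4_K_M.gguf -ngl 99 -c 16384 --port 8080

# Then point any OpenAI-style client at http://localhost:8080/v1
```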


Good, because it’s fucking ridiculous that pharma gets special patent loopholes to maintain a monopoly beyond what the basic protection grants.


I don’t know why this keeps coming up. Has this been a big deal for everyone else? It’s an OK usability improvement, but the number of times I’ve read an article about this is silly.


I doubt this is about the asterisks at this point. It’s about Rust: rewriting working tools in Rust and showing that Rust is the way and the only way.

