ekropotin · 2026-03-21T21:18:52 1774127932

VRAM vs. UM is not exactly an apples-to-apples comparison.

ekropotin · 2026-03-21T21:17:37 1774127857

I’m not very well versed in this domain, but I think it’s not going to be “VRAM” (GDDR) memory, but rather “unified memory”, which is essentially RAM (some flavour of DDR5, I assume). These two types of memory have vastly different bandwidth.

I’m pretty curious to see any benchmarks on inference on VRAM vs UM.

banana_giraffe · 2026-03-22T01:52:30 1774144350

A quick benchmark of float32 cuda->cuda copies in torch, comparing some random machines:

    Raptor Lake + 5080: 380.63 GB/s
    Raptor Lake (CPU for reference): 20.41 GB/s
    GB10 (DGX Spark): 116.14 GB/s
    GH200: 1697.39 GB/s

This is an "eh, it works" benchmark, but it should give you a feel for the relative performance of the different systems.

In practice, this means I can get something like 55 tokens a sec running a larger model like gpt-oss-120b-Q8_0 on the DGX Spark.
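A copy benchmark of this kind can be sketched roughly as follows. This is not the script used for the numbers above; it is a minimal stand-in that times host-memory copies with the standard library so it runs anywhere. The function names, buffer size, and the choice to count each copied byte once (rather than read plus write) are all assumptions.

```python
import time


def to_gbps(nbytes: int, seconds: float) -> float:
    """Convert bytes moved in `seconds` to GB/s (1 GB = 1e9 bytes)."""
    return nbytes / seconds / 1e9


def copy_bandwidth_gbps(nbytes: int = 64 * 1024 * 1024, iters: int = 5) -> float:
    """Time repeated buffer-to-buffer copies and return the best rate in GB/s.

    Assumption: each copied byte is counted once. Counting the read and the
    write separately would double the reported figure.
    """
    src = memoryview(bytearray(nbytes))
    dst = memoryview(bytearray(nbytes))
    best = float("inf")
    for _ in range(iters):
        start = time.perf_counter()
        dst[:] = src  # one full copy of the buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading from the clock on very small buffers.
    return to_gbps(nbytes, max(best, 1e-12))


if __name__ == "__main__":
    print(f"host copy: {copy_bandwidth_gbps():.2f} GB/s")
```

On a GPU the same shape would presumably apply with two float32 tensors on the `cuda` device, `dst.copy_(src)` as the copy, and a `torch.cuda.synchronize()` before each clock read so the asynchronous copy is actually finished when it is timed.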

ekropotin · 2026-03-22T02:20:56 1774146056

Nice! Thanks for that.

55 t/s is much better than I would have expected.

oofbey · 2026-03-21T21:28:43 1774128523

I’m using VRAM as shorthand for “memory which the AI chip can use”, which I think is fairly common shorthand these days. For the Spark it is unified, and has lower bandwidth than almost any modern GPU. (About 300 GB/s, which is comparable to an RTX 3060.)

So for an LLM, inference is relatively slow because of that bandwidth, but you can load much bigger, smarter models than you could on any consumer GPU.
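The bandwidth limit can be put into rough numbers. The sketch below assumes decoding is purely memory-bound: every generated token reads each active weight once, and KV cache traffic, compute time, and any overlap are ignored. The function names and the example model size are hypothetical, chosen only to illustrate the arithmetic.

```python
def weights_gb(active_params_billions: float, bytes_per_param: float) -> float:
    """Size in GB of the weights read per token (1e9 params x bytes = GB)."""
    return active_params_billions * bytes_per_param


def max_tokens_per_second(bandwidth_gb_s: float, read_per_token_gb: float) -> float:
    """Upper bound on decode speed if memory bandwidth is the only limit."""
    return bandwidth_gb_s / read_per_token_gb


if __name__ == "__main__":
    # Hypothetical dense model: 30B parameters at 2 bytes each = 60 GB per token.
    per_token = weights_gb(30, 2)
    # With roughly 300 GB/s of memory bandwidth, the ceiling is 5 tokens/s.
    print(max_tokens_per_second(300.0, per_token))
```

A mixture-of-experts model only reads its active experts per token, so `active_params_billions` can be far smaller than the total parameter count, which is one way a large model can still decode at a usable rate on modest bandwidth.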

ekropotin · 2026-03-21T21:03:20 1774127000

IDK, I feel it’s quite overpriced, even with the current component prices.

I’m almost sure it’s possible to custom-build a machine as powerful as their red v2 within a 9k budget. And have a lot of fun along the way.

lostmsu · 2026-03-21T21:06:07 1774127167

AMD now has the 32 GiB Radeon AI Pro 9700. Four of these (just under 2k each) would put you at 128 GiB of VRAM.

ekropotin · 2026-03-21T21:21:18 1774128078

VRAM is not everything: GPU cores also matter (a lot) for inference.

lostmsu · 2026-03-21T21:29:07 1774128547

4x Radeon will have significantly more GPU power than, say, a Mac Studio or DGX Spark.

cyanydeez · 2026-03-21T22:44:43 1774133083

Inference speed is like monitor Hz: sure, you go from 60 to 120 Hz and that's noticeable, but unless your model is AGI, at some point you're just generating more code than you'll ever realistically be able to control, audit and rely on.

So context is probably worth more per dollar for programming than inference speed.

ekropotin · 2026-03-20T15:17:26 1774019846

How do you know that? Scientists tried to measure Chuck Norris’ age. The number refused to exist.

ekropotin · 2026-03-20T15:13:23 1774019603

Clickbait. He is not dead, he just decided to retire from the world of mortals.

ekropotin · 2026-03-20T00:45:39 1773967539

Gemini has a similar bug, https://github.com/google-gemini/gemini-cli/issues/1028, which essentially made this tool absolutely unusable for me.

Never had this problem with Claude tho. Must be something environment-specific.

ekropotin · 2026-03-18T20:03:25 1773864205

So basically tmuxinator?

ekropotin · 2026-03-18T14:44:38 1773845078

IDK if it can be applied in all situations.

Sometimes, especially when it comes to distributed systems, going from a working solution to a fast working solution requires a full-blown redesign from scratch.

ekropotin · 2026-03-16T16:25:09 1773678309

Let me guess - another article about how CLIs are superior to MCP?

kayig · 2026-03-16T22:22:06 1773699726

I know

ekropotin · 2026-03-15T00:34:41 1773534881

The first link looks very suspicious

itintheory · 2026-03-15T00:51:01 1773535861

Appears to be where the actual link, http://partnerportal.anthropic.com/s/partner-registration, redirects. Site.com is some Salesforce-related domain.

OJFord · 2026-03-15T10:19:08 1773569948

Huh, so you got http; I'm now getting linked to: https://partnerportal.anthropic.com/s/partner-registration

Which Firefox warns me has an untrusted cert.

cyanydeez · 2026-03-15T17:13:45 1773594825

Classic vibe coding. Everyone involved in AI has blinders on when it comes to their own dogfood.

nerdsniper · 2026-03-15T03:18:46 1773544726

Yes, that’s why I linked where I found it. Anyone suspicious can click through to it from the anthropic.com page. It’s the correct link though.