
> Sure, but what if a normal C code needs more bandwidth?

https://www.anandtech.com/show/17024/apple-m1-max-performanc...



To put things in context, the 3-thread memory bandwidth measured there on the M1 Max is approximately equal to the maximum theoretical memory bandwidth available across all channels and chiplets of a current Ryzen Threadripper PRO. If the M1 Ultra doubles the achievable memory bandwidth by virtue of having twice as many CPU clusters, then an 8-thread test should still be able to match AMD's latest EPYC processors with 12-channel DDR5.
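
For reference, the theoretical peaks on the AMD side are simple arithmetic: channels times transfer rate times bytes per transfer. A rough sketch, assuming 8-channel DDR4-3200 for the Threadripper PRO and DDR5-4800 for the 12-channel EPYC (those memory speeds are my assumptions, and the function name is just for illustration):

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits // 8  # 64-bit channel -> 8 bytes
    # MT/s * bytes = MB/s; divide by 1000 to get GB/s
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Assumed configurations, not taken from the linked article:
threadripper_pro = peak_bandwidth_gb_s(channels=8, mega_transfers_per_s=3200)
epyc_12ch_ddr5 = peak_bandwidth_gb_s(channels=12, mega_transfers_per_s=4800)

print(f"Threadripper PRO, 8ch DDR4-3200: {threadripper_pro:.1f} GB/s")
print(f"EPYC, 12ch DDR5-4800: {epyc_12ch_ddr5:.1f} GB/s")
```

These are upper bounds that no real workload reaches, so they are only meant for comparison against the measured M1 Max figures in the linked article.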

(I don't know how well AMD's current processors utilize the socket's full DRAM bandwidth from a limited number of chiplets, but I wouldn't be surprised if it's a more severe limitation than what M1 Max/Ultra show with their CPU clusters. It looks like only the 12-chiplet EPYC processors actually use all the links from the IO die to the CPU chiplets.)

So the inability to use all the DRAM bandwidth from the CPU cores, while perhaps disappointing, isn't exactly a weakness for Apple's processors compared to the competition.


I believe the chiplet links are pretty generous; they handle L3 <-> chiplet traffic and chiplet <-> chiplet traffic. On the smaller configs they use 2 links per chiplet.

Things like McCalpin's STREAM do not seem to show much difference across EPYCs with different chiplet counts, although I've not personally tested the newest Genoa chips.



