To add to the discussion, from a practical perspective: AMD hardware totally sucks and has yet to get a proper flash-attention-2 implementation. ROCm is slowly moving toward usable, but it's not close to being even comparable with CUDA.


Why is it so hard to port FA2 to the MI300 Instinct?




