
Something like OpenAI's Triton as a CUDA bypass? Or PyTorch/TensorFlow targeting ROCm without user intervention.

Or there's OpenMP or HIP. In extremis, OpenCL.

I think the language stack is fine at this point. The moat isn't in CUDA the tech. It's in code running reliably on Nvidia's stack, without things like a stray pointer forcing a machine reboot. Hard to know how far off a robust ROCm is right now.
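
To make the "without user intervention" bit concrete, here's a rough sketch of what that looks like from the PyTorch side. As far as I know, ROCm builds of PyTorch reuse the "cuda" device name and set torch.version.hip, so the same user code runs on either vendor. The helper name pick_device and the backend labels are made up for illustration, and the ImportError fallback is only there so the snippet runs on a machine without torch.

```python
# Sketch only: pick_device and the backend labels are hypothetical names.
def pick_device():
    """Return (device_name, backend_label) with no vendor-specific branching in user code."""
    try:
        import torch
    except ImportError:
        # Assumption: degrade gracefully when PyTorch is not installed.
        return "cpu", "no torch installed"
    if torch.cuda.is_available():
        # ROCm builds expose the GPU under the "cuda" device name;
        # torch.version.hip is a version string there and None on CUDA builds.
        backend = "rocm/hip" if getattr(torch.version, "hip", None) else "cuda"
        return "cuda", backend
    return "cpu", "cpu only"


device, backend = pick_device()
print(f"device={device} backend={backend}")
```

The point is that the caller only ever asks for "cuda" or "cpu"; whether that lands on Nvidia or AMD is the build's problem, not the user's. It says nothing about how reliably the resulting code runs, which is the actual moat.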


