NVIDIA is so far ahead of the competition that people really do not care about the poor API that ROCm offers. Intel is not really a player worth caring about on the GPU side, though they obviously still matter on the CPU end. All the other players matter even less.
I badly wanted a decent alternative to CUDA for quite a while, but even now, after leaving AI myself, it is clear the competition is nowhere near the standard NVIDIA has been setting for close to a decade.
We don't even have to get into all the issues consumers have had with AMD's GPUs.
I'm not deeply familiar with this space, but aren't training and inference on CPUs gaining some traction[1][2][3]? They're surely still orders of magnitude slower, but as CPUs get more powerful and ML frameworks get more efficient, making this cheaper and more accessible would be a major breakthrough.
CPU training was, and continues to be, roughly one-tenth the speed of GPU training, dollar for dollar. It's not even close.
The inescapable trend is that bigger models are better models. So if you're doing this professionally, you're going to need GPUs not only for training, but increasingly for inference in order to get decent latency.
On the other hand, it's likely the competition will catch up to Nvidia.
They were early and good. But the space is too lucrative, and the technology is not special enough, for them to keep their dominant position for long.
Nor does anything Khronos-related for compute APIs.
While Apple initially promoted OpenCL, Android never supported it as an official API; the official option there was RenderScript instead.
So on iOS Apple eventually moved to Metal, usable from Objective-C and Swift, while on Android Google has deprecated RenderScript (usable from Java and Kotlin), pushing everyone to learn Vulkan compute and do the integration in the NDK by themselves, picking up C, C++ and GLSL in the process (no big deal from their POV).
With such a great usability story, it's no wonder basically nobody cares. /s
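To give a sense of the bar being set here: even the most trivial Vulkan compute dispatch needs a hand-written GLSL shader along these lines (binding number and workgroup size below are just illustrative), on top of all the Vulkan instance/pipeline/descriptor boilerplate you have to write yourself in C or C++ on the NDK side.

```glsl
#version 450
// "Hello world" of Vulkan compute: double every element of a storage
// buffer. Workgroup size and binding index are illustrative choices.
layout(local_size_x = 64) in;

layout(set = 0, binding = 0) buffer Data {
    float values[];
};

void main() {
    uint i = gl_GlobalInvocationID.x;
    // Guard against the last workgroup running past the buffer end.
    if (i < values.length()) {
        values[i] = values[i] * 2.0;
    }
}
```

And this still has to be compiled to SPIR-V and wired up through pipeline and descriptor-set setup on the native side before a single dispatch runs; compare that to writing a RenderScript kernel callable straight from Java/Kotlin.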
> Nor does anything Khronos related for compute APIs
As you said below, Vulkan compute does work on mobile devices, and there are a bunch of higher-level libraries built on it (though, SYCL aside, not from Khronos).
I don't think Apple is the only mobile vendor. I think that between AMD, Intel, Qualcomm, MediaTek, Samsung and Google, they will realize that they are all being screwed over by Nvidia. Right now ML is still mostly research, but when real products start surfacing (or, perhaps, for real products to surface to begin with) they will have to settle on something, whether it's ROCm, oneAPI or something else.
You mean aside from their long history of developing open things, from WebKit (based on KDE's open KHTML and made into a full-blown browser engine) to Swift?