I imagine these kinds of schemes could be implemented as a sort of on-device eBPF filter (in layman's terms, CUDA but for storage). It would allow deeper integration with the system, for example hardware-accelerated/integrated LVM (obviously the speedup would depend on the use case: less of a win for thin volumes, more for RAID, and so on). Or, from the other side, deeper integration with filesystems such as ZFS, btrfs, and bcachefs.
We tried to standardize exactly this - eBPF programs offloaded onto the device. The NVMe standard now has a lot of infrastructure for this standardized, including commands to discover device memory topology, transfer to/from that memory, and discover and upload programs. But one of the blockers is that eBPF isn't itself standardized. The other blockers are vendors ready and willing to build these devices and customers ready to buy them in volume. The extra compute ability will introduce some extra cost.
> The NVMe standard now has a lot of infrastructure for this standardized, including commands to discover device memory topology, transfer to/from that memory, and discover and upload programs.
On the other hand, Windows and Linux still cannot just upgrade the firmware on the vast majority of NVMe devices, least of all consumer ones, despite the update mechanism being completely and utterly standardized.
I think I remember upgrading the NVMe disk firmware on a work Dell laptop (Dell Latitude 7390) from 2019 using fwupd some years ago (not more than 3 years ago).
I also think I remember fixing (upgrading?) the firmware on a Crucial SSD 5 or 6 years ago using a live Linux system (downloaded from the Crucial website, I think?).
Not sure about Windows, but Linux is getting much better at this.
The eBPF programs are strictly bounded. And they're scoped to their own memory that you have to pre-load from the actual storage with separate commands issued from the CPU (presumably from the kernel driver which is doing access control checks). It's no different than uploading a shader to a GPU. You can burn resources but that's about the extent of the damage you can cause.
I wouldn't want random applications (or web pages) to be able to load eBPF modules in the same way they can send shaders to a GPU through a graphics driver.
What I don't get is that RAID5 parity is a simple XOR. It should be a trivial operation, and equally trivial to accelerate in hardware.
What puzzles me most is how bad parity (i.e. RAID5) performance is in Windows Storage Spaces. A modern CPU should be able to XOR data at several gigabytes per second. Yet even with optimized block sizes, Storage Spaces parity seems to cap at a couple hundred MB/s.
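For anyone who wants to see how small the parity math actually is, here is a minimal Python sketch of single-parity (RAID5-style) XOR over one stripe. The function names (`xor_blocks`, `make_parity`, `rebuild_missing`) and the toy 4-byte blocks are made up for illustration; this is not how Storage Spaces or any real RAID implementation is structured, just the arithmetic.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # Treat each block as one big integer so the XOR runs over the whole
    # block at once instead of looping byte by byte in Python.
    acc = reduce(lambda x, y: x ^ y,
                 (int.from_bytes(b, "little") for b in blocks))
    return acc.to_bytes(length, "little")


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe of three data blocks (toy sizes, illustration only).
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Pretend the disk holding stripe[1] died; rebuild it from the rest.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
assert recovered == stripe[1]
```

The sketch only covers computing parity and rebuilding one lost block. It says nothing about where a real implementation spends its time, so it doesn't explain the Storage Spaces numbers by itself.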