The one thing I don't get is that there are a lot of machines out there that would gain a lot from specialized search hardware (think Prolog acceleration engines, but lower level). For a start, every database server (SQL or NoSQL) would benefit.
It is also hardware that is similar to ML accelerators: it needs better integer and boolean support (algebra, not branching) and a stronger focus on memory access (which ML acceleration also needs, but gains less from). So how come nobody even speaks about this?
You would need large memory bandwidth and a good set of cache pre-population heuristics (putting the logic directly on the memory is one way to get the bandwidth).
ML would benefit from both too, as would highly complex graphics and physics simulation. The cache pre-population is probably at odds with low-latency graphics.
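To make the workload concrete, here is a rough sketch in plain C (hypothetical names like Bucket and probe_batch, with __builtin_prefetch standing in for the pre-population heuristics) of the kind of inner loop such hardware would offload: a batched hash-index probe. It is almost entirely integer hashing, random memory reads, and branchless compares, so it is bound by memory latency and bandwidth rather than FP compute, which is the point above.

    /* Sketch only: a batched hash-index probe, the kind of loop a
     * search accelerator would target. Hypothetical names throughout. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N_BUCKETS (1u << 20)   /* power of two so we can mask instead of mod */
    #define BATCH     16           /* probes kept in flight per batch */

    typedef struct {
        uint64_t key;
        uint64_t payload;
    } Bucket;

    static inline uint64_t hash64(uint64_t x) {      /* cheap integer mix */
        x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
        x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
        return x ^ (x >> 33);
    }

    /* First issue prefetches for every bucket the batch will touch, then do
     * the compares once the cache lines are (hopefully) on their way.  This
     * is a software stand-in for hardware cache pre-population. */
    static uint64_t probe_batch(const Bucket *table, const uint64_t *keys, int n) {
        uint32_t idx[BATCH];
        for (int i = 0; i < n; i++) {
            idx[i] = (uint32_t)(hash64(keys[i]) & (N_BUCKETS - 1));
            __builtin_prefetch(&table[idx[i]], 0, 1);
        }
        uint64_t hits = 0;
        for (int i = 0; i < n; i++)
            hits += (table[idx[i]].key == keys[i]);  /* integer compare, no branch needed */
        return hits;
    }

    int main(void) {
        Bucket *table = calloc(N_BUCKETS, sizeof *table);
        if (!table) return 1;
        /* Fill with even keys; hash collisions just overwrite, fine for a sketch. */
        for (uint64_t k = 0; k < 2 * N_BUCKETS; k += 2) {
            uint32_t i = (uint32_t)(hash64(k) & (N_BUCKETS - 1));
            table[i].key = k;
            table[i].payload = k * 3;
        }
        uint64_t keys[BATCH], hits = 0;
        for (uint64_t base = 0; base < (1u << 22); base += BATCH) {
            for (int i = 0; i < BATCH; i++) keys[i] = base + i;
            hits += probe_batch(table, keys, BATCH);
        }
        printf("hits: %llu\n", (unsigned long long)hits);
        free(table);
        return 0;
    }

Keeping many independent probes in flight at once to hide the miss latency is exactly the trick dedicated search hardware could bake in, instead of relying on software prefetch hints like this.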