Hacker News
Hardware LLM at 16K Tokens/s (taalas.com)
2 points by gcollard- 55 days ago | 1 comment


Testing this hardware LLM (Llama 3.1 8B on a chip), I get ~16k tokens per second.

With frontier models plateauing, I’ve become convinced that AI will end up like Bitcoin mining, with NVIDIA’s general-purpose GPUs replaced by model-specific chips.

Glad to see someone innovating in this space.



