
The answer is that crawling the whole internet is only needed for training a base model, which is expensive and compute-intensive.

For R1, they didn’t train a base model; they performed additional training steps on top of a previously trained base model (V3). These guys are doing something similar.


