
No, an autoregressive language model is conditioned on all prior states, not the previous one.


Multiply out the states: the tuple of "all prior states" collapses into a single "previous one". Easy to model as a Markov chain.
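A minimal sketch of the point, assuming a toy two-token vocabulary and a bounded context: take the *state* to be the entire history tuple, and the process is Markov in that composite state.

```python
from itertools import product

# Assumed toy setup: tiny vocabulary, bounded history length.
vocab = ["a", "b"]
context_len = 3

# One Markov state per possible prior sequence (including the empty one).
states = [tuple(p) for n in range(context_len + 1)
          for p in product(vocab, repeat=n)]

def step(state, token):
    # The transition just appends the token: the next composite state
    # again encodes the full history.
    return state + (token,)

s = ()
for tok in ["a", "b", "a"]:
    s = step(s, tok)
assert s == ("a", "b", "a")  # the single current state is the whole history
```

The construction is trivial, which is exactly the point of the replies below: it buys nothing, because the state space is the set of all possible contexts.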


Also 'easy' to model as a lookup table containing all possible solutions.


This is technically true, but the resulting Markov chain would be far too large to store, even with petabytes of storage.
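A back-of-the-envelope check, with assumed (illustrative) figures of a 50,000-token vocabulary and a 2,048-token context: the explicit chain needs one state per possible context.

```python
# Assumed figures for illustration only.
vocab_size = 50_000
context_len = 2_048

# One Markov state per possible context window.
num_states = vocab_size ** context_len
digits = len(str(num_states))  # a number with thousands of decimal digits
```

A petabyte is about 10^15 bytes; the state count here has roughly 9,600 digits, so "too big to store" is a dramatic understatement.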


Indeed. The argument boils down to: since the state space is finite, I can turn the model into an FSA. Not only is that unhelpful, it also doesn't tell you how to construct the automaton, i.e. it says nothing about the learning process.



