
Manual testing on complex documents, a big legal contract for example. An issue can be referred to in 7 different places in a 100-page document. Does the model give a coherent answer that accounts for all of them?

A handful of examples is enough to show whether a model can do it. GPT-4 Turbo, for example, is downright awful at something like that.


