
Yes, our pairwise method is based entirely on 2AFC (two-alternative forced choice) comparisons, for both intra-query and inter-query Elo calculations.

It's definitely the best, if not the only, way to get extremely high signal and a score assignment that actually converges the more you sample.

In terms of the "F" in 2AFC, we actually have this amusing snippet from our prompt:

> Do NOT output a score of 0.0, ensure to focus on which document is superior, and provide a negative or positive float between -1.0 and 1.0.
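
For illustration, here is a minimal sketch of how a signed forced-choice score like that could feed a standard Elo update. The function names, the K-factor of 32, the convention that a positive score means the first document wins, and the choice to use only the sign of the score are all assumptions, not the actual pipeline described above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def forced_choice_outcome(score: float) -> float:
    """Map a signed judge score in [-1.0, 1.0] to a win (1.0) or loss (0.0) for A.

    Assumption: positive means document A is superior, negative means B.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be in [-1.0, 1.0]")
    if score == 0.0:
        # The "F" in 2AFC: a tie is not an acceptable answer.
        raise ValueError("0.0 is not allowed; the judge must pick a side")
    return 1.0 if score > 0 else 0.0


def elo_update(rating_a: float, rating_b: float, score: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison."""
    outcome = forced_choice_outcome(score)
    delta = k * (outcome - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

With two documents both starting at 1500, `elo_update(1500.0, 1500.0, 0.7)` moves them to 1516 and 1484, while a score of exactly 0.0 raises an error instead of silently recording a draw.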



Nice. I use an epoch to prevent stalemates, but this might be better.



