
This is not true of goal-directed agents, and all RLHF models are trained with, ahem, RL; see "Optimal Policies Tend to Seek Power" (NeurIPS 2021).

Being very powerful is a useful instrumental goal for almost any final objective.
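For intuition, here's a minimal toy sketch, not the paper's actual POWER formalism: the tiny MDP, the uniform reward distribution, and the value-iteration helper below are all invented for illustration. It averages each state's optimal discounted value over many randomly drawn reward functions; the "hub" state that keeps more options open scores higher, which is the basic intuition behind power-seeking.

    # Toy illustration (not the paper's exact definition): estimate how
    # "powerful" each state is by averaging its optimal value over random
    # reward functions. Hypothetical 4-state MDP: state 0 is a hub with
    # three successors, states 1-3 are absorbing.
    import numpy as np

    transitions = {0: [1, 2, 3], 1: [1], 2: [2], 3: [3]}
    gamma, n_states = 0.9, 4

    def optimal_values(reward, iters=500):
        # Value iteration: V*(s) = r(s) + gamma * max over successor values.
        v = np.zeros(n_states)
        for _ in range(iters):
            v = np.array([reward[s] + gamma * max(v[s2] for s2 in transitions[s])
                          for s in range(n_states)])
        return v

    rng = np.random.default_rng(0)
    samples = [optimal_values(rng.uniform(size=n_states)) for _ in range(200)]
    avg_v = np.mean(samples, axis=0)  # hub state 0 averages highest
    print({s: round(float(avg_v[s]), 2) for s in range(n_states)})

Under these assumptions the hub state comes out on top for the average reward function simply because it can still reach whichever absorbing state happens to be rewarded, i.e. keeping options open is instrumentally valuable.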


