
The "alignment tax".


Exactly. Even this paper shows that model creativity drops significantly and that the models experience mode collapse, like we saw in GANs, but the companies keep using RLHF...

https://arxiv.org/abs/2406.05587


A nice talk about a researcher's experience/benchmarks with raw GPT-4, before and after RLHF:

https://www.youtube.com/watch?v=qbIk7-JPB2c


Yup, I remember that! Microsoft removed that part of the paper.



