The "alignment tax".

behnamoh · 2025-12-01T21:18:15 1764623895

Exactly. Even this paper shows that model creativity drops significantly and that the models undergo mode collapse, like we saw with GANs, yet companies keep using RLHF...

https://arxiv.org/abs/2406.05587

nomel · 2025-12-01T21:28:42 1764624522

A nice talk covering a researcher's experience and benchmarks with GPT-4, comparing the raw model to the one after RLHF:

https://www.youtube.com/watch?v=qbIk7-JPB2c

behnamoh · 2025-12-01T21:30:03 1764624603

Yup, I remember that! Microsoft removed that part of the paper.