whimsicalism 3 months ago | on: Mistral raises 1.7B€, partners with ASML

It is still an open question whether RL will scale (at least easily) the same way pretraining does, or whether it is more effective at elicitation.