armbues / SiLLM

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.


[Feature request] Add ORPO finetuning

s-kostyaev opened this issue

Hi @armbues, thank you for this great project.

Please add ORPO finetuning, which performs SFT and DPO in a single step: https://arxiv.org/abs/2403.07691
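
For reference, a minimal sketch of the ORPO objective from the paper, written against `mlx.core` since SiLLM builds on MLX. Everything here (`sequence_logprob`, `orpo_loss`, `lambda_or`, the tensor shapes) is an illustrative assumption rather than part of SiLLM's API: the point is just that ORPO adds an odds-ratio penalty on a chosen/rejected pair to the plain SFT loss, so no reference model is needed.

```python
import mlx.core as mx


def sequence_logprob(logits: mx.array, targets: mx.array, mask: mx.array) -> mx.array:
    """Length-normalized log-likelihood log P(y | x) of each response.

    logits:  (batch, seq, vocab) model outputs at the response positions
    targets: (batch, seq)        response token ids
    mask:    (batch, seq)        1.0 for response tokens, 0.0 for prompt/padding
    """
    logprobs = logits - mx.logsumexp(logits, axis=-1, keepdims=True)
    token_lp = mx.take_along_axis(logprobs, mx.expand_dims(targets, -1), axis=-1)
    token_lp = mx.squeeze(token_lp, -1)
    return (token_lp * mask).sum(axis=-1) / mask.sum(axis=-1)


def orpo_loss(chosen_logits, chosen_ids, chosen_mask,
              rejected_logits, rejected_ids, rejected_mask,
              lambda_or: float = 0.1) -> mx.array:
    """ORPO = SFT loss on the chosen response + lambda * odds-ratio penalty."""
    logp_w = sequence_logprob(chosen_logits, chosen_ids, chosen_mask)
    logp_l = sequence_logprob(rejected_logits, rejected_ids, rejected_mask)

    # log odds(y | x) = log P - log(1 - P), with P = exp(avg token log-prob).
    # Clamp the exponent below 0 so log1p(-exp(.)) stays finite.
    log_odds_w = logp_w - mx.log1p(-mx.exp(mx.minimum(logp_w, -1e-6)))
    log_odds_l = logp_l - mx.log1p(-mx.exp(mx.minimum(logp_l, -1e-6)))

    # L_OR = -log sigmoid(log odds ratio);  L_SFT = NLL of the chosen response.
    loss_or = -mx.log(mx.sigmoid(log_odds_w - log_odds_l))
    loss_sft = -logp_w
    return mx.mean(loss_sft + lambda_or * loss_or)
```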

Great idea! I will add this to the roadmap.

The official implementation of the "ORPO Trainer" can be found here.
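
For anyone who wants to try ORPO outside of MLX in the meantime, below is a rough usage sketch assuming the reference is to the `ORPOTrainer` that ships with Hugging Face TRL; the model and dataset names are placeholders, and argument names can differ between TRL versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "mistralai/Mistral-7B-v0.1"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Placeholder preference dataset with prompt/chosen/rejected pairs.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = ORPOConfig(
    output_dir="orpo-out",
    beta=0.1,  # weight of the odds-ratio term (lambda in the ORPO paper)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer TRL versions name this argument `processing_class`
)
trainer.train()
```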