vicgalle / refined-dpo

Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
