vwxyzjn/direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
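Since the description names the DPO objective, here is a minimal sketch of the loss from the DPO paper (Rafailov et al., 2023). This is an illustration, not this repository's actual API: the function name, argument names, and the default `beta` are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Illustrative DPO loss (names/signature are hypothetical).

    Each argument is a tensor of per-sequence log probabilities
    (summed over tokens) of the chosen/rejected responses under
    the trainable policy and the frozen reference model.
    """
    # Implicit reward of each response: beta * log(pi_theta / pi_ref)
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin); logsigmoid is the numerically
    # stable form of log(sigmoid(x))
    losses = -F.logsigmoid(beta * (chosen_logratios - rejected_logratios))
    return losses.mean()
```

The key property, visible above, is that DPO needs no separate reward model or RL loop: the preference margin is computed directly from log-probability ratios against a frozen reference model, and `beta` controls how far the policy may drift from that reference.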
