jamesliu / nanoDPO

A nimble implementation of the Direct Preference Optimization (DPO) algorithm with Causal Transformer and LSTM models, inspired by the DPO paper on fine-tuning unsupervised language models from preference data.
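For orientation, the core DPO objective described in the paper can be sketched as follows. This is a minimal PyTorch sketch with illustrative function and argument names, not this repository's actual API:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss: push the policy's log-ratio (vs. a frozen reference model)
    on chosen completions above its log-ratio on rejected completions."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == softplus(-x); average over the batch
    return F.softplus(-logits).mean()
```

The same loss applies whether the policy and reference models are Causal Transformers or LSTMs; only the way the per-sequence log-probabilities are computed differs.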
