li-plus / nanoRLHF

Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)

Repository from GitHub: https://github.com/li-plus/nanoRLHF

li-plus/nanoRLHF Issues

No issues in this repository yet.