A simple implementation of [Nash Learning from Human Feedback](https://arxiv.org/abs/2312.00886).
Repository from GitHub: https://github.com/buttercutter/NLHF