What is the main library to scale up RL training for LLMs?

Question

aldopareja opened this issue a year ago

Assuming you have a reward model (say, the OpenAssistant reward model) and a target model (say, LLaMA) that you want to train with RL at scale on a multi-node setup, what is the best codebase for this? DeepSpeed-Chat?