RLHF text summarization diverges

Question

AlisonWen opened this issue 5 months ago

🐛 Describe the bug

I am running trlx_gptj_text_summarization.py without any modifications to the code, but training has not converged after more than 3,500 steps, even though the documentation says it should. I also noticed that the sample project was run with trlx_gptneo_text_summarization.py, but I cannot find that file anywhere.

Which trlX version are you using?

Installed from source code downloaded on 2024/01/13

Additional system and package information

Linux (jammy), torch==2.0.0+cu118