CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

When I train a LLaMA 13B model with the trlx PPO trainer and save it in Hugging Face format, the saved checkpoint has some strange keys, and inference shows no result. There is no error either; the result just seems to disappear.

ldh127 opened this issue

🐛 Describe the bug

trainer = trlx.train(
    reward_fn=reward_fn,
    prompts=prompts,
    eval_prompts=["***女儿"] * 4,
    config=config,
)

trainer.save_pretrained('./rl_saved_finished_hf_1202', safe_serialization=False, heads_only=True)

The saved model does not run inference correctly: no error is raised, but no result is shown either, and the script exits with code 0.
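One way to narrow down the "strange keys" is to list the key names in the saved checkpoint and group them by prefix. The sketch below is only an illustration under assumptions: it supposes the wrapped model's weights are stored under a `base_model.` prefix and the value head under `v_head.`, which may not match your trlx version, so check the prefixes against the keys you actually see. The helper name `split_checkpoint_keys` is hypothetical and is not part of trlx.

```python
from typing import Any, Dict, Tuple

# Assumed prefixes (not confirmed by this issue); adjust to your checkpoint.
BASE_PREFIX = "base_model."
HEAD_PREFIX = "v_head."


def split_checkpoint_keys(
    state_dict: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    """Group checkpoint entries into base-model, value-head, and other keys.

    Base-model keys are returned with the assumed prefix stripped, so they
    can be compared with the key names a plain causal LM would expect.
    """
    base: Dict[str, Any] = {}
    heads: Dict[str, Any] = {}
    other: Dict[str, Any] = {}
    for key, value in state_dict.items():
        if key.startswith(BASE_PREFIX):
            base[key[len(BASE_PREFIX):]] = value
        elif key.startswith(HEAD_PREFIX):
            heads[key] = value
        else:
            other[key] = value
    return base, heads, other


if __name__ == "__main__":
    # Toy dictionary standing in for a real checkpoint's state dict.
    demo = {
        "base_model.lm_head.weight": 1,
        "v_head.0.weight": 2,
        "model.embed_tokens.weight": 3,
    }
    base, heads, other = split_checkpoint_keys(demo)
    print("base-model keys:", sorted(base))
    print("value-head keys:", sorted(heads))
    print("other keys:", sorted(other))
```

If the base-model group comes back empty on a real checkpoint, the saved directory may hold only part of the weights, which would be worth ruling out before debugging generation itself.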

Which trlX version are you using?

No response

Additional system and package information

No response

Hey @ldh127, did you manage to get around this? I am having a similar issue at the moment.

Regards,
Promise.