When I train a LLaMA 13B model with the trlx PPO trainer and save it in Hugging Face format, the saved model has some strange keys at inference time and no result is shown; there is no error either, so the output seems to disappear

Question

ldh127 opened this issue 6 months ago · comments

trainer = trlx.train(
    reward_fn=reward_fn,
    prompts=prompts,
    eval_prompts=["***女儿"] * 4,
    config=config,
)

trainer.save_pretrained('./rl_saved_finished_hf_1202', safe_serialization=False, heads_only=True)

The saved model cannot run inference correctly: there is no error, but no result is shown either, and the script exits with code 0.
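To narrow down which keys are the "strange" ones, it may help to group the checkpoint's parameter names by prefix. The following is only a minimal sketch: `summarize_key_prefixes` is a hypothetical helper, and the key names in `example_keys` are made up for illustration, not taken from the actual checkpoint.

```python
from collections import Counter


def summarize_key_prefixes(keys, depth=1):
    """Count state-dict keys by their first `depth` dot-separated components."""
    counts = Counter()
    for key in keys:
        prefix = ".".join(key.split(".")[:depth])
        counts[prefix] += 1
    return dict(counts)


# Hypothetical key names, for illustration only; the real names
# depend on what the saved checkpoint actually contains.
example_keys = [
    "base_model.model.embed_tokens.weight",
    "base_model.lm_head.weight",
    "v_head.0.weight",
    "v_head.0.bias",
]

print(summarize_key_prefixes(example_keys))
```

With a real checkpoint, the keys could be taken from something like `torch.load(path, map_location="cpu").keys()`, assuming the save directory contains a regular PyTorch state-dict file. Prefixes that the plain Hugging Face model class does not expect would then stand out in the summary.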

Promise Osaine Ekpo · Answer 1 · Sat Feb 24 2024 16:23:23 GMT+0800 (China Standard Time)

Hey @ldh127, did you manage to get around this? I am having a similar issue at the moment.

Regards,
Promise.