huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Continued Pretraining on Llama7b.

wiseyy opened this issue · comments

I want to do continued pretraining on my custom dataset, using the Llama-7B weights in the Hugging Face format. How do I initialize the model with those weights? I don't think there is a function for that yet.

Hey, you have to convert it to the Nanotron checkpoint format!

Start by randomly initializing a Llama model, then save the checkpoint with dp=2, tp=2, pp=2 and look at how Nanotron splits it. Then reformat the Hugging Face checkpoint to match that layout; a rough comparison sketch is below.
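Here is a minimal sketch (not an official converter) for that inspection step: it lists the tensor names and shapes in the randomly initialized Nanotron checkpoint shards next to those of the Hugging Face Llama weights, so you can work out the mapping yourself. The checkpoint path, the HF model id, and the assumption that the shards are `.safetensors` files are placeholders; adjust them to what Nanotron actually writes on your machine.

```python
# Sketch: compare a randomly initialized Nanotron checkpoint (saved with
# dp=2, tp=2, pp=2) against the Hugging Face Llama-7B weights to see how
# each HF tensor should be split across the Nanotron shards.
from pathlib import Path

import torch
from safetensors.torch import load_file
from transformers import AutoModelForCausalLM

NANOTRON_CKPT_DIR = Path("checkpoints/random_llama")  # assumed path to the dp=2/tp=2/pp=2 checkpoint
HF_MODEL_PATH = "path/to/llama-7b-hf"                 # assumed local path or HF repo id

# 1) Inspect how Nanotron sharded the randomly initialized model.
for shard_path in sorted(NANOTRON_CKPT_DIR.rglob("*.safetensors")):
    shard = load_file(str(shard_path))
    print(f"--- {shard_path.relative_to(NANOTRON_CKPT_DIR)} ---")
    for name, tensor in shard.items():
        print(f"{name:80s} {tuple(tensor.shape)}")

# 2) Inspect the Hugging Face checkpoint you want to continue pretraining from.
hf_model = AutoModelForCausalLM.from_pretrained(HF_MODEL_PATH, torch_dtype=torch.float32)
print("--- Hugging Face state dict ---")
for name, tensor in hf_model.state_dict().items():
    print(f"{name:80s} {tuple(tensor.shape)}")

# 3) With both listings side by side, write the mapping: split (and transpose
#    where needed) each HF tensor so its pieces match the shapes Nanotron
#    expects, then save them back into the same directory structure.
```

Once the shapes line up shard for shard, you can overwrite the random tensors with the reshaped HF ones and resume training from that checkpoint.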