unslothai / unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Home Page: https://unsloth.ai


ValueError: Unsloth: Untrained tokens found, but embed_tokens & lm_head not trainable, causing NaNs. when finetuning llama3 on a custom dataset

liwd190019 opened this issue

I want to apply Llama 3 to a multi-turn dialogue task, so I was trying to finetune it on a custom dataset of mine, made by simply extracting all the dialogue contents from SODA and reformatting them into Llama 3's chat template.

I think the resulting dataset is quite similar to the original one. But when I tried to train the model, I got the following error:

ValueError: Unsloth: Untrained tokens found, but embed_tokens & lm_head not trainable, causing NaNs. Restart then add `embed_tokens` & `lm_head` to `FastLanguageModel.get_peft_model(target_modules = [..., "embed_tokens", "lm_head",])`

As the message indicates, the problem is that some tokens are untrained. The error can be fixed by following the hint, but I just can't figure out how I introduced those untrained tokens in the first place. After all, the two datasets (my custom one and the default one in the notebook) look very similar.

Here is a link to the Colab; feel free to comment and give advice!

Also, can anyone suggest a good way to run inference to test the finetuned multi-turn chatbot? Currently, the chatbot inference code in the notebook only supports a single turn (see the sketch after the snippet below):

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# `model` and `tokenizer` come from FastLanguageModel.from_pretrained(...) earlier in the notebook
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3", # Supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, unsloth
    mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"}, # ShareGPT style
)

FastLanguageModel.for_inference(model) # Enable native 2x faster inference

messages = [
    {"from": "human", "value": "Continue the fibonnaci sequence: 1, 1, 2, 3, 5, 8,"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)
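
For multi-turn testing, here is a minimal sketch that keeps the whole history in `messages` and re-applies the chat template each round, decoding only the newly generated tokens. It assumes `model` and `tokenizer` are set up exactly as above; the loop structure and the `max_new_tokens` value are illustrative, not an official Unsloth recipe.

messages = []
for _ in range(3):  # three rounds, just for illustration
    user_text = input("You: ")
    # ShareGPT-style "from"/"value" keys, matching the mapping passed to get_chat_template above
    messages.append({"from": "human", "value": user_text})
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize = True,
        add_generation_prompt = True,
        return_tensors = "pt",
    ).to("cuda")
    outputs = model.generate(input_ids = inputs, max_new_tokens = 256, use_cache = True)
    # Decode only the tokens generated this round, then feed the reply back in as the next turn
    reply = tokenizer.decode(outputs[0, inputs.shape[-1]:], skip_special_tokens = True)
    print("Bot:", reply)
    messages.append({"from": "gpt", "value": reply})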

I started to get this error today too, strange...

Try using the Instruct model ("unsloth/llama-3-8b-Instruct-bnb-4bit").

@sumukshashidhar @liwd190019 Apologies! As @dmitrii-palisaderesearch mentioned, please use the Instruct version. The base version will error out, since Unsloth auto-checks whether some token embeddings are all zeros. If you still want to use the base model, either do not use the llama-3 chat template (just use Alpaca), or train lm_head and embed_tokens, as sketched below.
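
For reference, a sketch of that last option: add embed_tokens and lm_head to target_modules when creating the PEFT model, as the error message suggests. The other module names and hyperparameters below are the usual defaults from the Unsloth notebooks, shown only as placeholders:

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      # Make the embedding matrices trainable so tokens the
                      # base model never learned (e.g. the llama-3 chat
                      # special tokens) get real weights instead of zeros:
                      "embed_tokens", "lm_head"],
    lora_alpha = 16,
)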

@danielhanchen Do you know why it started to throw this error, when it was training successfully with the base model a couple of months ago?

I added a check in Unsloth that detects whether your embeddings are untrained; I might have to change the logic, actually.
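
For anyone curious, the idea behind the check can be sketched like this (a rough illustration, not Unsloth's actual implementation): a token whose embedding row is still all zeros was never trained, and using it at train time pushes zeros/NaNs through the loss.

import torch

def find_untrained_token_ids(model):
    # A row of the input embedding matrix that is exactly zero has never
    # been trained; such tokens are the ones the error complains about.
    embed = model.get_input_embeddings().weight          # [vocab_size, hidden_dim]
    zero_rows = embed.abs().max(dim = 1).values == 0     # True where the whole row is zero
    return torch.nonzero(zero_rows).flatten().tolist()

bad_ids = find_untrained_token_ids(model)
print(tokenizer.convert_ids_to_tokens(bad_ids))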