Loss calculation always 0

Question

sanipanwala opened this issue 6 months ago

Hello,

I'm trying to fine-tune the 34B model, but the loss is always 0 during fine-tuning. I was able to fine-tune the 7B and 13B models, but not the 34B.

Please let me know if I'm overlooking something, or if you have any suggestions.

Thanks.

Jonas Gehring · Answer 1 · Wed Feb 28 2024 15:20:19 GMT+0800 (China Standard Time)

Hi @sanipanwala, we don't provide support for fine-tuning in this repository. Which tools are you using for this? Are you sure they support the 34B model well? Does the exact same setting work for 7B and 13B? In any case, a loss of 0 at the start of training is a good indication that something is going wrong.
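
To make the point about a loss of 0 at the start concrete, here is a minimal sketch of a first-step sanity check. It assumes a standard cross-entropy objective: an untrained or lightly perturbed output layer that predicts roughly uniformly over `num_classes` outcomes should give a loss near `ln(num_classes)`. A pretrained language model will start lower than that, but still well above 0. The helper names and the vocabulary size of 32000 are illustrative assumptions, not something taken from this thread or from any particular training tool.

```python
import math


def expected_uniform_loss(num_classes: int) -> float:
    """Cross-entropy of a uniform prediction over num_classes outcomes."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return math.log(num_classes)


def loss_looks_suspicious(loss: float, tol: float = 1e-8) -> bool:
    """Return True for a first-step loss that is non-finite or (near) zero."""
    if not math.isfinite(loss):
        return True  # NaN or inf, often a sign of numeric overflow
    return abs(loss) <= tol


# Hypothetical values for illustration only.
assumed_vocab_size = 32000  # assumption, check your tokenizer/config
first_step_loss = 0.0       # what the training log reports

print("uniform-prediction reference:", expected_uniform_loss(assumed_vocab_size))
print("suspicious:", loss_looks_suspicious(first_step_loss))
```

If the check flags the first logged step, it is probably worth inspecting the labels, the numeric precision settings, and the raw (unrounded) loss value before looking at anything model-specific.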

sanipanwala · Answer 2 · Wed Feb 28 2024 17:16:46 GMT+0800 (China Standard Time)

@jgehring I'm using the "codellama/CodeLlama-34b-hf" model and running a plain Python script. Yes, the same configuration works with 7B and 13B.

Thanks.

ssszh · Answer 3 · Wed Apr 03 2024 22:19:06 GMT+0800 (China Standard Time)

@sanipanwala
Hi, have you solved this problem yet?

I ran into the same problem when fine-tuning CodeLlama-7B with PEFT (using LlamaForSequenceClassification): the loss is always 0 during fine-tuning.

Thanks!

sanipanwala · Answer 4 · Thu Apr 04 2024 11:30:38 GMT+0800 (China Standard Time)

Hi @sssszh,

No, I haven't found any solution yet.

Thanks,
Sani