ise-uiuc / magicoder

Magicoder: Source Code Is All You Need

Home Page: https://arxiv.org/abs/2312.02120


Are the training loss and validation loss recorded?

shatealaboxiaowang opened this issue · comments

Hi,

Thank you very much for your code. I am reproducing your training process and would like to know the training loss and validation loss you observed, so that I can align my runs with yours on the Magicoder-OSS-Instruct-75K and Magicoder-Evol-Instruct-110K datasets.

Thanks
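
For anyone else trying to reproduce this, here is a minimal sketch of pulling the two instruction datasets from the Hugging Face Hub, assuming their hub IDs are ise-uiuc/Magicoder-OSS-Instruct-75K and ise-uiuc/Magicoder-Evol-Instruct-110K:

```python
# Sketch: load the two instruction-tuning datasets referenced in this thread.
# Assumes the Hugging Face Hub IDs below match the dataset names mentioned above.
from datasets import load_dataset

oss_75k = load_dataset("ise-uiuc/Magicoder-OSS-Instruct-75K", split="train")
evol_110k = load_dataset("ise-uiuc/Magicoder-Evol-Instruct-110K", split="train")

print(len(oss_75k), oss_75k.column_names)    # sanity-check sizes and fields
print(len(evol_110k), evol_110k.column_names)
```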

Thank you for open-sourcing this. I am replicating your fine-tuning process from the code on GitHub. Do the results of train_loss = 0.16 and eval_loss = 0.21 that I got on the 75K dataset match yours? I will continue training on the 110K dataset.
I trained for 4 epochs and indeed started overfitting after the second epoch.
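
As a side note on the overfitting point, one way to catch it during the run is per-epoch evaluation with early stopping. Below is a minimal sketch assuming a standard HuggingFace `Trainer` setup with `model`, `tokenizer`, `train_ds`, and `eval_ds` already prepared; the argument names are assumptions about a typical configuration, not the exact settings in this repo's training script.

```python
# Sketch: evaluate every epoch and stop once eval_loss stops improving.
# Assumes `model`, `tokenizer`, `train_ds`, and `eval_ds` are already prepared.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="magicoder-75k-ft",
    num_train_epochs=4,
    evaluation_strategy="epoch",       # compute eval_loss at each epoch boundary
    save_strategy="epoch",
    load_best_model_at_end=True,       # roll back to the best checkpoint at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],
)
trainer.train()
```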

Magicoder-S-CL.json
Magicoder-CL.json
Magicoder-S-DS.json
Magicoder-DS.json

Hi, here are the trainer states. Hope they can help!

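For readers comparing their own runs against these files, here is a minimal sketch of extracting the logged losses, assuming the attachments follow the standard HuggingFace `trainer_state.json` layout with a `log_history` list; the field names are assumptions if the files differ.

```python
# Sketch: pull train/eval loss curves out of a HuggingFace-style trainer state file.
# Assumes the JSON has a "log_history" list whose entries carry "loss" or "eval_loss".
import json

with open("Magicoder-S-DS.json") as f:
    state = json.load(f)

train_log = [(e["step"], e["loss"]) for e in state["log_history"] if "loss" in e]
eval_log = [(e["step"], e["eval_loss"]) for e in state["log_history"] if "eval_loss" in e]

print("final train loss:", train_log[-1][1] if train_log else None)
print("final eval loss:", eval_log[-1][1] if eval_log else None)
```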

Thank you very much. My training losses are basically the same as yours, and my test results on the HumanEval dataset are also basically consistent.
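
In case it helps others compare HumanEval numbers, here is a minimal sketch of producing a samples file for the OpenAI human-eval harness; `generate_one_completion` is a hypothetical stand-in for your own inference code, not anything from this repo.

```python
# Sketch: dump model completions in the format the OpenAI human-eval harness expects.
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Hypothetical placeholder: replace with a call to the fine-tuned model.
    # Returning "pass" keeps the sketch runnable (it will just score 0).
    return "    pass"

problems = read_problems()
samples = [
    {"task_id": task_id, "completion": generate_one_completion(problems[task_id]["prompt"])}
    for task_id in problems
]
write_jsonl("samples.jsonl", samples)
# Score afterwards with the harness CLI: evaluate_functional_correctness samples.jsonl
```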

But I have a question: the model is fully fine-tuned on instruction data, so why does its infilling capability increase?
I look forward to hearing from you.

Good to hear you can reproduce it. Yeah, we did observe that the infilling capability was at least not decreasing. We believe this is because the model learned some general alignment during instruction tuning, and infilling is a kind of alignment based on the surrounding context. Further study of this phenomenon would be interesting.
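
For anyone who wants to probe this themselves, here is a rough sketch of how infilling is commonly exercised with a CodeLlama-style checkpoint via the `<FILL_ME>` sentinel handled by `CodeLlamaTokenizer`. This is an illustrative setup, not the evaluation used in the paper, and the model ID below is a placeholder base model rather than a Magicoder checkpoint.

```python
# Sketch: probe fill-in-the-middle capability with a CodeLlama-style checkpoint.
# The <FILL_ME> sentinel is handled by CodeLlamaTokenizer; swap in the checkpoint
# under test to compare infilling before and after instruction tuning.
from transformers import AutoModelForCausalLM, CodeLlamaTokenizer

model_id = "codellama/CodeLlama-7b-hf"  # placeholder; substitute your fine-tuned model
tokenizer = CodeLlamaTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=64)
# Decode only the generated middle span, not the echoed prompt.
print(tokenizer.decode(output[0, input_ids.shape[1]:], skip_special_tokens=True))
```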