Question about full-parameter finetuning
dydxdt opened this issue · comments
Thx for your great work!
I have a question about the training arguments, i.e., is max_steps=10000 appropriate for full-parameter finetuning?
I use my own training dataset for full-parameter finetuning; it has around 240,000 samples covering 3 different tasks (caption, OCR, ...). After training with the default settings, the training log shows "epoch: 0.32", which means only about 1/3 of the training data was used. I then trained with num_train_epochs=5 (same as Qwen) instead of max_steps, but the 5-epoch model performs worse than the 10000-step model on my caption test set, even though the loss looks normal. Can you give some advice for this situation? Thx!
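For reference, here is a back-of-envelope check of how far max_steps=10000 gets through a ~240,000-sample dataset. The effective batch size of 8 below is an assumption inferred from the reported "epoch: 0.32" log line, not a value taken from the actual training config:

```python
# Rough sanity check: epochs covered by a fixed max_steps budget.
# effective_batch_size = per_device_batch * num_gpus * grad_accum (assumed = 8)
dataset_size = 240_000
effective_batch_size = 8
max_steps = 10_000

steps_per_epoch = dataset_size / effective_batch_size  # 30,000 optimizer steps per epoch
epochs_covered = max_steps / steps_per_epoch
print(f"{epochs_covered:.2f} epochs")  # ~0.33, consistent with the logged "epoch: 0.32"
```

So with this (assumed) batch size, 10000 steps is roughly a third of an epoch, which matches the log; raising num_train_epochs to 5 multiplies the number of updates by ~15, so the worse caption results may simply be overfitting or task interference rather than a bug.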
How much hardware did you use for full-parameter finetuning? I tried both 2x V100 and 4x V100 and neither worked.