[BUG] What is the total batch size for full-parameter fine-tuning?
aoji0606 opened this issue
aoji commented
Is there an existing issue / discussion for this?
- I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- I have searched FAQ
Current Behavior
No response
Expected Behavior
No response
Steps To Reproduce
No response
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?
No response
aoji commented
I fine-tuned on my own data by running finetune_ds.sh, with a total batch size of 8*16 = 128 and lr = 1e-6. The training loss converged to 0.8713, but the final MME benchmark results were very poor. What could be the reason?
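For reference, in DeepSpeed / HuggingFace-Trainer-style launch scripts the total (effective) batch size is the product of the per-device batch size, the gradient-accumulation steps, and the number of GPUs. A minimal sketch, assuming that convention applies to finetune_ds.sh and that "8*16" means 8 GPUs with a per-device batch of 16 (the flag names below follow the HF Trainer convention and are assumptions, not taken from this thread):

```python
# Hypothetical values matching the reported setup; the actual split between
# GPUs, per-device batch, and accumulation steps in finetune_ds.sh may differ.
per_device_train_batch_size = 16   # batch per GPU (assumed)
gradient_accumulation_steps = 1    # accumulation steps (assumed)
world_size = 8                     # number of GPUs (assumed)

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * world_size
)
print(effective_batch_size)  # 16 * 1 * 8 = 128, matching the reported total bs
```

Note that if accumulation steps are also set in the script, the effective batch size grows by that factor, so the same "8*16" could denote a different configuration.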
qianyu chen commented
How different is your dataset from MME? Have you evaluated on any other in-domain dataset? On my side, training on RefCOCO with a single GPU and a batch size of 16 at least produces reasonable results.