uta-smile / TCL

Code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022


Could you provide the pretrain log? Thanks

longkukuhi opened this issue

Thanks for the great paper and code. Could you provide the pretraining log so that I can compare it with my results?

commented

Hi, thanks for your interest in our work. I just had a quick look but didn't find it. Let me double-check with my team members later and I will let you know.

commented

Please check the log below. Note that train_lr doesn't show the full precision.

{"train_lr": "0.000", "train_loss_mlm": "2.017", "train_loss_ita": "2.352", "train_loss_itm": "0.451", "epoch": 0}
{"train_lr": "0.000", "train_loss_mlm": "1.364", "train_loss_ita": "2.032", "train_loss_itm": "0.347", "epoch": 1}
{"train_lr": "0.000", "train_loss_mlm": "1.303", "train_loss_ita": "1.949", "train_loss_itm": "0.316", "epoch": 2}
{"train_lr": "0.000", "train_loss_mlm": "1.265", "train_loss_ita": "1.900", "train_loss_itm": "0.300", "epoch": 3}
{"train_lr": "0.000", "train_loss_mlm": "1.236", "train_loss_ita": "1.857", "train_loss_itm": "0.288", "epoch": 4}
{"train_lr": "0.000", "train_loss_mlm": "1.214", "train_loss_ita": "1.825", "train_loss_itm": "0.279", "epoch": 5}
{"train_lr": "0.000", "train_loss_mlm": "1.193", "train_loss_ita": "1.797", "train_loss_itm": "0.271", "epoch": 6}
{"train_lr": "0.000", "train_loss_mlm": "1.176", "train_loss_ita": "1.782", "train_loss_itm": "0.263", "epoch": 7}
{"train_lr": "0.000", "train_loss_mlm": "1.159", "train_loss_ita": "1.764", "train_loss_itm": "0.257", "epoch": 8}
{"train_lr": "0.000", "train_loss_mlm": "1.142", "train_loss_ita": "1.741", "train_loss_itm": "0.252", "epoch": 9}
{"train_lr": "0.000", "train_loss_mlm": "1.125", "train_loss_ita": "1.729", "train_loss_itm": "0.246", "epoch": 10}
{"train_lr": "0.000", "train_loss_mlm": "1.111", "train_loss_ita": "1.712", "train_loss_itm": "0.241", "epoch": 11}
{"train_lr": "0.000", "train_loss_mlm": "1.095", "train_loss_ita": "1.696", "train_loss_itm": "0.236", "epoch": 12}
{"train_lr": "0.000", "train_loss_mlm": "1.080", "train_loss_ita": "1.683", "train_loss_itm": "0.231", "epoch": 13}
{"train_lr": "0.000", "train_loss_mlm": "1.066", "train_loss_ita": "1.679", "train_loss_itm": "0.226", "epoch": 14}
{"train_lr": "0.000", "train_loss_mlm": "1.052", "train_loss_ita": "1.669", "train_loss_itm": "0.221", "epoch": 15}
{"train_lr": "0.000", "train_loss_mlm": "1.039", "train_loss_ita": "1.655", "train_loss_itm": "0.216", "epoch": 16}
{"train_lr": "0.000", "train_loss_mlm": "1.024", "train_loss_ita": "1.650", "train_loss_itm": "0.212", "epoch": 17}
{"train_lr": "0.000", "train_loss_mlm": "1.012", "train_loss_ita": "1.652", "train_loss_itm": "0.208", "epoch": 18}
{"train_lr": "0.000", "train_loss_mlm": "1.000", "train_loss_ita": "1.645", "train_loss_itm": "0.203", "epoch": 19}
{"train_lr": "0.000", "train_loss_mlm": "0.989", "train_loss_ita": "1.645", "train_loss_itm": "0.199", "epoch": 20}
{"train_lr": "0.000", "train_loss_mlm": "0.977", "train_loss_ita": "1.639", "train_loss_itm": "0.195", "epoch": 21}
{"train_lr": "0.000", "train_loss_mlm": "0.966", "train_loss_ita": "1.639", "train_loss_itm": "0.191", "epoch": 22}
{"train_lr": "0.000", "train_loss_mlm": "0.957", "train_loss_ita": "1.628", "train_loss_itm": "0.188", "epoch": 23}
{"train_lr": "0.000", "train_loss_mlm": "0.949", "train_loss_ita": "1.634", "train_loss_itm": "0.184", "epoch": 24}
{"train_lr": "0.000", "train_loss_mlm": "0.942", "train_loss_ita": "1.635", "train_loss_itm": "0.181", "epoch": 25}
{"train_lr": "0.000", "train_loss_mlm": "0.935", "train_loss_ita": "1.634", "train_loss_itm": "0.179", "epoch": 26}
{"train_lr": "0.000", "train_loss_mlm": "0.930", "train_loss_ita": "1.639", "train_loss_itm": "0.177", "epoch": 27}
{"train_lr": "0.000", "train_loss_mlm": "0.925", "train_loss_ita": "1.629", "train_loss_itm": "0.175", "epoch": 28}
{"train_lr": "0.000", "train_loss_mlm": "0.921", "train_loss_ita": "1.634", "train_loss_itm": "0.173", "epoch": 29}

Many thanks!

Hello, I was wondering whether this log is for the 14M data or the 4M data? I got a much higher MLM loss with 4M pretraining.

commented

It comes from 4M data

> It comes from 4M data

Thanks for your reply. I got a log like this:
{"train_lr": "0.000", "train_loss_mlm": "2.457", "train_loss_ita": "0.915", "train_loss_itm": "0.464", "epoch": 0}
Haha, let me check my code.

> It comes from 4M data

I checked my code and everything is the same. I don't have all the data because some URLs are broken. Would that cause such a big gap?

> It comes from 4M data
> I checked my code and everything is the same. I don't have all the data because some URLs are broken. Would that cause such a big gap?

Also, the loss magnitudes are quite different from what Junnan showed in salesforce/ALBEF#71.

commented

Are you talking about the ITA loss or the MLM loss?

> Are you talking about the ITA loss or the MLM loss?

The definition of the ITA loss is different, so its value is naturally different. What confuses me is the rapid decrease of the MLM loss and why I get such a low ITA loss. Could it come from the difference in datasets?
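For what it's worth, the absolute magnitude of an InfoNCE-style contrastive loss depends heavily on how many candidates each sample is contrasted against and on the temperature, so two implementations with different definitions are not directly comparable. A toy sketch (not TCL's actual loss code) illustrating that a randomly initialized model starts near log of the number of candidates:

```python
# Toy illustration (not TCL's actual loss): with random features, an InfoNCE-style
# image-text contrastive loss starts near log(num_candidates), so the magnitude
# depends on whether you contrast within a batch or against a large queue.
import math

import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, temperature = 256, 0.07

for num_candidates in (64, 65536):  # e.g. in-batch negatives vs. a memory queue
    image_feat = F.normalize(torch.randn(8, dim), dim=-1)           # 8 "query" images
    text_feat = F.normalize(torch.randn(num_candidates, dim), dim=-1)
    logits = image_feat @ text_feat.t() / temperature
    targets = torch.arange(8)  # pretend the first 8 texts are the positives
    loss = F.cross_entropy(logits, targets)
    print(f"{num_candidates:6d} candidates: loss ~ {loss.item():.2f}, "
          f"log(N) = {math.log(num_candidates):.2f}")
```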

commented

The dataset shouldn't be the root cause. My 4M dataset also misses some URLs.

> The dataset shouldn't be the root cause. My 4M dataset also misses some URLs.

Oh, I found the mistake. Thanks for your reply. I'll try it again.

> The dataset shouldn't be the root cause. My 4M dataset also misses some URLs.
> Oh, I found the mistake. Thanks for your reply. I'll try it again.

And I still cannot get the right MLM loss even though the other losses are about equal. Is there any preprocessing applied to the captions?
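For context, ALBEF-style pretraining code usually applies some light caption cleaning before tokenization: lowercasing, stripping punctuation, collapsing whitespace, and truncating to a maximum number of words. A sketch of that kind of preprocessing is below; it is illustrative and may differ from what TCL actually ships.

```python
# Hedged sketch of the kind of caption cleaning commonly used in ALBEF-style
# pretraining code (lowercase, strip punctuation, collapse whitespace, cap the
# number of words). Illustrative only; may differ from TCL's exact code.
import re


def clean_caption(caption: str, max_words: int = 30) -> str:
    caption = caption.lower()
    caption = re.sub(r"[,.!?\"'()*#:;~]", " ", caption)   # drop punctuation
    caption = re.sub(r"\s{2,}", " ", caption).strip()     # collapse whitespace
    words = caption.split(" ")
    if len(words) > max_words:                            # truncate long captions
        caption = " ".join(words[:max_words])
    return caption


print(clean_caption("A man, wearing a red hat, rides a bike!"))
# -> "a man wearing a red hat rides a bike"
```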

commented

Have you tested your model's performance, for example its zero-shot performance?
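For reference, a minimal sketch of the kind of zero-shot image-text retrieval check being suggested: embed images and texts, rank by cosine similarity, and measure recall@k. The features here are random placeholders for whatever your pretrained model produces; this is not TCL's evaluation script.

```python
# Minimal sketch of a zero-shot image-text retrieval check: embed images and texts,
# rank texts per image by cosine similarity, report recall@k. The inputs are
# placeholders for your model's features; this is not TCL's evaluation script.
import torch
import torch.nn.functional as F


@torch.no_grad()
def recall_at_k(image_feats: torch.Tensor, text_feats: torch.Tensor, k: int = 1) -> float:
    """Assumes image i and text i are the matching pair; both inputs are (N, dim)."""
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    sims = image_feats @ text_feats.t()               # (N, N) cosine similarities
    topk = sims.topk(k, dim=1).indices                # best-k texts for each image
    targets = torch.arange(sims.size(0)).unsqueeze(1)
    return (topk == targets).any(dim=1).float().mean().item()


# Example with random features (a trained model should score far above chance):
print(recall_at_k(torch.randn(100, 256), torch.randn(100, 256), k=5))
```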