Cannot completely download the coco caption dataset for finetuning VinVL model

Question


yaolinli opened this issue 3 years ago · comments

I want to fine-tune the pretrained VinVL model on the COCO captioning downstream task, and I followed https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md to download the dataset.
However, when I run the command path/to/azcopy copy https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption ./coco_caption --recursive , the file "train.feature.tsv" is missing from the download.

I can only download part of the dataset contents.

hasontung1999 · Answer 1 · Wed Nov 03 2021 18:06:06 GMT+0800 (China Standard Time)

@yaolinli
You can try getting the dataset from this link: https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip
If you cannot download it with azcopy, try the !wget command in Google Colab.
However, the COCO data at this link is missing some files, too. Just download it and fill in whatever files are missing.
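
To see which files an archive lacks before filling them in, it can help to list the zip contents and compare them against a checklist. The following is only a minimal sketch: the function name `missing_from_zip` and the `EXPECTED` list are hypothetical, and the only file name taken from this thread is train.feature.tsv; the full list of files the fine-tuning code needs is not given here.

```python
import zipfile
from pathlib import PurePosixPath


def missing_from_zip(zip_path, expected_names):
    """Return the expected file names that do not appear anywhere in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare on the base name only, so a top-level folder
        # inside the zip (if there is one) does not matter.
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return [name for name in expected_names if name not in present]


# Hypothetical checklist: only train.feature.tsv is named in this thread.
EXPECTED = ["train.feature.tsv"]

# Example call once the archive is on disk (the path is an assumption):
# print(missing_from_zip("coco_caption.zip", EXPECTED))
```

An empty list means every name on the checklist was found somewhere in the archive. It says nothing about whether the file contents are the right ones.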

yaolinli · Answer 2 · Thu Nov 04 2021 19:54:19 GMT+0800 (China Standard Time)


Thank you very much! I successfully downloaded the whole zip file from https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip

yaolinli · Answer 3 · Thu Nov 04 2021 21:20:10 GMT+0800 (China Standard Time)

I think the dataset from https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip may be different from the one at https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption . When I run inference with the released vinvl /coco_captioning_base_scst/checkpoint-15-66405 on the test set from the second link, the results match those reported in the paper. But when I run inference on the test set from the first link, the results are wrong, as follows:
INFO:vlpretrain:evaluation result: {'Bleu_1': 0.3754352697810658, 'Bleu_2': 0.1690062414796108, 'Bleu_3': 0.08197754771485882, 'Bleu_4': 0.04221742607217998, 'METEOR': 0.09355317287051836, 'ROUGE_L': 0.30101993017675194, 'CIDEr': 0.03730488300346641, 'SPICE': 0.02254211667076489}
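
One way to check whether the two downloads really differ is to hash every file in both local copies and compare. This is a minimal sketch under stated assumptions: the function names `file_digests` and `compare_trees` are hypothetical, and both datasets are assumed to be already extracted into local directories with the same relative layout.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large feature files fit in memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(dir_a, dir_b):
    """Return (files only in A, files only in B, files present in both but with different content)."""
    a, b = file_digests(dir_a), file_digests(dir_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, changed
```

If the zip extracts into an extra top-level folder, point `dir_a` and `dir_b` at the matching roots so that relative paths line up. A test-set file showing up in the third list would be consistent with the different evaluation results, though it would not explain which copy is the intended one.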

So I would still like to know: where can I download the complete VinVL fine-tuning dataset for COCO captioning, including train.feature.tsv?