microsoft / Oscar

Oscar and VinVL

Cannot completely download the coco caption dataset for finetuning VinVL model

yaolinli opened this issue · comments

I want to fine-tune the pretrained VinVL model on the COCO captioning downstream task, and I followed https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md to download the dataset.
However, when I run the command path/to/azcopy copy https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption ./coco_caption --recursive , the file "train.feature.tsv" is missing from the result.

I was only able to download part of the dataset contents.
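A partial download like this can be confirmed with a small check before starting fine-tuning. The following is a minimal sketch, not part of the Oscar repository: the helper name missing_files is hypothetical, and the list of expected files is an assumption. Only train.feature.tsv is actually named in this issue, so extend the list with whatever files your training configuration references.

```python
from pathlib import Path


def missing_files(download_dir, expected_names):
    """Return the expected file names that are absent or empty under download_dir.

    Hypothetical helper for illustration; it is not part of Oscar or azcopy.
    """
    root = Path(download_dir)
    missing = []
    for name in expected_names:
        path = root / name
        # A zero-byte file usually indicates an interrupted transfer.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: only train.feature.tsv is named in this thread.
    expected = ["train.feature.tsv"]
    print(missing_files("./coco_caption", expected))
```

Files that exist but have size zero are reported as missing too, since an interrupted transfer can leave an empty file behind.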

@yaolinli
You can try getting the dataset from this link: https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip
If you cannot download it with azcopy, try the !wget command in Google Colab.
However, the COCO archive at this link is also missing some files. Download it and fill in whatever files are missing from your copy.
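One way to "fill in" missing files is to copy over only those files that exist in one download but not in the other, without overwriting anything. The sketch below assumes both downloads have been extracted into local directories; the function name fill_missing and the directory names are hypothetical. As noted later in this thread, the two sources may not contain identical data, so treat any merged directory with caution and verify evaluation results afterwards.

```python
import shutil
from pathlib import Path


def fill_missing(source_dir, target_dir):
    """Copy files found under source_dir but absent from target_dir.

    Existing files in target_dir are never overwritten.
    Returns the relative paths (POSIX style) of the files that were copied.
    Hypothetical helper for illustration only.
    """
    source, target = Path(source_dir), Path(target_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        if dst.exists():
            continue  # keep the file that is already in the target
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


if __name__ == "__main__":
    # Assumed local paths: the extracted zip and the azcopy output directory.
    print(fill_missing("./coco_caption_zip", "./coco_caption"))
```

Printing the returned list makes it easy to record exactly which files came from the second source.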

Thank you very much! I successfully downloaded the whole zip file from https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip

However, I think the dataset at https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip (the first link) may differ from the one at https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption (the second link). When I run inference with the released VinVL checkpoint /coco_captioning_base_scst/checkpoint-15-66405 on the test set from the second link, the results match those reported in the paper. When I run inference on the test set from the first link, the results are wrong, as follows:
INFO:vlpretrain:evaluation result: {'Bleu_1': 0.3754352697810658, 'Bleu_2': 0.1690062414796108, 'Bleu_3': 0.08197754771485882, 'Bleu_4': 0.04221742607217998, 'METEOR': 0.09355317287051836, 'ROUGE_L': 0.30101993017675194, 'CIDEr': 0.03730488300346641, 'SPICE': 0.02254211667076489}

So I would still like to know where I can download the complete VinVL fine-tuning dataset for COCO captioning, including train.feature.tsv.