Some missing *_cls_prob.npy files in Flickr30k
shrutijpalaskar opened this issue · comments
Hi Luowei :)
I am trying to finetune a model on Flickr30k using https://github.com/LuoweiZhou/VLP#flickr30k-captions
and the training begins successfully. But for certain batches, some *_cls_prob.npy files are missing from the $DATA_ROOT/flickr30k/region_feat_gvd_wo_bgd/trainval data folder.
I downloaded the Flickr30k data twice and the issue persists. Do you have any idea why this might be happening?
Any pointers for how to best fix this?
Thanks and best,
Shruti
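For anyone hitting the same problem, a quick way to see exactly which files are affected is to scan the feature folder for unpaired files. The sketch below is only an illustration: it assumes every feature file `<id>.npy` should have a companion `<id>_cls_prob.npy` in the same folder and that no other kinds of `.npy` files are present. The function name `find_unpaired` is hypothetical and not part of VLP.

```python
from pathlib import Path

# Companion-file suffix taken from the file names mentioned in this thread.
CLS_SUFFIX = "_cls_prob.npy"


def find_unpaired(feat_dir):
    """Return two sorted lists of paths relative to feat_dir:
    feature files with no *_cls_prob.npy companion, and
    *_cls_prob.npy files with no matching feature file.

    Assumption: <id>.npy pairs with <id>_cls_prob.npy in the same folder.
    """
    root = Path(feat_dir)
    # Collect every .npy file below the folder as a relative POSIX-style path.
    paths = {p.relative_to(root).as_posix() for p in root.rglob("*.npy")}
    cls_prob = {p for p in paths if p.endswith(CLS_SUFFIX)}
    features = paths - cls_prob

    missing_cls = sorted(
        f for f in features if f[: -len(".npy")] + CLS_SUFFIX not in cls_prob
    )
    missing_feat = sorted(
        c for c in cls_prob if c[: -len(CLS_SUFFIX)] + ".npy" not in features
    )
    return missing_cls, missing_feat
```

Calling `find_unpaired` on the trainval folder from the post above returns the two lists, which can then be compared against a fresh download or regenerated with the detector.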
Hi Shruti!
Thanks for your interest in our work. I just checked, and it turns out 6 files went missing from our zip file during transfer (some *cls_prob.npy files and some other *.npy files). I have attached all the missing files (and their counterparts). These are the same files you can obtain using our open-sourced detector: https://github.com/LuoweiZhou/detectron-vlp
Besides, you may optionally merge all the *.npy files into one or a few *.h5 files for faster access, like in CC/COCO: https://github.com/LuoweiZhou/VLP/blob/master/vlp/seq2seq_loader.py#L325
Best,
Luowei
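The optional merge of many *.npy files into a single *.h5 file could look roughly like the sketch below. It relies on the third-party packages `numpy` and `h5py`. The function name `merge_npy_to_h5` and the one-dataset-per-file layout (keyed by the file name without `.npy`) are assumptions made for illustration. The layout that the linked loader code expects for CC/COCO is not described in this thread, so the data loader may need adapting to whatever layout is chosen.

```python
from pathlib import Path

import h5py
import numpy as np


def merge_npy_to_h5(npy_dir, h5_path):
    """Copy every *.npy array in npy_dir into one HDF5 file.

    Assumed layout (not taken from VLP): one dataset per file, named
    after the file stem, e.g. "123.npy" -> "123" and
    "123_cls_prob.npy" -> "123_cls_prob".
    Returns the number of arrays written.
    """
    files = sorted(Path(npy_dir).glob("*.npy"))
    with h5py.File(str(h5_path), "w") as h5:
        for f in files:
            h5.create_dataset(f.stem, data=np.load(f))
    return len(files)
```

Reading a single array back is then one lookup, for example `h5py.File(h5_path, "r")["123_cls_prob"][()]`, which avoids opening thousands of small files during training.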
This is working. Thanks a lot, Luowei, for getting back so quickly! :)
Yes, using the *.h5 to speed up data access as well now.