salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

LAION 115M dataset has 11164.tar?

jacob-kang opened this issue

Hi,
Thanks for your great work.

I just downloaded the LAION115M filtered web caption dataset.

The MiniGPT-4 documentation includes this description of LAION115M:

Then, set up the LAION dataset loading path in here at Line 5 as ${MINIGPT4_DATASET}/laion/laion_dataset/{00000..10488}.tar

So the shards should run from 00000.tar to 10488.tar.
But I got 00000.tar to 11164.tar.

Is something wrong? Or is it because I downloaded the filtered web captions rather than the filtered synthetic captions by ViT-L?
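A quick way to see which shard numbers are actually on disk, and how they differ from the `{00000..10488}` range quoted above, could look like the sketch below. The directory `laion/laion_dataset` and the helper names are hypothetical, and the five-digit `NNNNN.tar` naming is assumed from the path in the MiniGPT-4 description.

```python
import re
from pathlib import Path

# Assumption: shards are named with five digits, e.g. "00042.tar".
SHARD_RE = re.compile(r"^(\d{5})\.tar$")


def shard_indices(names):
    """Return the sorted shard numbers found among the given file names."""
    found = []
    for name in names:
        match = SHARD_RE.match(name)
        if match:
            found.append(int(match.group(1)))
    return sorted(found)


def compare_with_range(indices, first, last):
    """Return (missing, extra) shard numbers relative to first..last inclusive."""
    expected = set(range(first, last + 1))
    present = set(indices)
    return sorted(expected - present), sorted(present - expected)


def brace_pattern(indices):
    """Build a brace-range pattern covering the lowest to highest shard number."""
    return "{%05d..%05d}.tar" % (min(indices), max(indices))


if __name__ == "__main__":
    laion_dir = Path("laion/laion_dataset")  # hypothetical local path
    if laion_dir.is_dir():
        indices = shard_indices(p.name for p in laion_dir.iterdir())
        if indices:
            missing, extra = compare_with_range(indices, 0, 10488)
            print("shards found:", len(indices))
            print("missing from 00000..10488:", len(missing))
            print("beyond 00000..10488:", len(extra))
            print("pattern for local files:", brace_pattern(indices))
```

If the local files form a complete run, the printed pattern could be used in place of the range in the dataset path. Note that `brace_pattern` only looks at the lowest and highest number, so any gaps reported as missing would still need to be handled.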

Thanks.

Kind regards