kakaobrain / coyo-dataset

COYO-700M: Large-scale Image-Text Pair Dataset

Home Page: https://kakaobrain.com/contents?contentId=7eca73e3-3089-43cb-b701-332e8a1743fd

Can we use this dataset without downloading?

sanyalsunny111 opened this issue

Dear Authors,

Thank you for the great effort in curating this very interesting dataset. I am wondering whether there is a way to use this dataset without downloading it, for example with or without the Hugging Face Hub. If so, please let me know how.

Hi, @sanyalsunny111

Thank you for your interest in our dataset.

Images included in the COYO dataset must be downloaded individually before they can be used for training. Due to licensing issues, we cannot distribute any files that contain the images themselves (such as WebDataset shards or TFRecord files).

If this does not answer your question, please reopen the issue.
Thank you.
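
As a rough illustration of what "individually downloaded" could look like in practice, here is a minimal Python sketch. It is not the authors' tooling. It assumes each metadata record is a mapping with an `id` and a `url` field; the function names `fetch_url` and `download_images`, the `.img` file suffix, and the injectable `fetch` parameter are all hypothetical choices made for this example.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Dict, Iterable, Mapping


def fetch_url(url: str, timeout: float = 10.0) -> bytes:
    """Fetch the raw bytes behind a URL using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_images(
    records: Iterable[Mapping],
    out_dir: str,
    fetch: Callable[[str], bytes] = fetch_url,
) -> Dict[object, Path]:
    """Download one image per record and return {id: saved file path}.

    Assumption: every record has an "id" and a "url" key. Records whose
    download fails are skipped, since some links may no longer resolve.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved: Dict[object, Path] = {}
    for record in records:
        try:
            data = fetch(record["url"])
        except (OSError, ValueError):
            # Network errors and malformed URLs: skip this record.
            continue
        # Hypothetical naming scheme: "<id>.img", format left undetected.
        target = out_path / f"{record['id']}.img"
        target.write_bytes(data)
        saved[record["id"]] = target
    return saved
```

Calling `download_images(records, "images/")` would write one file per reachable URL into `images/`. A sequential loop like this is only practical for small subsets; for a dataset of this size, a parallel downloader is the more realistic route.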