AILab-CVC / SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Home Page: https://ailab-cvc.github.io/seed

How can I obtain the training data?

APiaoG opened this issue

Hello! Thank you very much for your outstanding open-source work! Could the following training data be open-sourced?
data_dir:
- dataset/seed_v2_0828/caption/unsplash_cc3m
- dataset/seed_v2_0828/caption/coco
data_dir: /dataset/seed_v2_0828/caption/laion-coco
data_dir: dataset/seed_v2_0828/image_interleaved/mmc4
data_dir: dataset/seed_v2_0828/image_interleaved/obelisc
data_dir: dataset/seed_v2_0828/caption/WebVid-10m
data_dir: dataset/wikipedia_20220301.en
Alternatively, could you provide the datasets as they were before being preprocessed with src/tools/extract_image_ids_to_torchdata_parallel.py? Thank you very much!

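We do not own the copyright to these data, so we cannot provide the downloaded datasets. These are public datasets, and you can download them from their respective official websites.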