Dataset structure

Question

tarunn2799 opened this issue 3 years ago · comments

Hi, I'm having a little trouble understanding the dataset structure I should follow in order to train with this package. Is it one parent folder containing one folder of images and one folder of their text files?
If yes, what should these subfolders be named?

Romain Beaumont · Answer 1 · Tue Sep 07 2021 20:24:56 GMT+0800 (China Standard Time)

https://github.com/Zasder3/train-CLIP#training-with-our-datamodule-
Any folder name should work; each image and its text file should have the same file name (apart from the extension).

Tarun Narayanan · Answer 2 · Thu Sep 09 2021 14:20:57 GMT+0800 (China Standard Time)

Hey, so should all images and text files be in one single folder?

Romain Beaumont · Answer 3 · Thu Sep 09 2021 16:40:52 GMT+0800 (China Standard Time)

No, any subfolder works.

Tarun Narayanan · Answer 4 · Fri Sep 10 2021 00:08:57 GMT+0800 (China Standard Time)

Does this work?
data/images/p1.jpg
and
data/text/p1.txt

Romain Beaumont · Answer 5 · Fri Sep 10 2021 00:27:30 GMT+0800 (China Standard Time)

Yes
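
The layout confirmed above can be checked before training with a short script. This is only a sketch, not the package's own loader: it assumes pairing is done by file stem (p1.jpg with p1.txt) anywhere under the root folder, and the function name `pair_by_stem` and the set of image extensions are made up for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def pair_by_stem(root):
    """Return {stem: (image_path, text_path)} for files under root that share a stem.

    Hypothetical helper: it searches every subfolder, so the folder names
    themselves do not matter, only the file names.
    """
    images, texts = {}, {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in IMAGE_EXTS:
            images[path.stem] = path
        elif suffix == ".txt":
            texts[path.stem] = path
    # Keep only stems that have both an image and a text file.
    shared = sorted(images.keys() & texts.keys())
    return {stem: (images[stem], texts[stem]) for stem in shared}
```

Calling `pair_by_stem("data")` on the example above should list `p1` with both of its files; an image without a matching .txt file is simply left out, which makes missing captions easy to spot.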


Tarun Narayanan · Answer 6 · Mon Sep 13 2021 15:10:30 GMT+0800 (China Standard Time)

Hi, I prepared my dataset in that structure and ran the command below:
python train.py --model_name RN50 --folder /data/depop/data_org/clip/data/ --batch_size 512 --gpus 1

I'm getting an AssertionError from the cosine_annealing_warmup package on the line
assert warmup_steps < first_cycle_steps

What's happening here? Please help me out.

Tarun Narayanan · Answer 7 · Mon Sep 13 2021 16:53:52 GMT+0800 (China Standard Time)

Okay, so is the warmup_step hardcoded to 2000 in models/wrapper.py? My dataset is currently too small for num_training_steps to exceed 2000.
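
The arithmetic behind that assertion can be sketched as follows. This is an assumption-laden illustration, not the code in models/wrapper.py: it assumes the scheduler's first cycle spans the total number of training steps, computed as batches per epoch times epochs. The function names, the dataset size of 10,000 pairs, and the 32 epochs are hypothetical; only the batch size of 512, the warmup of 2000, and the asserted condition come from this thread.

```python
import math


def total_training_steps(num_pairs, batch_size, max_epochs):
    """Optimizer steps for a full run, assuming one step per batch (hypothetical helper)."""
    steps_per_epoch = math.ceil(num_pairs / batch_size)
    return steps_per_epoch * max_epochs


def warmup_fits(warmup_steps, first_cycle_steps):
    """Mirror the condition asserted by the scheduler: warmup_steps < first_cycle_steps."""
    return warmup_steps < first_cycle_steps


# Hypothetical small dataset: 10,000 pairs, batch size 512, 32 epochs.
steps = total_training_steps(num_pairs=10_000, batch_size=512, max_epochs=32)
print(steps)                     # 640
print(warmup_fits(2000, steps))  # False, so the assertion would fail
```

Under these assumptions, any run with fewer total steps than the warmup length trips the assertion, so either the dataset or epoch count has to grow, or the warmup length has to shrink.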

singularity014 · Answer 8 · Tue Oct 05 2021 20:39:04 GMT+0800 (China Standard Time)

Hi, does the .txt file here contain a text caption?
Let's say I have to create my own pairs of images and text captions. Could you please tell me whether the assumption below is correct?

If I want to fine-tune the CLIP model on pairs of images and captions, would this work?

data/images/1_german_sheperd.jpg
data/label/1_german_sheperd.txt
data/images/2_german_sheperd.jpg
data/label/2_german_sheperd.txt

where,

1_german_sheperd.txt contains a caption like "A sleeping German shepherd Dog"
2_german_sheperd.txt contains a caption like "An angry barking German shepherd Dog"
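
If the captions start out in a single table or dictionary, the one-file-per-caption layout above can be generated with a few lines. This is a minimal sketch under that assumption; `write_caption_files` is a hypothetical helper and not part of the train-CLIP package.

```python
from pathlib import Path


def write_caption_files(captions, label_dir):
    """Write one .txt file per image, named after the image's stem (hypothetical helper).

    captions maps an image file name to its caption text.
    """
    label_dir = Path(label_dir)
    label_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_name, caption in captions.items():
        txt_path = label_dir / (Path(image_name).stem + ".txt")
        txt_path.write_text(caption, encoding="utf-8")
        written.append(txt_path)
    return written


if __name__ == "__main__":
    # File names and captions taken from the example above.
    write_caption_files(
        {
            "1_german_sheperd.jpg": "A sleeping German shepherd Dog",
            "2_german_sheperd.jpg": "An angry barking German shepherd Dog",
        },
        "data/label",
    )
```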

Romain Beaumont · Answer 9 · Wed Oct 06 2021 06:08:44 GMT+0800 (China Standard Time)

yes
I'm surprised how much this is confusing people

singularity014 · Answer 10 · Wed Oct 06 2021 11:57:24 GMT+0800 (China Standard Time)

Actually, creating a file per caption (or label) didn't make much sense to me, hence the question.

bk-201jk · Answer 11 · Tue Nov 02 2021 10:11:20 GMT+0800 (China Standard Time)

@tarunn2799 Hi, I would like to know whether this problem has been solved:

"Okay so in models/wrapper.py is the warmup_step hardcoded to 2000? My dataset currently is much smaller for the num_training_steps to be bigger than 2000."

Thanks for your time.

iremonur · Answer 12 · Tue Nov 23 2021 18:05:11 GMT+0800 (China Standard Time)

Hi @bk-201jk, I faced the same issue and solved it thanks to @ymzhu19eee in issue #20.

bk-201jk · Answer 13 · Tue Nov 23 2021 18:13:08 GMT+0800 (China Standard Time)

@iremonur Thank you very much! I would like to know how many photos are in your dataset, and how you set up your directory structure. What is in the .txt files, or are their contents in the title? I would appreciate it if I could see a sample of data from your dataset!

iremonur · Answer 14 · Wed Nov 24 2021 16:00:02 GMT+0800 (China Standard Time)

I'm planning to prepare a 100k dataset (image-text pairs) for fine-tuning, but first I wanted to see if the code would work by running it with only 3 image-text pairs. The folder structure is as follows:
train-CLIP/data/img/1.png
train-CLIP/data/caption/1.txt
One of the texts reads: "There is a car on the road."

bk-201jk · Answer 15 · Wed Nov 24 2021 16:04:32 GMT+0800 (China Standard Time)

@iremonur Thank you very much. If you can run the code with only 3 image-text pairs, please tell me. Thanks again!