jolibrain / joliGEN

Generative AI Image Toolset with GANs and Diffusion for Real-World Applications

Home Page: https://www.joligen.com

Error when loading the dataset with Release 1.0.0

hsleiman1 opened this issue · comments

Hello,

I tried to train using Release 1.0.0, but I get this issue when loading the dataset, and then the training freezes.

The error: list index out of range domain B data loading for /home/........./snowy/imgs/0ce5101a-36562e6a.jpg

The same command works well with the code version from 1st August 2023.

Are there some options to change for using Release 1.0.0?

Thank you

Have you tried adding or removing --data_relative_paths, depending on your dataset? And --data_sanitize_paths if your dataset is missing images?

I did not use --data_relative_paths since I am using absolute paths.

I will check with --data_sanitize_paths.

The question is why the same configuration works well, with no warnings or errors, with the previous code but not with Release 1.0.0. Thanks!

There are additional checks on the data and many changes to the dataloader structure. You may want to give more details here, such as the exact inner structure of the dataset and the exact command line.

Sure, the command is as follows:

python3 train.py --dataroot /home/ubuntu/clear2snowy/ --checkpoints_dir /home/ubuntu/checkpoints --name clear2snowy --output_display_freq 50 --output_print_freq 50 --train_G_lr 0.0002 --train_D_lr 0.0001 --data_crop_size 256 --data_load_size 512 --data_dataset_mode unaligned_labeled_mask_online --model_type cut --train_batch_size 14 --train_iter_size 2 --model_input_nc 3 --model_output_nc 3 --f_s_net segformer --f_s_config_segformer models/configs/segformer/segformer_config_b0.json --train_mask_f_s_B --f_s_semantic_nclasses 11 --G_netG segformer_attn_conv --G_config_segformer models/configs/segformer/segformer_config_b0.json --data_online_creation_crop_size_A 512 --data_online_creation_crop_delta_A 64 --data_online_creation_mask_delta_A 64 --data_online_creation_crop_size_B 512 --data_online_creation_crop_delta_B 64 --dataaug_D_noise 0.01 --data_online_creation_mask_delta_B 64 --alg_cut_nce_idt --train_sem_use_label_B --D_netDs projected_d basic vision_aided --D_proj_interp 512 --D_proj_network_type vitsmall --train_G_ema --G_padding_type reflect --train_optim adam --dataaug_no_rotate --train_sem_idt --model_multimodal --train_mm_nz 16 --G_netE resnet_256 --f_s_class_weights 1 10 10 1 5 5 10 10 30 50 50 --gpu_id 0,1,2,3 --train_semantic_mask --output_display_aim_server 127.0.0.1 --output_display_visdom_port 8501

The dataset structure is as follows:
├── clear
│   ├── bbox
│   └── imgs
├── snowy
│   ├── bbox
│   └── imgs
├── trainA
│   └── paths.txt
└── trainB
    └── paths.txt

The paths.txt files give, for each image, the path to its bbox file, for example:
/home/ubuntu/clear2snowy/snowy/imgs/00091078-7cff8ea6.jpg /home/ubuntu/clear2snowy/snowy/bbox/00091078-7cff8ea6.txt

error: list index out of range domain B data loading for /home/........./snowy/imgs/0ce5101a-36562e6a.jpg

After investigation, this is because your number of semantic classes is wrong: class 11 cannot be found in the list of mask delta values. You need to account for the background class and set 12 classes.
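The failure mode can be illustrated with a toy example (this is not joliGEN's actual code; it only assumes that a per-class list of mask delta values is indexed by the bbox class id):

```python
# With 11 semantic classes, a per-class list has valid indices 0..10,
# so a bbox carrying class id 11 falls off the end of the list.
mask_delta_per_class = [64] * 11  # as with --f_s_semantic_nclasses 11

def delta_for_class(class_id):
    try:
        return mask_delta_per_class[class_id]
    except IndexError:
        # This is the "list index out of range" seen while loading domain B.
        return None

print(delta_for_class(10))  # last valid class id
print(delta_for_class(11))  # out of range -> None
```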

You can easily see this by looking at bbox/0ce5101a-36562e6a.txt, where a bbox has class 11: 11 573 214 762 311.

You need to use --f_s_semantic_nclasses 12 and --f_s_class_weights 1 10 10 1 5 5 10 10 30 50 50 50 (here the value 50 is given to class 12; you may want to modify it as needed).
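The two options must stay consistent: --f_s_class_weights needs exactly one weight per class, background included. A small sketch of that check (a hypothetical helper, not part of joliGEN):

```python
# Hypothetical consistency check between the class count and the weight list.
def check_class_weights(nclasses, class_weights):
    """Ensure one weight per semantic class, background included."""
    if len(class_weights) != nclasses:
        raise ValueError(
            f"got {len(class_weights)} class weights "
            f"but nclasses is {nclasses}"
        )

# Corrected settings from this issue: 12 classes, 12 weights.
check_class_weights(12, [1, 10, 10, 1, 5, 5, 10, 10, 30, 50, 50, 50])
```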