jolibrain / joliGEN

Generative AI Image Toolset with GANs and Diffusion for Real-World Applications

Home Page: https://www.joligen.com

A question about training on weather conditions

hsleiman1 opened this issue

Hello,

I ran a clear-to-snowy training on BDD100K. After epoch 15, the loss is almost stable. Is this normal?

[screenshot: training loss curves]

Hi @hsleiman1, in order for us to help you, could you please provide the full command line you used and any relevant information to reproduce your problem (Python version, OS and OS version, PyTorch version, GPU type, ...)?
Thank you!

Also try without multimodal first.
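Concretely, this means rerunning the exact same command with the --model_multimodal option removed and all other options unchanged, e.g. (abridged, the ellipsis standing in for the rest of your options):

python train.py --dataroot datasets/clear2snowy --name clear2snowy ... (same options as above, minus --model_multimodal)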

Hello,

The training command is as follows:

python train.py --dataroot datasets/clear2snowy --checkpoints_dir checkpoints --name clear2snowy --output_display_env clear2snowy --output_display_freq 50 --output_print_freq 50 --train_G_lr 0.0002 --train_D_lr 0.0001 --data_crop_size 512 --data_load_size 512 --data_dataset_mode unaligned_labeled_mask_online --model_type cut --train_batch_size 3 --train_iter_size 4 --model_input_nc 3 --model_output_nc 3 --f_s_net segformer --f_s_config_segformer models/configs/segformer/segformer_config_b0.py --train_mask_f_s_B --f_s_semantic_nclasses 11 --G_netG segformer_attn_conv --G_config_segformer models/configs/segformer/segformer_config_b0.json --data_online_creation_crop_size_A 512 --data_online_creation_crop_delta_A 64 --data_online_creation_mask_delta_A 64 --data_online_creation_crop_size_B 512 --data_online_creation_crop_delta_B 64 --dataaug_D_noise 0.01 --data_online_creation_mask_delta_B 64 --alg_cut_nce_idt --train_sem_use_label_B --D_netDs projected_d basic vision_aided --D_proj_interp 512 --D_proj_network_type vitsmall --train_G_ema --G_padding_type reflect --train_optim adam --dataaug_no_rotate --train_sem_idt --model_multimodal --train_mm_nz 16 --G_netE resnet_512 --f_s_class_weights 1 10 10 1 5 5 10 10 30 50 50 --output_display_aim_server 127.0.0.1 --output_display_visdom_port 8501 --gpu_id 0,1,2,3 

I am using 4 NVIDIA L4 GPUs and torch==2.0.1.

Is this information sufficient?

Also try without multimodal first.

Could you please give more details on this, or a link?

Thanks!

Hi, I have tried removing the --model_multimodal option; these are the current results. Is this better in your opinion? Should I continue the training? Thanks!

[screenshot: training results without --model_multimodal]

@hsleiman1 you are missing the --train_semantic_mask option, so the semantic network is not trained. You can see this in Visdom, since there is no f_s loss.

Additionally, it is --f_s_config_segformer models/configs/segformer/segformer_config_b0.json and not .py.
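In other words, the f_s-related options should read as follows (this excerpt matches the corrected command below):

--f_s_net segformer --f_s_config_segformer models/configs/segformer/segformer_config_b0.json --train_semantic_mask --train_mask_f_s_B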

Thank you, I will run with the following configuration and check:

python train.py --dataroot datasets/clear2snowy --checkpoints_dir checkpoints2 --name clear2snowy2 --output_display_env clear2snowy2 --output_display_freq 50 --output_print_freq 50 --train_G_lr 0.0002 --train_D_lr 0.0001 --data_crop_size 512 --data_load_size 512 --data_dataset_mode unaligned_labeled_mask_online --model_type cut --train_batch_size 3 --train_iter_size 4 --model_input_nc 3 --model_output_nc 3 --f_s_net segformer --f_s_config_segformer models/configs/segformer/segformer_config_b0.json --train_semantic_mask --train_mask_f_s_B --f_s_semantic_nclasses 11 --G_netG segformer_attn_conv --G_config_segformer models/configs/segformer/segformer_config_b0.json --data_online_creation_crop_size_A 512 --data_online_creation_crop_delta_A 64 --data_online_creation_mask_delta_A 64 --data_online_creation_crop_size_B 512 --data_online_creation_crop_delta_B 64 --dataaug_D_noise 0.01 --data_online_creation_mask_delta_B 64 --alg_cut_nce_idt --train_sem_use_label_B --D_netDs projected_d basic vision_aided --D_proj_interp 512 --D_proj_network_type vitsmall --train_G_ema --G_padding_type reflect --train_optim adam --dataaug_no_rotate --train_sem_idt --train_mm_nz 16 --G_netE resnet_512 --f_s_class_weights 1 10 10 1 5 5 10 10 30 50 50 --output_display_aim_server 127.0.0.1 --output_display_visdom_port 8501 --gpu_id 0,1,2,3

@hsleiman1 FYI, I've tested 3 configurations over 3 runs and they all work for me, i.e. clear2snowy proceeds as expected, at least from a visual-inspection standpoint.
The tested configurations include using the SAM discriminator in addition to all the others.

Hello, the results after 65 epochs are as follows; please give me your feedback:

[screenshot: results after 65 epochs]

We can see in the following examples that the results are worse than at the beginning of the training process:

[images: conf1, conf2, conf3]

Thank you!

This is not enough information to understand what is happening. You need to look at mask conservation, every D loss, etc. The last image seems almost impossible: G moving to clear weather to satisfy the discriminator, whereas it would be much easier to do so while remaining in night mode. This may point to a dataset issue, overfitting, or something else. I have never seen this on BDD100K.

I've put my recent run here: https://www.joligen.com/stuff/bdd100k/test_clear2snowy_0723.tar
You can compare it to yours: options, model inferences, etc. It looks fine after 12 epochs.

Hello,

Thank you for your help. The training looks better now. Here is an example after 37 epochs.

[screenshot: example result after 37 epochs]

The results look better. What algorithm did you use to enhance the resolution of the generated images?

You can use the generator for inference on the full-size images directly. The generator is either fully convolutional (resnet, mobilenet, unet) or directly integrates an upsampling step (segformer).
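To illustrate the fully convolutional case, here is a minimal PyTorch sketch (not joliGEN's actual API; the tiny generator below is a hypothetical stand-in): since no layer depends on a fixed spatial size, a model trained on 512x512 crops can be applied to a full-resolution frame as-is.

```python
# Minimal sketch with a hypothetical stand-in generator (not joliGEN's API):
# a fully convolutional network has no size-dependent layers, so a model
# trained on 512x512 crops runs unchanged on full-resolution frames.
import torch
import torch.nn as nn

netG = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, kernel_size=3, padding=1), nn.Tanh(),
)

frame = torch.rand(1, 3, 720, 1280)  # a full-size BDD100K frame, NCHW
with torch.no_grad():
    out = netG(frame)  # output keeps the 720x1280 resolution
print(out.shape)  # torch.Size([1, 3, 720, 1280])
```

The segformer generator, by contrast, reduces resolution internally and recovers it through its final upsampling step, so it too returns an output at the input resolution.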