finetuning ldm decoder: noisy output
rahimentezari commented
Hi
I want to fine-tune the LDM decoder and have a few problems:
- Some parameters from https://justpaste.it/cse0x are missing in `finetune_ldm_decoder`, for example `"lambda_mse": 0.5` and `"lambda_lpips": 1`. Should we remove them from the param list?
- As training progresses, even after 6K iterations (you use only 100 iterations, right?), I get noisy outputs:
[attached images: `6000_train_d0`, `6000_train_orig`, `6000_train_w`]
Here are my configs:
```
python finetune_ldm_decoder.py --num_keys 1 \
    --ldm_config configs/v2-inference.yaml \
    --ldm_ckpt v2-1_512-ema-pruned.ckpt \
    --msg_decoder_path dec_48b.pth \
    --decoder_depth 8 \
    --decoder_channels 64 \
    --loss_i "watson-vgg" \
    --loss_w "bce" \
    --lambda_i 0.2 \
    --lambda_w 1.0 \
    --optimizer "AdamW,lr=5e-4" \
    --train_dir coco2014/train2014 \
    --val_dir coco2014/test2014 \
    --steps 10000 \
    --warmup_steps 100 \
    --batch_size 16
```
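For reference, a minimal sketch of the objective these flags imply (plain MSE stands in for the Watson-VGG perceptual term to keep it self-contained; all names are illustrative, not the repo's exact code):

```python
# Sketch of the loss the flags above describe:
#   loss = lambda_w * BCE(extracted bits, key) + lambda_i * perceptual(x_w, x_orig)
# MSE replaces the Watson-VGG perceptual term here for self-containment.
import torch
import torch.nn.functional as F

def finetune_loss(decoded_logits: torch.Tensor,  # extractor output, (B, 48)
                  key: torch.Tensor,             # target bits in {0,1}, (B, 48)
                  x_w: torch.Tensor,             # image from fine-tuned decoder
                  x_orig: torch.Tensor,          # image from original decoder
                  lambda_i: float = 0.2,
                  lambda_w: float = 1.0) -> torch.Tensor:
    loss_w = F.binary_cross_entropy_with_logits(decoded_logits, key)  # --loss_w "bce"
    loss_i = F.mse_loss(x_w, x_orig)  # stand-in for --loss_i "watson-vgg"
    return lambda_w * loss_w + lambda_i * loss_i
```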
- If I want to change the decoder to another one, can I still use the same trained HiDDeN extractor? I gave it a try with another decoder with `z_channels=8` and I am getting noisy `train_w` images (purple images):
```
Train [ 760/1000]  eta: 0:01:45  iteration: 750.000000 (380.000000)  loss: 0.190190 (0.414728)  loss_w: 0.051896 (0.189214)  loss_i: 0.692773 (1.127573)  psnr: 25.542080 (inf)  bit_acc_avg: 1.000000 (0.927698)  word_acc_avg: 1.000000 (0.459921)  lr: 0.000076 (0.000321)  time: 0.428403  data: 0.000091  max mem: 42627
```
Pierre Fernandez commented
Hi,
- Yes, you can remove them from the param list.
- The `6000_train_d0` images are decoded with the original decoder (D_o), which is not changed during the optimization, so I would say the issue is not in the fine-tuning. Does it only happen after some fine-tuning steps? Can you try to encode/decode an image and see what it looks like? (A minimal round-trip sketch is below.)
- Yes, you should be able to switch decoders, as they are independent of the extractor (in the paper, we fine-tuned other decoders, such as the ones used for inpainting or super-resolution, which differ from the original one).
You can also share the full logs and code to reproduce.
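A minimal sketch of such an encode/decode round trip, assuming the standard `ldm` codebase is importable; the paths, image file, and resolution are placeholders to adapt to your setup:

```python
# Round-trip an image through the LDM autoencoder to check whether the
# noise comes from encode/decode itself, independently of any fine-tuning.
import numpy as np
import torch
from omegaconf import OmegaConf
from PIL import Image
from ldm.util import instantiate_from_config

config = OmegaConf.load("configs/v2-inference.yaml")
state = torch.load("v2-1_512-ema-pruned.ckpt", map_location="cpu")
model = instantiate_from_config(config.model)
model.load_state_dict(state["state_dict"], strict=False)
model.eval().cuda()

# Load an image, scale to [-1, 1], NCHW
img = Image.open("sample.png").convert("RGB").resize((512, 512))
x = torch.from_numpy(np.asarray(img)).float() / 127.5 - 1.0
x = x.permute(2, 0, 1).unsqueeze(0).cuda()

with torch.no_grad():
    posterior = model.encode_first_stage(x)
    z = model.get_first_stage_encoding(posterior)  # sample latent + scale
    x_rec = model.decode_first_stage(z)            # back to pixel space

rec = ((x_rec.clamp(-1, 1) + 1) * 127.5).squeeze(0)
rec = rec.permute(1, 2, 0).cpu().numpy().astype(np.uint8)
Image.fromarray(rec).save("sample_rec.png")  # compare visually with the input
```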