facebookresearch / stable_signature

Official implementation of the paper "The Stable Signature: Rooting Watermarks in Latent Diffusion Models"

Fine-tuning the LDM decoder: noisy output

rahimentezari opened this issue

Hi,
I wanted to fine-tune the LDM decoder and ran into two problems:

  1. There are some parameters missing from finetune_ldm_decoder compared to https://justpaste.it/cse0x, for example "lambda_mse": 0.5 and "lambda_lpips": 1. Should we remove them from the param list?
  2. As training goes on, even after 6K iterations (you use only 100 iterations, right?), I get noisy outputs:
    [attached images: 6000_train_d0, 6000_train_orig, 6000_train_w]
    Here are my configs:

python finetune_ldm_decoder.py --num_keys 1 \
    --ldm_config configs/v2-inference.yaml \
    --ldm_ckpt v2-1_512-ema-pruned.ckpt \
    --msg_decoder_path dec_48b.pth \
    --decoder_depth 8 \
    --decoder_channels 64 \
    --loss_i "watson-vgg" \
    --loss_w "bce" \
    --lambda_i 0.2 \
    --lambda_w 1.0 \
    --optimizer "AdamW,lr=5e-4" \
    --train_dir coco2014/train2014 \
    --val_dir coco2014/test2014 \
    --steps 10000 \
    --warmup_steps 100 \
    --batch_size 16

  3. If I want to change the decoder to another one, can I still use the same trained HiDDeN network? I gave it a try with another decoder with z_channels=8 and I am getting noisy train_w images (purple images):
    Train [ 760/1000] eta: 0:01:45 iteration: 750.000000 (380.000000) loss: 0.190190 (0.414728) loss_w: 0.051896 (0.189214) loss_i: 0.692773 (1.127573) psnr: 25.542080 (inf) bit_acc_avg: 1.000000 (0.927698) word_acc_avg: 1.000000 (0.459921) lr: 0.000076 (0.000321) time: 0.428403 data: 0.000091 max mem: 42627

Hi,

  1. Yes.
  2. The 6000_train_d0 images are decoded by the original decoder (D_o), which is not changed during optimization, so I would say the issue is not in the fine-tuning. Does it only happen after some fine-tuning steps? Can you try to encode/decode an image and see what the result looks like? (See the sketch after this list.)
  3. Yes, you should be able to switch decoders, as they are independent of the extractor (in the paper, we did fine-tune other decoders, like the ones used for inpainting or super-resolution, which differ from the original one).
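For point 2, a minimal round-trip check could look roughly like this (a sketch assuming the standard CompVis loading utilities; paths and variable names are illustrative and the exact loading code in finetune_ldm_decoder.py may differ):

```python
import torch
from PIL import Image
from omegaconf import OmegaConf
from torchvision import transforms
from ldm.util import instantiate_from_config

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the LDM and grab its autoencoder (first_stage_model); this is the
# original decoder D_o, untouched by the watermark fine-tuning.
config = OmegaConf.load("configs/v2-inference.yaml")
model = instantiate_from_config(config.model)
state_dict = torch.load("v2-1_512-ema-pruned.ckpt", map_location="cpu")["state_dict"]
model.load_state_dict(state_dict, strict=False)
ldm_ae = model.first_stage_model.to(device).eval()

# Map an image to [-1, 1], the range the VAE expects.
to_tensor = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(512),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),
])
img = to_tensor(Image.open("sample.png").convert("RGB")).unsqueeze(0).to(device)

with torch.no_grad():
    z = ldm_ae.encode(img).mode()   # deterministic latent
    rec = ldm_ae.decode(z)          # reconstruction in [-1, 1]

# If the config/checkpoint pair is correct, the reconstruction should look
# clean; noise here points to a loading/config problem, not the fine-tuning.
rec = (rec.clamp(-1, 1) + 1) / 2
transforms.ToPILImage()(rec.squeeze(0).cpu()).save("reconstruction.png")
```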

You can also share the full logs and code to reproduce.
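For point 3, one way to check that the extractor really is independent of which decoder produced the image is to measure the bit accuracy directly. A sketch, where `msg_decoder` and `key` stand for whatever finetune_ldm_decoder.py loaded, and the images are preprocessed the same way as during fine-tuning:

```python
import torch

@torch.no_grad()
def bit_accuracy(imgs, key, msg_decoder):
    """imgs: (B, 3, H, W), preprocessed as during fine-tuning;
    key: (num_bits,) tensor of 0/1 bits; msg_decoder: the HiDDeN extractor."""
    logits = msg_decoder(imgs)              # (B, num_bits)
    decoded = (logits > 0).float()          # logit > 0 read as bit 1
    return (decoded == key.to(imgs.device)).float().mean().item()

# After fine-tuning, images produced by the z_channels=8 decoder should still
# give a bit accuracy close to 1; a much lower value points at the decoder
# swap or the preprocessing rather than the extractor.
# acc = bit_accuracy(imgs_w, key, msg_decoder)
# print(f"bit accuracy: {acc:.3f}")
```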