Which "Stable Diffusion Image Variations Model" did you fine-tune?
BlingHe opened this issue
You may find details here: https://huggingface.co/lambdalabs/sd-image-variations-diffusers
I noticed that the "in_channels" of this image variations model is 4, but your UNet needs 8 input channels to take the additional "image_latent". How was your modified UNet trained? Was it trained jointly with the domain switcher and cross-domain attention, or pre-trained before the other modules were trained?
Thanks in advance!
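
To make the channel mismatch concrete: one common way to feed an extra 4-channel latent into a 4-channel UNet is to widen its first convolution to 8 input channels, copy the pretrained weights into the first 4, and zero-initialise the rest. The snippet below is only a minimal sketch of that idea in plain PyTorch. The helper `expand_conv_in`, the stand-in layer `conv_in`, and the tensor shapes are assumptions for illustration, not the authors' code, and whether the repository does this is exactly what the question asks.

```python
import torch
import torch.nn as nn


def expand_conv_in(conv: nn.Conv2d, new_in_channels: int) -> nn.Conv2d:
    """Hypothetical helper: copy `conv` into a layer with more input channels.

    The original weights fill the first input channels; the extra ones start
    at zero, so the widened layer initially ignores the extra input.
    """
    old_in = conv.in_channels
    assert new_in_channels >= old_in
    new_conv = nn.Conv2d(
        new_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight.zero_()
        new_conv.weight[:, :old_in].copy_(conv.weight)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for the first layer of a 4-channel latent UNet (assumed sizes).
conv_in = nn.Conv2d(4, 320, kernel_size=3, padding=1)
conv_in8 = expand_conv_in(conv_in, 8)

noisy_latent = torch.randn(1, 4, 32, 32)   # usual UNet input
image_latent = torch.randn(1, 4, 32, 32)   # additional conditioning latent
x = torch.cat([noisy_latent, image_latent], dim=1)  # 8 channels
out = conv_in8(x)
```

With this initialisation the widened layer reproduces the pretrained layer's output at the start, so the rest of the network could be fine-tuned from the 4-channel checkpoint rather than from scratch.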