damian0815 / compel

A prompting enhancement library for transformers-type text embedding systems

Compel fails with SDXL Refiner

hosseinsarshar opened this issue · comments

Hi,

I used the example for SDXL for the base model and it works perfectly fine.
However, it fails with the refiner or with image-to-image. I get this error:

File /anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File /anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py:965, in StableDiffusionXLImg2ImgPipeline.__call__(self, prompt, prompt_2, image, strength, num_inference_steps, denoising_start, denoising_end, guidance_scale, negative_prompt, negative_prompt_2, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds, output_type, return_dict, callback, callback_steps, cross_attention_kwargs, guidance_rescale, original_size, crops_coords_top_left, target_size, aesthetic_score, negative_aesthetic_score)
    963 # predict the noise residual
    964 added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
--> 965 noise_pred = self.unet(
    966     latent_model_input,
    967     t,
...
File /anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x768 and 1280x768)

And this is how I get the pipe_sdxl_refiner:

import torch
from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLImg2ImgPipeline

# model_id is defined elsewhere and is not shown in this issue
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print('device', device)
pipe_sdxl_refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16", use_safetensors=True).to(device)
pipe_sdxl_refiner.scheduler = EulerAncestralDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe_sdxl_refiner.enable_xformers_memory_efficient_attention()

Can you give an example of how you are initializing Compel? This is how I create the object for use with the refiner:

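from compel import Compel, ReturnedEmbeddingsType

# pipe is the refiner pipeline; only its second tokenizer and text encoder are used
compel = Compel(
    tokenizer=pipe.tokenizer_2,
    text_encoder=pipe.text_encoder_2,
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=True,
)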
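
For anyone puzzling over the error message itself, here is a minimal, library-free sketch of the shape check that fails. The numbers 154, 768 and 1280 come straight from the RuntimeError above. The reading of them is my assumption, not something the traceback states: 154 rows as 2 prompts (negative and positive) times 77 tokens, 768 as the width of the embeddings that were passed in, and 1280 as the width the refiner's layer expects, which would match embeddings built from text_encoder_2 alone. The helpers `can_multiply` and `describe` are hypothetical names for illustration only.

```python
# Shapes taken from the error message in this issue. The interpretation of
# each number is an assumption, not something the traceback states.
EMBED_ROWS = 154        # assumed: 2 prompts (negative + positive) x 77 tokens
EMBED_WIDTH = 768       # width of the embeddings that were passed in
EXPECTED_WIDTH = 1280   # input width the failing Linear layer expects


def can_multiply(mat1_shape, mat2_shape):
    """Return True if a (rows x k) matrix can be multiplied by a (k x cols) one."""
    return mat1_shape[1] == mat2_shape[0]


def describe(mat1_shape, mat2_shape):
    """Mimic the wording of the RuntimeError for incompatible shapes."""
    if can_multiply(mat1_shape, mat2_shape):
        return "ok: result is {}x{}".format(mat1_shape[0], mat2_shape[1])
    return "mat1 and mat2 shapes cannot be multiplied ({}x{} and {}x{})".format(
        mat1_shape[0], mat1_shape[1], mat2_shape[0], mat2_shape[1])


# mat2 in the error is the second operand of the multiply: (in_features, out_features).
weight_t = (EXPECTED_WIDTH, 768)

# 768-wide embeddings reproduce the failure reported above.
print(describe((EMBED_ROWS, EMBED_WIDTH), weight_t))

# 1280-wide embeddings (assumed to be what text_encoder_2 alone yields) would fit.
print(describe((EMBED_ROWS, EXPECTED_WIDTH), weight_t))
```

If that reading is right, it is why the Compel object above is built from tokenizer_2 and text_encoder_2 only. As far as I recall from the Compel README, the result is then used as `conditioning, pooled = compel(prompt)` and passed to the pipeline as `prompt_embeds=conditioning, pooled_prompt_embeds=pooled`. Treat that as a pointer to check against the README rather than a verified recipe for the refiner.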