Inconsistent parse errors when using .blend() incorrectly
JeLuF opened this issue
This prompt fails with a parse error: ("cow", "fish").blend(0.5,0.5) floating above the ocean
This prompt works: A ("cow", "fish").blend(0.5,0.5) floating above the ocean
This is the code used to test this:
from diffusers import StableDiffusionPipeline
from compel import Compel
pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)
prompts = ['("cow", "fish").blend(0.5,0.5) floating above the ocean']
prompt_embeds = compel(prompts)
images = pipeline(prompt_embeds=prompt_embeds).images
images[0].save("image0.jpg")
This is the stack trace:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\2.35\dev\ctest.py:8 in <module> │
│ │
│ 5 compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder) │
│ 6 │
│ 7 prompts = ['("cow", "fish").blend(0.5,0.5) floating above the ocean'] │
│ ❱ 8 prompt_embeds = compel(prompts) │
│ 9 images = pipeline(prompt_embeds=prompt_embeds).images │
│ 10 │
│ 11 images[0].save("image0.jpg") │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\torch\utils\_contextlib.py:115 in │
│ decorate_context │
│ │
│ 112 │ @functools.wraps(func) │
│ 113 │ def decorate_context(*args, **kwargs): │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │ │
│ 117 │ return decorate_context │
│ 118 │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\compel\compel.py:135 in __call__ │
│ │
│ 132 │ │ cond_tensor = [] │
│ 133 │ │ pooled = [] │
│ 134 │ │ for text_input in text: │
│ ❱ 135 │ │ │ output = self.build_conditioning_tensor(text_input) │
│ 136 │ │ │ │
│ 137 │ │ │ if self.requires_pooled: │
│ 138 │ │ │ │ cond_tensor.append(output[0]) │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\compel\compel.py:111 in │
│ build_conditioning_tensor │
│ │
│ 108 │ │ Build a conditioning tensor by parsing the text for Compel syntax, constructing │
│ 109 │ │ building a conditioning tensor from that Conjunction. │
│ 110 │ │ """ │
│ ❱ 111 │ │ conjunction = self.parse_prompt_string(text) │
│ 112 │ │ conditioning, _ = self.build_conditioning_tensor_for_conjunction(conjunction) │
│ 113 │ │ │
│ 114 │ │ if self.requires_pooled: │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\compel\compel.py:158 in parse_prompt_string │
│ │
│ 155 │ │ Parse the given prompt string and return a structured Conjunction object that re │
│ 156 │ │ """ │
│ 157 │ │ pp = PromptParser() │
│ ❱ 158 │ │ conjunction = pp.parse_conjunction(prompt_string) │
│ 159 │ │ return conjunction │
│ 160 │ │
│ 161 │ def describe_tokenization(self, text: str) -> List[str]: │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\compel\prompt_parser.py:330 in │
│ parse_conjunction │
│ │
│ 327 │ │ if len(prompt.strip()) == 0: │
│ 328 │ │ │ return Conjunction(prompts=[FlattenedPrompt([('', 1.0)])], weights=[1.0]) │
│ 329 │ │ │
│ ❱ 330 │ │ root = self.conjunction.parse_string(prompt) │
│ 331 │ │ verbose and print(f"'{prompt}' parsed to root", root) │
│ 332 │ │ #fused = fuse_fragments(parts) │
│ 333 │ │ #print("fused to", fused) │
│ │
│ D:\2.35\dev\stable-diffusion\env\lib\site-packages\pyparsing\core.py:1141 in parse_string │
│ │
│ 1138 │ │ │ │ raise │
│ 1139 │ │ │ else: │
│ 1140 │ │ │ │ # catch and re-raise exception from here, clearing out pyparsing interna │
│ ❱ 1141 │ │ │ │ raise exc.with_traceback(None) │
│ 1142 │ │ else: │
│ 1143 │ │ │ return tokens │
│ 1144 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ParseException: Expected {explicit_conjunction | {[Group:({lora}...)] {blend | Group:([{cross_attention_substitute | lora | attention | Forward: string enclosed in '"' | parenthesized_fragment | free_word |
Suppress:(<SP><TAB><CR><LF>)}]...)} [Group:({lora}...)] StringEnd}}, found 'floating' (at char 32), (line:1, col:33)
Versions:
- compel 2.0.2
- diffusers 0.19.3
- open-clip-torch 2.20.0
Neither of those prompts is correct usage of blend(), which can only blend entire prompts, not parts of a prompt. To do what you want, you should prompt ("a cow floating above the ocean", "a fish floating above the ocean").blend(0.5, 0.5). This is a design decision I made when designing the prompting system, to reflect limitations in how CLIP math works. But I guess I need to change the error handling so that it at least throws an error consistently.
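To illustrate the workaround, here is a small sketch of a helper that folds the shared context into each alternative so that blend() receives complete, standalone prompts. The helper name make_blend_prompt is hypothetical (it is not part of compel); it only builds the prompt string, which you would then pass to Compel as usual.

```python
def make_blend_prompt(subjects, weights, context=""):
    """Build a Compel blend prompt string from whole-prompt alternatives.

    Hypothetical helper, not part of compel. The shared `context` is
    appended to every subject, because blend() can only combine entire
    prompts, not fragments within a prompt.
    """
    if len(subjects) != len(weights):
        raise ValueError("subjects and weights must have the same length")
    # Quote each complete prompt: "a cow floating above the ocean", ...
    quoted = ", ".join(f'"{(s + " " + context).strip()}"' for s in subjects)
    weight_list = ",".join(str(w) for w in weights)
    return f"({quoted}).blend({weight_list})"

# Produces the whole-prompt blend the maintainer recommends:
prompt = make_blend_prompt(["a cow", "a fish"], [0.5, 0.5],
                           context="floating above the ocean")
print(prompt)
# ("a cow floating above the ocean", "a fish floating above the ocean").blend(0.5,0.5)
```

The resulting string can be placed in the `prompts` list of the repro script above in place of the failing prompt.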