asagi4 / comfyui-prompt-control

ComfyUI nodes for prompt editing and LoRA control

BUG: weights inside CLIP_L(...) don't parse correctly

asagi4 opened this issue · comments

Thank you!

clip_l and clip_g get concatenated with torch, as seen in https://github.com/comfyanonymous/ComfyUI/blob/6d281b4ff4ad3918a4f3b4ca4a8b547a2ba3bf80/comfy/sdxl_clip.py#L52-L56
So which weights do you use for the final conditioning?
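To make the concatenation concrete, here is a minimal sketch in plain Python that mimics joining the two encoder outputs along the feature axis, which is what a `torch.cat(..., dim=-1)` call would do. The helper name `concat_features` and the toy sizes are assumptions for illustration only; real SDXL embeddings are 768 wide for clip_l and 1280 wide for clip_g, and this is not the ComfyUI code itself.

```python
# Toy stand-ins for the two encoder outputs: 3 tokens each, with feature
# sizes 2 (for "l") and 3 (for "g"). These values are made up.
l_out = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
g_out = [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5], [1.6, 1.7, 1.8]]


def concat_features(l_out, g_out):
    """Hypothetical helper: join per-token vectors along the feature axis.

    Token i of the result is the "l" vector for token i followed by the
    "g" vector for token i, similar in spirit to torch.cat(..., dim=-1).
    """
    return [l_vec + g_vec for l_vec, g_vec in zip(l_out, g_out)]


emb = concat_features(l_out, g_out)
print(len(emb), len(emb[0]))  # number of tokens, combined feature width
```

In this sketch each token of `emb` simply carries its "l" features next to its "g" features; nothing is averaged or blended between the two halves.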

@mizukarada as far as I can tell that encoding happens outside anything my nodes touch.

The logic is the same as in CLIPTextEncodeSDXL

The node just encodes the l and g tokens and then calls clip.encode_from_tokens (or its equivalent from ADV_CLIP_emb if that's in use). What that does is up to ComfyUI.

@asagi4 For example,

clip_l: cat AND dog :1.2

clip_g: apple AND orange :1.3

clip_l and clip_g get merged into one tensor, which we'll call emb.

What is emb's weight? Is it 1.2 or 1.3, or the average of the two? Mind you, there can also be multiple weights in one text.

@mizukarada I don't think you can do that with my nodes. AND combining of prompts is processed after any clip_l / clip_g separation, since the l / g distinction happens at the token level and disappears once the tokens have been encoded into tensors (there are also the "pooled" vs. non-pooled tensors, but my nodes basically treat them identically).

Using AND inside the CLIP_L function doesn't make sense: CLIP_L(foo AND bar) will essentially be split at the AND and parsed as two prompts.
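As a rough illustration of why that happens, here is a minimal sketch of a naive AND splitter. It assumes the prompt is split on the bare keyword AND and that an optional trailing `:weight` applies to each piece, as in the examples earlier in this thread. The function name `split_and` and the regular expressions are hypothetical; the actual parser in these nodes may work differently.

```python
import re


def split_and(prompt):
    """Hypothetical splitter: cut on the keyword AND, read a trailing ':weight'.

    Returns a list of (text, weight) pairs; the weight defaults to 1.0.
    """
    parts = []
    for chunk in re.split(r"\bAND\b", prompt):
        chunk = chunk.strip()
        # Look for an optional numeric weight at the very end, e.g. "dog :1.2".
        match = re.search(r":\s*(-?\d+(?:\.\d+)?)\s*$", chunk)
        if match:
            parts.append((chunk[:match.start()].strip(), float(match.group(1))))
        else:
            parts.append((chunk, 1.0))
    return parts


print(split_and("cat AND dog :1.2"))
print(split_and("CLIP_L(foo AND bar)"))
```

With a splitter like this, the second call yields two pieces with unbalanced parentheses, so neither piece is a complete CLIP_L(...) call any more.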