laksjdjf / IPAdapter-ComfyUI

experimental

clip vision encode

cubiq opened this issue

Hope you don't mind my asking: why aren't you using the CLIPVisionEncode node anymore? Every time there's a change in Comfy's clip vision code, the IPAdapter node might break (as happened recently).

That's a good question.

There are two reasons why I do not use CLIPVisionEncode.

  1. CLIPVisionEncode does not output hidden_states, but IP-Adapter-plus requires them.
  2. IP-Adapter-plus needs a black image for the negative side. I think it is inconvenient for users to prepare a black image.

Relevant parts of the code:

    with precision_scope(comfy.model_management.get_autocast_device(clip_vision.load_device), torch.float32):
        outputs = clip_vision.model(pixel_values=pixel_values, output_hidden_states=True)
    if plus:
        cond = outputs.hidden_states[-2]
        with precision_scope(comfy.model_management.get_autocast_device(clip_vision.load_device), torch.float32):
            uncond = clip_vision.model(torch.zeros_like(pixel_values), output_hidden_states=True).hidden_states[-2]
    else:
        cond = outputs.image_embeds
        uncond = torch.zeros_like(cond)
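
To make the difference between the two branches concrete, here is a minimal pure-Python sketch of the same selection logic. It is not the repository's code and does not use CLIP. `toy_encoder`, `ToyOutputs`, `build_cond_uncond`, and the layer weights are all hypothetical stand-ins, chosen only so the numbers are easy to follow. The point it illustrates is that an encoder with bias terms maps an all-zero input to non-zero hidden states, so the plus branch has to run the encoder a second time, while the non-plus branch can simply zero the pooled embedding.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]


@dataclass
class ToyOutputs:
    """Stand-in for the encoder output; field names mirror the excerpt above."""
    hidden_states: List[Vector]  # the input first, then one entry per layer
    image_embeds: Vector         # pooled embedding computed from the last layer


def toy_encoder(pixel_values: Vector) -> ToyOutputs:
    """Hypothetical two-layer encoder: each layer scales its input and adds a bias."""
    layers = [(0.5, 1.0), (2.0, -0.25)]  # made-up (weight, bias) pairs
    hidden_states = [list(pixel_values)]
    for weight, bias in layers:
        hidden_states.append([weight * x + bias for x in hidden_states[-1]])
    last = hidden_states[-1]
    image_embeds = [sum(last) / len(last)]  # mean-pool the last layer
    return ToyOutputs(hidden_states, image_embeds)


def build_cond_uncond(pixel_values: Vector, plus: bool) -> Tuple[Vector, Vector]:
    """Pick conditional and unconditional features the way the excerpt does."""
    outputs = toy_encoder(pixel_values)
    if plus:
        # Plus variant: use the penultimate hidden state, and encode an
        # all-zero input to get the matching negative-side features.
        cond = outputs.hidden_states[-2]
        zeros = [0.0] * len(pixel_values)
        uncond = toy_encoder(zeros).hidden_states[-2]
    else:
        # Non-plus variant: pooled embedding, with plain zeros as the negative.
        cond = outputs.image_embeds
        uncond = [0.0] * len(cond)
    return cond, uncond


if __name__ == "__main__":
    print(build_cond_uncond([1.0, 2.0], plus=True))
    print(build_cond_uncond([1.0, 2.0], plus=False))
```

In the plus case the unconditional features come out as the first layer's bias rather than zeros, which is why they cannot be produced with `zeros_like(cond)` and why a node that returns only the pooled embedding would not be enough for this branch.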

That's interesting. Thanks for your answer.