pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP

RuntimeError: mixed dtype (CPU): expect parameter to have scalar type of Float

veecam opened this issue

Loading BLIP model...
Downloading vocab.txt: 100%|██████████████████████████████████████████████████████████| 226k/226k [00:00<00:00, 245kB/s]
Downloading tokenizer_config.json: 100%|█████████████████████████████████████████████| 28.0/28.0 [00:00<00:00, 84.1kB/s]
Downloading config.json: 100%|█████████████████████████████████████████████████████████| 570/570 [00:00<00:00, 1.34MB/s]
100%|█████████████████████████████████████████████████████████████████████████████| 1.66G/1.66G [1:12:37<00:00, 410kB/s]
load checkpoint from https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth
Loading CLIP model...
Downloading (…)ip_pytorch_model.bin: 100%|█████████████████████████████████████████| 3.94G/3.94G [03:25<00:00, 19.2MB/s]
Preprocessing artists:   0%|                                                                      | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/clip/app.py", line 42, in <module>
    ci = Interrogator(config)
  File "/usr/local/clip/clip-interrogator/clip_interrogator/clip_interrogator.py", line 70, in __init__
    self.load_clip_model()
  File "/usr/local/clip/clip-interrogator/clip_interrogator/clip_interrogator.py", line 105, in load_clip_model
    self.artists = LabelTable(artists, "artists", self.clip_model, self.tokenize, config)
  File "/usr/local/clip/clip-interrogator/clip_interrogator/clip_interrogator.py", line 265, in __init__
    text_features = clip_model.encode_text(text_tokens)
  File "/usr/local/lib/python3.10/dist-packages/open_clip/model.py", line 224, in encode_text
    x = self.transformer(x, attn_mask=self.attn_mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/open_clip/transformer.py", line 321, in forward
    x = r(x, attn_mask=attn_mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/open_clip/transformer.py", line 242, in forward
    x = q_x + self.ls_1(self.attention(q_x=self.ln_1(q_x), k_x=k_x, v_x=v_x, attn_mask=attn_mask))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/open_clip/transformer.py", line 18, in forward
    x = F.layer_norm(x.to(torch.float32), self.normalized_shape, self.weight, self.bias, self.eps)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: mixed dtype (CPU): expect parameter to have scalar type of Float

What should I do? I have tried it on both Windows and Linux and get this error on each, but it works on Colab.
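One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the last open_clip frame casts only the input to float32 (`x.to(torch.float32)`) and passes `self.weight` and `self.bias` through unchanged, so the error would be consistent with the CLIP weights having been loaded in half precision while the model runs on CPU. That would also fit it working on Colab, if the Colab runtime has a GPU. Below is a minimal sketch of the usual workaround, which is to pick the precision from the device and cast the model to float32 when it is not on CUDA. The helper names `precision_for` and `to_cpu_safe` are hypothetical and not part of clip-interrogator or open_clip; the only real API relied on is the `.float()` method that `torch.nn.Module` provides.

```python
def precision_for(device: str) -> str:
    """Return a precision label that is safe for the given device string."""
    # Half precision is only chosen for CUDA devices; anything else
    # (including "cpu") falls back to full float32.
    return "fp16" if device.startswith("cuda") else "fp32"


def to_cpu_safe(model, device: str):
    """Cast `model` to float32 when it will run on a non-CUDA device.

    `model` is assumed to expose a `.float()` method that returns the
    converted model, which is the contract of `torch.nn.Module.float()`.
    """
    if precision_for(device) == "fp32":
        return model.float()
    return model
```

If this guess is right, calling something like `to_cpu_safe(ci.clip_model, "cpu")` before any text is encoded, or making sure the device passed in the config really is `"cpu"` so the model is loaded as fp32 in the first place, should avoid the mixed-dtype check. Where exactly to apply it in `app.py` is an assumption, since that file is not shown here.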