towhee-io / towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Home Page: https://towhee.io


[Enhancement]: ops.image_text_embedding.clip use local model path

MrRace opened this issue · comments

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

ops.image_text_embedding.clip needs to support a local model path, perhaps like the checkpoint_path parameter in https://towhee.io/sentence-embedding/transformers

Why is this needed?

I have downloaded many models into a public directory. For example, I downloaded clip-vit-base-patch32 from https://huggingface.co/openai/clip-vit-base-patch32 and put the model in /home/model_zoo/CLIP/.
I tried the official example text_image_search/1_build_text_image_search_engine. I want to change ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='image') to ops.image_text_embedding.clip(model_name='/home/model_zoo/CLIP/clip-vit-base-patch32', modality='image'), which uses a local model path. However, it fails.

Anything else?

No response

https://towhee.io/image-text-embedding/clip/src/branch/main/clip.py#L95
Clip now supports checkpoint_path; make sure the locally cached clip code is the latest version.

@junjiejiangjjj
When setting only checkpoint_path:

TypeError: clip() missing 1 required positional argument: 'model_name'

From the source code:

real_name = self._configs()[model_name]

I have to set model_name, but when model_name is set it downloads from the network.
From https://towhee.io/image-text-embedding/clip/src/branch/main/clip.py#L41
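The failure mode above can be sketched in a few lines: the operator resolves model_name through a fixed name table, so a filesystem path has no entry and the lookup fails before checkpoint_path is ever consulted. This is a minimal illustration with a hypothetical _configs table, not the actual towhee source:

```python
# Hypothetical sketch of the operator's name lookup (not the real towhee code):
# a fixed table maps supported model names to Hugging Face repo ids.
def _configs():
    return {
        'clip_vit_base_patch16': 'openai/clip-vit-base-patch16',
        'clip_vit_base_patch32': 'openai/clip-vit-base-patch32',
    }

def resolve(model_name):
    # A local path is not a key in the table, so this raises KeyError.
    return _configs()[model_name]

try:
    resolve('/home/model_zoo/CLIP/clip-vit-base-patch32')
except KeyError as err:
    print('unknown model name:', err)
```

This is why passing a path as model_name cannot work with the current lookup, regardless of checkpoint_path.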

commented

checkpoint_path should point to your custom-trained weights; you also need to specify model_name to define the model architecture that is compatible with your weights. You can use the model name clip_vit_base_patch32, and checkpoint_path should be your weights file path.

@wxywb @junjiejiangjjj
From https://towhee.io/image-text-embedding/clip/src/branch/main/clip.py#L157
we can see that model_name cannot go beyond the four supported names. When using checkpoint_path, it seems model_name should no longer be required.
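One way to reconcile the two arguments is a resolver that keeps model_name as the architecture key for hub downloads but lets a local directory bypass the fixed name table. This is a hypothetical sketch, not the towhee implementation; the name set below is assumed from the four-name restriction mentioned above:

```python
import os

# Assumed name set (the actual four names live in clip.py#L157).
KNOWN_NAMES = {
    'clip_vit_base_patch16', 'clip_vit_base_patch32',
    'clip_vit_large_patch14', 'clip_vit_large_patch14_336',
}

def resolve_model_source(model_name):
    """Hypothetical resolver: local directories bypass the name table."""
    if os.path.isdir(model_name):
        return ('local', model_name)      # load directly from disk
    if model_name in KNOWN_NAMES:
        return ('hub', model_name)        # download from the hub
    raise ValueError(f'unknown model: {model_name}')
```

With a dispatch like this, a user could pass either a supported name or a path to a downloaded model directory without touching checkpoint_path.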

commented

I see the problem now: the operator's interface was initially designed for non-huggingface operators, and we didn't consider the huggingface-style model-loading usage. We will add huggingface-style model loading soon; thanks for your advice.

The code below seems to fix this problem @wxywb

from transformers import AutoConfig, AutoModel

def create_model(model_name, checkpoint_path, device):
    if checkpoint_path:
        # Build the architecture from model_name, then load local weights
        config = AutoConfig.from_pretrained(model_name)
        model = AutoModel.from_pretrained(checkpoint_path, config=config)
    else:
        model = AutoModel.from_pretrained(model_name)
    if hasattr(model, 'pooler') and model.pooler:
        model.pooler = None
    model.to(device)
    model.eval()
    return model

If the local model is a directory containing config.json and weights saved by save_pretrained(), you can just use the path to the local model directory as model_name.
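As a rough illustration of what such a directory looks like, the sketch below simulates one and checks for config.json (the marker file that transformers' save_pretrained() writes) before treating a model_name value as a local path. The helper name is hypothetical:

```python
import json
import os
import tempfile

def looks_like_local_model(path):
    """Heuristic: a save_pretrained() directory contains config.json."""
    return os.path.isdir(path) and os.path.isfile(os.path.join(path, 'config.json'))

# Simulate a directory produced by model.save_pretrained(model_dir)
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, 'config.json'), 'w') as f:
    json.dump({'model_type': 'clip'}, f)

print(looks_like_local_model(model_dir))                # True
print(looks_like_local_model('clip_vit_base_patch32'))  # False
```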