jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Home Page: https://clip-as-service.jina.ai


Multilingual Models with ONNX: Missing Python Package

mnbucher opened this issue · comments

I'm trying to use a multilingual model with the ONNX runtime and the prebuilt Docker image from the hub. However, when using a custom multilingual model, e.g. "M-CLIP/XLM-Roberta-Large-Vit-B-16Plus", the Docker container raises an exception:

ModuleNotFoundError: No module named 'transformers'

I quickly checked the source code in this repo, and indeed the code tries to load the tokenizer from Hugging Face when a multilingual model is used. However, this package is not included in the default Docker image.

How can I install this package while using Docker Compose? I think the published Docker images should be adapted to include the transformers library as well.

The specific code causing the issue is the following snippet from `server/clip_server/model/tokenization.py`:

class Tokenizer:
    def __init__(self, name: str, **kwargs):
        self._name = name
        if name in _MULTILINGUALCLIP_MODELS:
            import transformers

            self._tokenizer = transformers.AutoTokenizer.from_pretrained(name)
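Since `transformers` is imported lazily only for multilingual models, one possible mitigation (a sketch, not the project's actual code — the `_MULTILINGUALCLIP_MODELS` contents here are assumptions) is to guard the import and fail with an actionable message instead of a bare `ModuleNotFoundError`:

```python
# Hypothetical sketch: same lazy-import structure as the snippet above,
# but with a clearer error message when `transformers` is missing.
_MULTILINGUALCLIP_MODELS = {"M-CLIP/XLM-Roberta-Large-Vit-B-16Plus"}  # assumed subset


class Tokenizer:
    def __init__(self, name: str, **kwargs):
        self._name = name
        if name in _MULTILINGUALCLIP_MODELS:
            try:
                # `transformers` is only needed for multilingual models,
                # so it is imported lazily here.
                import transformers
            except ModuleNotFoundError as e:
                raise ModuleNotFoundError(
                    f"Model '{name}' is a multilingual CLIP model and requires "
                    "the `transformers` package. Install it with "
                    "`pip install transformers`."
                ) from e
            self._tokenizer = transformers.AutoTokenizer.from_pretrained(name)
```

Non-multilingual models are unaffected, since the import is never reached for them.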

I could fix this by using the following Dockerfile:

FROM jinaai/clip-server:master-onnx

# Install the required pip package
RUN pip install transformers

CMD [ "/cas/server/custom-config.yml" ]