jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Home Page: https://clip-as-service.jina.ai


Multilingual Models with ONNX: Missing Python Package

mnbucher opened this issue · comments

I'm trying to use a multilingual model with the ONNX runtime and the prebuilt Docker image from the hub. However, when using a custom multilingual model, e.g. "M-CLIP/XLM-Roberta-Large-Vit-B-16Plus", the Docker container raises an exception:

ModuleNotFoundError: No module named 'transformers'

I quickly checked the source code in this repo, and indeed the code tries to load the tokenizer from Hugging Face when a multilingual model is used. However, this package is not included in the default Docker image.

How can I install this package while using Docker Compose? I think the published Docker images should be adapted to include the transformers library as well.

The specific code causing the issue is the following snippet from `server/clip_server/model/tokenization.py`:

class Tokenizer:
    def __init__(self, name: str, **kwargs):
        self._name = name
        if name in _MULTILINGUALCLIP_MODELS:
            import transformers

            self._tokenizer = transformers.AutoTokenizer.from_pretrained(name)
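Since `transformers` is imported lazily only for multilingual models, one possible mitigation (a sketch, not the project's actual code — the `_MULTILINGUALCLIP_MODELS` contents here are assumptions) is to guard the import and fail with an actionable message instead of a bare `ModuleNotFoundError`:

```python
# Hypothetical sketch: same lazy-import structure as the snippet above,
# but with a clearer error message when `transformers` is missing.
_MULTILINGUALCLIP_MODELS = {"M-CLIP/XLM-Roberta-Large-Vit-B-16Plus"}  # assumed subset


class Tokenizer:
    def __init__(self, name: str, **kwargs):
        self._name = name
        if name in _MULTILINGUALCLIP_MODELS:
            try:
                # `transformers` is only needed for multilingual models,
                # so it is imported lazily here.
                import transformers
            except ModuleNotFoundError as e:
                raise ModuleNotFoundError(
                    f"Model '{name}' is a multilingual CLIP model and requires "
                    "the `transformers` package. Install it with "
                    "`pip install transformers`."
                ) from e
            self._tokenizer = transformers.AutoTokenizer.from_pretrained(name)
```

Non-multilingual models are unaffected, since the import is never reached for them.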

I could fix this by using the following Dockerfile:

FROM jinaai/clip-server:master-onnx

# Install the required pip package
RUN pip install transformers

CMD [ "/cas/server/custom-config.yml" ]