Unknown variant Qwen2
dprokhorov17 opened this issue
System Info
Linux, 64-bit
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Running the following Docker command:
docker run --name gte-Qwen2-1.5B-instruct --gpus device=1 -p 8080:80 -v ~/.cache/huggingface/hub:/data --pull always ghcr.io/huggingface/text-embeddings-inference:hopper-1.3 --model-id Alibaba-NLP/gte-Qwen2-1.5B-instruct --max-client-batch-size 10000
leads to:
2024-07-01T13:31:20.383693Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "Ali****-***/***-*****-*.**-*****uct", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 10000, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "8ca483949630", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-01T13:31:20.384029Z INFO hf_hub: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-07-01T13:31:20.530062Z INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:45: Downloading `1_Pooling/config.json`
2024-07-01T13:31:21.199753Z INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:108: Downloading `config_sentence_transformers.json`
2024-07-01T13:31:21.451810Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2024-07-01T13:31:21.451841Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:22: Downloading `config.json`
2024-07-01T13:31:21.702214Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:25: Downloading `tokenizer.json`
2024-07-01T13:31:22.364765Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:52: Downloading `model.safetensors`
2024-07-01T13:31:22.487955Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:55: Could not download `model.safetensors`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/resolve/main/model.safetensors)
2024-07-01T13:31:22.487993Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:60: Downloading `model.safetensors.index.json`
2024-07-01T13:31:22.738031Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:82: Downloading `model-00002-of-00002.safetensors`
2024-07-01T13:31:44.545567Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:82: Downloading `model-00001-of-00002.safetensors`
2024-07-01T13:32:39.816838Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:39: Model artifacts downloaded in 78.365019135s
2024-07-01T13:32:40.155131Z WARN tokenizers::tokenizer::serialization: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|endoftext|>' was expected to have ID '151643' but was given ID 'None'
2024-07-01T13:32:40.155165Z WARN tokenizers::tokenizer::serialization: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|im_start|>' was expected to have ID '151644' but was given ID 'None'
2024-07-01T13:32:40.155170Z WARN tokenizers::tokenizer::serialization: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|im_end|>' was expected to have ID '151645' but was given ID 'None'
2024-07-01T13:32:40.155528Z INFO text_embeddings_router: router/src/lib.rs:175: Maximum number of tokens per request: 32768
2024-07-01T13:32:40.174410Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:26: Starting 64 tokenization workers
2024-07-01T13:32:42.424118Z INFO text_embeddings_router: router/src/lib.rs:226: Starting model backend
Error: Could not create backend
Caused by:
Could not start backend: Model is not supported
Caused by:
unknown variant `qwen2`, expected one of `bert`, `xlm-roberta`, `camembert`, `roberta`, `distilbert`, `nomic_bert`, `mistral`, `new` at line 19 column 23
Expected behavior
The gte-Qwen2 model should load and be served successfully.
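For anyone triaging this: the final error appears to come from deserializing the model's `config.json`, where the `model_type` field is `qwen2` and the backend only accepts the variants listed in the message. Below is a minimal sketch of a pre-flight check you could run against a downloaded `config.json` before starting the container. The supported set is copied from the error message above, not read from text-embeddings-inference itself, and the function name `check_model_type` is hypothetical.

```python
import json
from typing import Optional, Tuple

# Variants copied verbatim from the error message in this issue.
# Assumption: this list reflects the image tag used here and may differ in other releases.
SUPPORTED_MODEL_TYPES = {
    "bert", "xlm-roberta", "camembert", "roberta",
    "distilbert", "nomic_bert", "mistral", "new",
}


def check_model_type(config_text: str) -> Tuple[Optional[str], bool]:
    """Return (model_type, is_supported) for the text of a config.json."""
    config = json.loads(config_text)
    model_type = config.get("model_type")
    return model_type, model_type in SUPPORTED_MODEL_TYPES


# Trimmed, illustrative config; the real file has many more fields.
example = '{"model_type": "qwen2"}'
print(check_model_type(example))  # ('qwen2', False)
```

To check a real model, pass the contents of the `config.json` found in your local Hugging Face cache instead of the `example` string.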