intel / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Repository from Github: https://github.com/intel/ipex-llm

The vLLM image fails to start

maxsnow opened this issue · comments

vLLM accesses huggingface.co directly.
During image startup, huggingface.co cannot be reached, so the container fails to start:

2025-03-07 18:32:47 File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
2025-03-07 18:32:47 self._target(*self._args, **self._kwargs)
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/ipex_llm/vllm/xpu/engine/engine.py", line 242, in run_mp_engine
2025-03-07 18:32:47 raise e # noqa
2025-03-07 18:32:47 ^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/ipex_llm/vllm/xpu/engine/engine.py", line 234, in run_mp_engine
2025-03-07 18:32:47 engine = IPEXLLMMQLLMEngine.from_engine_args(engine_args=engine_args,
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/ipex_llm/vllm/xpu/engine/engine.py", line 221, in from_engine_args
2025-03-07 18:32:47 return super().from_engine_args(engine_args, usage_context, ipc_path)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/engine/multiprocessing/engine.py", line 114, in from_engine_args
2025-03-07 18:32:47 engine_config = engine_args.create_engine_config(usage_context)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/engine/arg_utils.py", line 1066, in create_engine_config
2025-03-07 18:32:47 model_config = self.create_model_config()
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/engine/arg_utils.py", line 983, in create_model_config
2025-03-07 18:32:47 return ModelConfig(
2025-03-07 18:32:47 ^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/config.py", line 286, in init
2025-03-07 18:32:47 hf_config = get_config(self.model, trust_remote_code, revision,
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/transformers_utils/config.py", line 180, in get_config
2025-03-07 18:32:47 if is_gguf or file_or_path_exists(
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/vllm/transformers_utils/config.py", line 99, in file_or_path_exists
2025-03-07 18:32:47 return file_exists(model,
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
2025-03-07 18:32:47 return fn(*args, **kwargs)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/hf_api.py", line 2885, in file_exists
2025-03-07 18:32:47 get_hf_file_metadata(url, token=token)
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
2025-03-07 18:32:47 return fn(*args, **kwargs)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
2025-03-07 18:32:47 r = _request_wrapper(
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py", line 280, in _request_wrapper
2025-03-07 18:32:47 response = _request_wrapper(
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py", line 303, in _request_wrapper
2025-03-07 18:32:47 response = get_session().request(method=method, url=url, **params)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/requests/sessions.py", line 589, in request
2025-03-07 18:32:47 resp = self.send(prep, **send_kwargs)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/requests/sessions.py", line 703, in send
2025-03-07 18:32:47 r = adapter.send(request, **kwargs)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py", line 96, in send
2025-03-07 18:32:47 return super().send(request, *args, **kwargs)
2025-03-07 18:32:47 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 18:32:47 File "/usr/local/lib/python3.11/dist-packages/requests/adapters.py", line 713, in send
2025-03-07 18:32:47 raise ReadTimeout(e, request=request)
2025-03-07 18:32:47 requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 7bee7530-d2d2-4544-a13d-4e79cf7d19a8)')
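For reference, the exception is raised by huggingface_hub while vLLM resolves the model config over the network. A quick way to confirm the network symptom from inside the container (a minimal check, not specific to ipex-llm) is:

    # Check whether huggingface.co answers within the same 10s window
    # that huggingface_hub uses for its read timeout above.
    curl -sS --max-time 10 https://huggingface.co >/dev/null \
        && echo "huggingface.co reachable" \
        || echo "huggingface.co unreachable"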

Encountered the same problem today.

Hi

Thanks for reaching out!

We've recently added an ENTRYPOINT to our Dockerfile to automatically launch the vLLM service when the container starts. This might be the reason for the issue you're experiencing.

If you'd like to start the container and manually launch the service, you can override the default entrypoint by adding --entrypoint /bin/bash \ when starting the container. For example:

sudo docker run -itd \
    --net=host \
    --device=/dev/dri \
    --privileged \
    --memory="32G" \
    --name=CONTAINER_NAME \
    --shm-size="16g" \
    --entrypoint /bin/bash \
    $DOCKER_IMAGE
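
If the underlying problem is that huggingface.co is not reachable from your network, one possible workaround is to mount an already-downloaded model into the container and run the Hub client in offline mode. This is only a sketch: the mount path and model directory are placeholders, and HF_HUB_OFFLINE is a standard huggingface_hub environment variable rather than something specific to this image.

    sudo docker run -itd \
        --net=host \
        --device=/dev/dri \
        --privileged \
        --memory="32G" \
        --name=CONTAINER_NAME \
        --shm-size="16g" \
        --entrypoint /bin/bash \
        -e HF_HUB_OFFLINE=1 \
        -v /path/to/local/models:/llm/models \
        $DOCKER_IMAGE
        # (alternative: -e HF_ENDPOINT=<mirror-url> if a Hub mirror is reachable from your network)

Inside the container you can then launch the vLLM service manually with the model argument pointing at the mounted local directory instead of a Hub model id, so that no huggingface.co lookup is attempted.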

Let us know if you have any further questions!