vLLM failure in intelanalytics/ipex-llm-serving-xpu:2.2.0-b13
oldmikeyang opened this issue
Describe the bug
When Open WebUI connects to the vLLM server, the server crashes with the following error:
"'OpenAIServingTokenization' object has no attribute 'show_available_models'"
How to reproduce
Steps to reproduce the error:
- Pull the intelanalytics/ipex-llm-serving-xpu:2.2.0-b13 docker image.
- Start the vLLM serving for a Qwen 14B model.
- Start Open WebUI and connect it to the vLLM server through the OpenAI-compatible API (see the Python sketch below).
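As a rough illustration of the last step, the sketch below mimics what Open WebUI does over the OpenAI API: list the available models, then send a chat completion. The base_url, api_key, and served model name are assumptions that depend on how the vLLM container was started; the crash reported above already occurs on the model-list call.

```python
# Sketch of an OpenAI-API client against the vLLM server, roughly what
# Open WebUI does. base_url, api_key, and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="none",  # typically ignored unless the server was started with an API key
)

# Open WebUI lists models first; on 2.2.0-b13 this call is where the
# reported AttributeError shows up on the server side.
for model in client.models.list().data:
    print(model.id)

# A normal chat request once the model list works (e.g. on a fixed release):
completion = client.chat.completions.create(
    model="Qwen1.5-14B-Chat",  # assumed served model name for "Qwen 14B"
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
```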
Environment information
This issue only exists in docker image release 2.2.0-b13; release 2.2.0-b11 does not have this issue.
intelanalytics/ipex-llm-serving-xpu 2.2.0-b13
hi @oldmikeyang, please try 2.2.0-b15. This is fixed in b15.
Yes, the b16 release doesn't have this issue.
