Run Qwen GGUF with the ipex-llm transformers Python API
KiwiHana opened this issue:
Please support DeepSeek-R1:32B, Qwen2.5-1.5B, and Qwen2.5-3B GGUF models in the ipex-llm transformers Python API. GGUF Q4_0, Q4_1, or Q4_K_M would all be OK; Q4_K_M support would be best.
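For reference, a minimal repro sketch of the failing call: `from_gguf` is the ipex-llm loading entry point shown in the traceback below, while the GGUF file name here is a placeholder, not taken from the log.

```python
# Minimal repro sketch. from_gguf is the ipex-llm entry point that fails
# below; the GGUF file name is a placeholder, not from the log.
from ipex_llm.transformers import AutoModelForCausalLM

checkpoint = "qwen2.5-1.5b-instruct-q4_k_m.gguf"  # placeholder path
model, tokenizer = AutoModelForCausalLM.from_gguf(checkpoint)
```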
```
C:\Users\Lengda\Documents\spec-decode-0325>kai-xpu-0324\python.exe gguf_speculative_decoding.py
model_family:qwen2
2025-03-31 13:56:47,437 - ERROR -
****************************Usage Error************************
Unsupported model family: qwen2
2025-03-31 13:56:47,437 - ERROR -
****************************Call Stack*************************
Traceback (most recent call last):
  File "C:\Users\Lengda\Documents\spec-decode-0325\gguf_speculative_decoding.py", line 34, in <module>
    model, tokenizer = AutoModelForCausalLM.from_gguf(checkpoint)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Lengda\Documents\spec-decode-0325\kai-xpu-0324\Lib\site-packages\ipex_llm\transformers\model.py", line 405, in from_gguf
    model, tokenizer = load_gguf_model(fpath, dtype=torch.half, low_bit=low_bit)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Lengda\Documents\spec-decode-0325\kai-xpu-0324\Lib\site-packages\ipex_llm\transformers\gguf\api.py", line 73, in load_gguf_model
    invalidInputError(False, f"Unsupported model family: {model_family}")
  File "C:\Users\Lengda\Documents\spec-decode-0325\kai-xpu-0324\Lib\site-packages\ipex_llm\utils\common\log4Error.py", line 32, in invalidInputError
    raise RuntimeError(errMsg)
RuntimeError: Unsupported model family: qwen2
```
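Until the qwen2 family is added to the supported list in `ipex_llm/transformers/gguf/api.py`, one possible workaround (a sketch, not from this thread) is to load the original Hugging Face checkpoint with ipex-llm's low-bit loading instead of the GGUF file; `sym_int4` is ipex-llm's rough analogue of GGUF Q4_0. The model id and the XPU device below are assumptions.

```python
# Workaround sketch, assuming the original HF checkpoint is available:
# load it with ipex-llm low-bit quantization instead of the GGUF file.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "Qwen/Qwen2.5-1.5B-Instruct"  # assumed model id

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_low_bit="sym_int4",  # 4-bit weight-only, roughly GGUF Q4_0
    trust_remote_code=True,
)
model = model.to("xpu")  # Intel GPU; requires the XPU build of ipex-llm
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```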
Dude, any idea how to run a Qwen3 GGUF model?
You can use the Intel version of ollama: https://www.modelscope.cn/models/Intel/ollama
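A quick way to sanity-check that route, as a sketch: this assumes the Intel build exposes the standard ollama HTTP API on the default port and that the model tag below has already been pulled.

```python
# Sketch: query a running Intel ollama build (after `ollama serve` and
# `ollama pull`) via the standard ollama HTTP API; model tag is an assumption.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen2.5:1.5b", "prompt": "Hello", "stream": False},
    timeout=300,
)
print(resp.json()["response"])  # generated text when stream=False
```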