Support Qwen-1.8B-Chat
Vincent131499 opened this issue
MeteorMan commented
Thanks for the excellent work!
I am using Qwen-1.8B-Chat and ran into the following bug. After some investigation, it looks like the 1.8B model is not supported yet. Could you add support for it?
python scripts/inference.py --model_name qwen -m qwen-1_8b-ne-q4_j.bin -c 512 -b 1024 -n 256 -t 20 --color -p "She opened the door and see"
Namespace(model_name='qwen', model=PosixPath('qwen-1_8b-ne-q4_j.bin'), build_dir=PosixPath('/code/llm-workspace/cpu-workspace/intel-extension-for-transformers/intel_extension_for_transformers/llm/runtime/graph/scripts/../build'), prompt='She opened the door and see', tokenizer='THUDM/chatglm-6b', n_predict=256, threads=20, batch_size_truncate=1024, ctx_size=512, seed=-1, repeat_penalty=1.1, color=True, keep=0, shift_roped_k=False, memory_f32=False, memory_f16=False, memory_auto=False)
cmd: [PosixPath('/code/llm-workspace/cpu-workspace/intel-extension-for-transformers/intel_extension_for_transformers/llm/runtime/graph/scripts/../build/bin/run_qwen'), '--model', PosixPath('qwen-1_8b-ne-q4_j.bin'), '--prompt', 'She opened the door and see', '--n-predict', '256', '--threads', '20', '--batch-size-truncate', '1024', '--ctx-size', '512', '--seed', '-1', '--repeat-penalty', '1.1', '--keep', '0', '--color', '--ids', '195, 660, 2255, 100, 2144, 102, 275, 130001, 130004, 196']
Welcome to use the qwen on the ITREX!
main: seed = 1705373295
AVX:1 AVX2:1 AVX512F:1 AVX_VNNI:0 AVX512_VNNI:1 AMX_INT8:0 AMX_BF16:0 AVX512_BF16:0 AVX512_FP16:0
model.cpp: loading model from qwen-1_8b-ne-q4_j.bin
init: n_vocab = 151936
init: n_embd = 2048
init: n_mult = 11008
init: n_head = 16
init: n_layer = 24
init: n_rot = 128
init: n_ff = 5504
init: n_parts = 1
MODEL_ASSERT: /code/llm-workspace/cpu-workspace/intel-extension-for-transformers/intel_extension_for_transformers/llm/runtime/graph/models/qwen/qwen.h:34: false
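For anyone triaging this log, here is a minimal Python sketch that pulls the `init:` hyperparameters out of the output above and runs two sanity checks on them. The helper name `parse_init_lines` is hypothetical and not part of the ITREX scripts. The two relationships it checks (`n_rot == n_embd / n_head` and `n_ff == n_mult / 2`) are only observations about the numbers printed above; they are not a claim about what the assertion at `qwen.h:34` actually tests.

```python
import re

# Log excerpt copied verbatim from the report above.
LOG = """\
init: n_vocab = 151936
init: n_embd = 2048
init: n_mult = 11008
init: n_head = 16
init: n_layer = 24
init: n_rot = 128
init: n_ff = 5504
init: n_parts = 1
"""


def parse_init_lines(log: str) -> dict[str, int]:
    """Collect `init: name = value` pairs into a dict of ints (hypothetical helper)."""
    pattern = re.compile(r"^init:\s*(\w+)\s*=\s*(-?\d+)\s*$", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(log)}


hparams = parse_init_lines(LOG)

# Per-head dimension implied by the embedding size and head count.
head_dim = hparams["n_embd"] // hparams["n_head"]

print(hparams)
print("head_dim matches n_rot:", head_dim == hparams["n_rot"])
print("n_ff is half of n_mult:", hparams["n_ff"] * 2 == hparams["n_mult"])
```

Comparing the parsed values with the sizes a given build of the runtime supports may help narrow down which field trips the assert, but that list of supported sizes has to come from the source itself.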
MeteorMan commented
Looking forward to your reply.
Dong, Bo commented
Thanks for your support.
We need to add this to our schedule. As soon as we support it, we will let you know.
intellinjun commented
@Vincent131499 You can try again with this PR.
MeteorMan commented
> @Vincent131499 You can try again with this pr
ok! I'll try this.