intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.

[Feature] internlm-xcomposer2-vl-7b support

kevin-t-tang opened this issue · comments

Model: https://huggingface.co/internlm/internlm-xcomposer2-vl-7b

I modified the qwen-vl example (chat.py) and got the following error. Around line 80:
query = 'Please describe this image in detail.'
image = '5602445367_3504763978_z.jpg'
response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
torch.xpu.synchronize()

/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/transformers/generation/utils.py:1270: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )
warnings.warn(
Traceback (most recent call last):
File "/opt/WD/091-GFX-Benchmark/BigDL/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen-vl/./chat.py", line 85, in
response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm_xcomposer2.py", line 511, in chat
outputs = self.generate(
^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/ipex_llm/transformers/lookup.py", line 88, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/ipex_llm/transformers/speculative.py", line 109, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/transformers/generation/utils.py", line 1538, in generate
return self.greedy_search(
^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
outputs = self(
^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm_xcomposer2.py", line 366, in forward
outputs = self.model(
^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm2.py", line 929, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm2.py", line 625, in forward
hidden_states, self_attn_weights, present_key_value = self.attention(
^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm2.py", line 391, in forward
qkv_states = self.wqkv(hidden_states, im_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/llm-qwen-vl/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: LowBitLinear.forward() takes 2 positional arguments but 3 were given
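
For reference, the TypeError points at a signature mismatch: the model's attention calls its qkv projection with an extra image mask (self.wqkv(hidden_states, im_mask), a PLoRA-style layer), while the low-bit replacement layer only accepts the input tensor. A minimal, hypothetical illustration of the clash (not ipex-llm source code):

import torch

# Hypothetical stand-in for the low-bit replacement layer: its forward only
# accepts the input tensor, but the model still passes an extra image mask.
class LowBitLinearLike(torch.nn.Linear):
    def forward(self, x):  # only (self, x): 2 positional arguments
        return super().forward(x)

layer = LowBitLinearLike(4, 4)
hidden_states = torch.randn(1, 4)
im_mask = torch.zeros(1, 4, dtype=torch.bool)

layer(hidden_states)  # works
try:
    layer(hidden_states, im_mask)  # the extra mask triggers the TypeError seen above
except TypeError as e:
    print(e)  # ... takes 2 positional arguments but 3 were given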

The latest ipex-llm (2.1.0b20240523) already supports internlm-xcomposer2-vl-7b, please try it. (For now, only transformers 4.31 is supported.)

More optimization is in progress.

import torch

from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

ckpt_path = "<internlm-xcomposer2-vl-7b>"  # path to the downloaded checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(ckpt_path, trust_remote_code=True,
                                             load_in_low_bit="sym_int8")  # use sym_int8 for better output
model = model.eval()

query = '<ImageHere>Please describe this image in detail.'
image = './image1.webp'

model = model.to('xpu')  # move the low-bit model to the Intel GPU

response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
print(response)
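
As a quick sanity check before running the snippet above (assuming any transformers 4.31.x patch release is fine, per the note above):

import transformers

# Only transformers 4.31 is reported to work with this model for now (see above);
# assumption: any 4.31.x patch release is acceptable.
assert transformers.__version__.startswith("4.31"), \
    f"unsupported transformers version: {transformers.__version__}"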

Thanks, closing the issue.

@MeouSker77 I used the latest version and got the following errors:

Traceback (most recent call last):
File "/opt/WD/009-models/models/internlm-xcomposer2-vl-7b/demo.py", line 8, in
model = AutoModelForCausalLM.from_pretrained(ckpt_path, trust_remote_code=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/unittest/mock.py", line 1378, in patched
return func(*newargs, **newkeywargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/ipex_llm/transformers/model.py", line 347, in from_pretrained
model = cls.load_convert(q_k, optimize_model, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/ipex_llm/transformers/model.py", line 483, in load_convert
model = model.to("cpu")
^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2460, in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1160, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/home/a770/miniforge3/envs/llm-test-linux/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1158, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!

This model's code has a bug, so only transformers 4.31 can be used for now.
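
For reference, that NotImplementedError is the generic symptom of parameters still sitting on the meta device (allocated without data) when .to() tries to copy them; a minimal reproduction, unrelated to this model's code:

import torch

# Parameters created on the meta device have shape and dtype but no storage,
# so moving the module to a real device has nothing to copy.
with torch.device("meta"):
    lin = torch.nn.Linear(4, 4)

try:
    lin.to("cpu")
except NotImplementedError as e:
    print(e)  # Cannot copy out of meta tensor; no data!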