Tlntin / Qwen-TensorRT-LLM


Running build.py raises an error: TypeError: RowLinear.__init__() got an unexpected keyword argument 'instance_id'

ArlanCooper opened this issue

I installed the versions following the official TensorRT-LLM instructions:

torch         2.1.2
tensorrt-llm  0.9.0.dev2024030500

Running the official qwen-7b-chat example works fine.

Now I am running the qwen2 build script for qwen1.5-14b-chat:

python build.py --hf_model_dir qwen/Qwen1.5-14B-Chat/ \
                --dtype float16 \
                --remove_input_padding \
                --use_gpt_attention_plugin float16 \
                --enable_context_fmha \
                --use_gemm_plugin float16 \
                --output_dir ./tmp/Qwen1.5/14B/trt_engines/fp16/1-gpu/

The error output is:

Traceback (most recent call last):
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/build.py", line 776, in <module>
    build(0, args)
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/build.py", line 741, in build
    engine = build_rank_engine(
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/build.py", line 506, in build_rank_engine
    tensorrt_llm_qwen = Qwen2ForCausalLM_TRT(
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/model.py", line 756, in __init__
    super().__init__(
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/model.py", line 613, in __init__
    self.layers = ModuleList([
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/model.py", line 614, in <listcomp>
    Qwen2DecoderLayer(
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/model.py", line 490, in __init__
    self.self_attn = QWen2Attention(
  File "/home/powerop/work/rwq/Qwen-TensorRT-LLM/examples/qwen2/model.py", line 312, in __init__
    self.o_proj = RowLinear(hidden_size,
TypeError: RowLinear.__init__() got an unexpected keyword argument 'instance_id'
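This failure mode is generic Python behavior, not something specific to TensorRT internals: the qwen2 script calls `RowLinear` with an `instance_id` keyword that the installed library's class signature does not accept. A minimal sketch (with a hypothetical stand-in class, not the real `tensorrt_llm.layers.RowLinear`) reproduces the same `TypeError`:

```python
# Hypothetical stand-in for the installed library's RowLinear, whose
# __init__ does NOT accept instance_id (the script was written against
# a different tensorrt_llm version whose signature did).
class RowLinearOld:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


def build_layer(cls):
    try:
        # The caller passes a keyword the installed class rejects,
        # mirroring build.py's call into model.py.
        return cls(5120, 5120, instance_id=0)
    except TypeError as e:
        return str(e)


print(build_layer(RowLinearOld))
# The message matches the traceback above:
# ... got an unexpected keyword argument 'instance_id'
```

The fix is not to edit the call site but to align the installed `tensorrt-llm` version with the one the example scripts target, as the answer below confirms.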


What is causing this? Does my tensorrt-llm version need to match the one this script was written for?

Yes. At the moment the versions must match exactly; there is basically no cross-version compatibility (feel free to complain to NVIDIA about that 😂).
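Since an exact version match is required, a quick sanity check before building can save a long traceback. A minimal sketch (the helper and the expected-version string are my own, not part of this repo) using the standard library's `importlib.metadata`:

```python
import importlib.metadata


def version_matches(expected, package="tensorrt-llm"):
    """Return (matches, installed_version); (False, None) if not installed."""
    try:
        installed = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return False, None
    return installed == expected, installed


# Example: compare against the version the user reported installing.
ok, installed = version_matches("0.9.0.dev2024030500")
if not ok:
    print(f"Version mismatch: installed={installed}; "
          "install the version the example scripts target.")
```

The exact version each example branch targets should be taken from the repo's own README/requirements rather than hard-coded as above.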

Got it, thanks!