Error when converting the Vicuna HF model to a sat model with transfer_param.py

Question

Converting the Vicuna HF model to a sat model with transfer_param.py raises an error.

Lunatic-Solar opened this issue 3 months ago

Lunatic-Solar · Answer 1 · Mon Apr 08 2024 16:42:03 GMT+0800 (China Standard Time)

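An error is raised when is_rotary_emb=True.

An error is also raised when is_rotary_emb=False.

Is this an environment problem? Which versions did you use when installing the packages?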

Qingsong Lv · Answer 2 · Mon Apr 08 2024 17:19:10 GMT+0800 (China Standard Time)

Please install the latest version of sat from GitHub:

git clone https://github.com/THUDM/SwissArmyTransformer
cd SwissArmyTransformer
pip install -e .
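
To confirm which sat build is actually active in the environment after reinstalling, a minimal sketch along these lines may help. It assumes the distribution is registered under the name SwissArmyTransformer (the name that appears in the pip log further down this thread); the helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the package is registered as "SwissArmyTransformer".
    print("SwissArmyTransformer:", installed_version("SwissArmyTransformer"))
```

If this prints None, the editable install most likely went into a different environment than the one running the conversion script.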

Lunatic-Solar · Answer 3 · Tue Apr 09 2024 16:00:02 GMT+0800 (China Standard Time)

That problem is solved. However, I now get an error when finetuning vicuna, shown below:
[image]
Does this also look like a sat version problem?

Qingsong Lv · Answer 5 · Tue Apr 09 2024 16:07:06 GMT+0800 (China Standard Time)

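The checkpoint that cogvlm loads should be the old version of llama. If you want to use the new version of llama, you may need to modify the cogvlm code.

The difference between the two is that the new sat has the MLP built in, so there is no need to add a mixin.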

Lunatic-Solar · Answer 6 · Thu Apr 18 2024 11:06:46 GMT+0800 (China Standard Time)

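Why does this error occur? Also, after conversion, the vicuna structure does not seem to match the cogVLM model structure.
Below is the converted vicuna:

Below is the cogvlm model structure: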

Qingsong Lv · Answer 7 · Thu Apr 18 2024 11:14:12 GMT+0800 (China Standard Time)

Your conversion may not be quite right. The conversion script is here: https://github.com/THUDM/SwissArmyTransformer/tree/main/examples/llama

Lunatic-Solar · Answer 8 · Thu Apr 18 2024 15:36:52 GMT+0800 (China Standard Time)

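I used transform_param.py from the llama folder, but the converted result still has no "eva_args" section. The only change I made to the code was setting the model to
prefix = 'lmsys/'
model_type = 'vicuna-7b-v1.5'
Everything else is identical to the original code. Why is the structure still different? Did you adjust anything when you did the conversion?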

Qingsong Lv · Answer 9 · Thu Apr 18 2024 17:07:05 GMT+0800 (China Standard Time)

eva_args will definitely not be there, because what you converted is the language model only.
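
One way to see exactly which sections differ is to compare the top-level keys of the two config files. The sketch below is only an illustration: it assumes both folders contain a model_config.json and that eva_args appears as a top-level key there, and the helper names and toy values are hypothetical.

```python
import json


def load_config(path):
    """Read a JSON config file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def config_key_diff(a, b):
    """Return (keys only in a, keys only in b) as sorted lists."""
    return sorted(set(a) - set(b)), sorted(set(b) - set(a))


if __name__ == "__main__":
    # Toy stand-ins for the two configs; the values are made up for illustration.
    converted = {"model_class": "LLaMAModel", "num_layers": 32}
    cogvlm = {"model_class": "CogVLMModel", "num_layers": 32, "eva_args": {}}
    only_converted, only_cogvlm = config_key_diff(converted, cogvlm)
    print("only in converted:", only_converted)
    print("only in cogvlm:", only_cogvlm)
```

With real files, the two dicts would come from load_config on each folder's model_config.json.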

Lunatic-Solar · Answer 10 · Thu Apr 18 2024 18:16:21 GMT+0800 (China Standard Time)

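Isn't cogVLM finetuned from vicuna? I also saw a cogVLM issue saying there is no difference between finetuning from vicuna and finetuning from cogVLM. But the finetune code needs the data in the eva_args section, so how exactly is finetuning from vicuna done? Is there any material I can refer to?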

Qingsong Lv · Answer 11 · Thu Apr 18 2024 19:22:40 GMT+0800 (China Standard Time)

Just load your converted checkpoint with cogvlm's model_config.json. In other words, copy cogvlm's model_config.json into the folder that holds your converted weights.
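
As a rough sketch of that copy step in Python: the function name and both folder paths are placeholders, not names from the cogvlm or sat code.

```python
import shutil
from pathlib import Path


def copy_model_config(cogvlm_dir, converted_dir, name="model_config.json"):
    """Copy `name` from the cogvlm checkpoint folder into the converted folder."""
    src = Path(cogvlm_dir) / name
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found")
    dst_dir = Path(converted_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as the modification time.
    return Path(shutil.copy2(src, dst_dir / name))


# Hypothetical usage; replace both paths with your own folders:
# copy_model_config("path/to/cogvlm-checkpoint", "path/to/converted-vicuna")
```

Any model_config.json already present in the converted folder is overwritten, so it may be worth keeping a backup of it first.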

Qingsong Lv · Answer 12 · Thu Apr 18 2024 19:24:23 GMT+0800 (China Standard Time)

If you need the pretrained vit weights, you can add the vit part directly into your converted checkpoint file.
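
A minimal sketch of such a merge, operating on plain name-to-tensor dictionaries. It assumes the vit parameters share a common key prefix in the cogvlm state dict; the prefix "mixins.eva." used in the demo is a guess for illustration only and should be checked against the real key names. With actual checkpoints the dictionaries would presumably come from torch.load(..., map_location="cpu").

```python
def merge_by_prefix(target, source, prefix):
    """Return (merged, added): a copy of `target` plus every `source` entry
    whose key starts with `prefix`, and the list of keys that were added."""
    merged = dict(target)
    added = []
    for name, value in source.items():
        if name.startswith(prefix):
            if name in merged:
                raise KeyError(f"{name} already exists in target")
            merged[name] = value
            added.append(name)
    if not added:
        raise ValueError(f"no keys start with {prefix!r}")
    return merged, added


if __name__ == "__main__":
    # Toy dictionaries; real state dicts map parameter names to tensors.
    converted = {"transformer.word_embeddings.weight": "lm-weight"}
    cogvlm = {
        "transformer.word_embeddings.weight": "old-lm-weight",
        "mixins.eva.proj.weight": "vit-weight",  # hypothetical key name
    }
    merged, added = merge_by_prefix(converted, cogvlm, "mixins.eva.")
    print(added)
```

Language-model entries in the source are skipped, so the converted vicuna weights stay untouched and only the prefixed entries are added.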

paomian001 · Answer 13 · Mon May 20 2024 15:29:09 GMT+0800 (China Standard Time)

Please install the latest version of sat from GitHub:

git clone https://github.com/THUDM/SwissArmyTransformer
cd SwissArmyTransformer
pip install -e .

I ran into the following error while downloading and updating. Do you know how to fix it?
pip install -e .
Obtaining file:///home/root1/PycharmProjects/Inf-DiT-main/SwissArmyTransformer-main
Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in /home/root1/anaconda3/envs/inf-dit/lib/python3.10/site-packages (from SwissArmyTransformer==0.4.11) (2.3.0)
Collecting deepspeed (from SwissArmyTransformer==0.4.11)
Using cached deepspeed-0.14.2.tar.gz (1.3 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "/tmp/pip-install-woqq3hon/deepspeed_6ad9a8143d084357a3f16a4f3891dace/setup.py", line 100, in <module>
    cuda_major_ver, cuda_minor_ver = installed_cuda_version()
  File "/tmp/pip-install-woqq3hon/deepspeed_6ad9a8143d084357a3f16a4f3891dace/op_builder/builder.py", line 50, in installed_cuda_version
    raise MissingCUDAException("CUDA_HOME does not exist, unable to compile CUDA op(s)")
op_builder.builder.MissingCUDAException: CUDA_HOME does not exist, unable to compile CUDA op(s)
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Qingsong Lv · Answer 14 · Mon May 20 2024 15:45:42 GMT+0800 (China Standard Time)

If you are not going to use it for training, you can try this:

pip install . --no-deps

Otherwise you need to install the correct cuda version.
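
The traceback above complains that CUDA_HOME does not exist, so before reinstalling anything it can be worth checking whether a CUDA toolkit is visible at all. The sketch below only inspects environment variables and PATH; the function name find_cuda_toolkit is made up, and checking CUDA_PATH alongside CUDA_HOME is an assumption about how the toolkit location is usually resolved.

```python
import os
import shutil


def find_cuda_toolkit(env=None):
    """Report CUDA_HOME / CUDA_PATH from `env` and where nvcc sits on its PATH."""
    env = os.environ if env is None else env
    return {
        "CUDA_HOME": env.get("CUDA_HOME"),
        "CUDA_PATH": env.get("CUDA_PATH"),
        # An empty PATH makes shutil.which return None instead of searching.
        "nvcc": shutil.which("nvcc", path=env.get("PATH", "")),
    }


if __name__ == "__main__":
    for key, value in find_cuda_toolkit().items():
        print(f"{key}: {value}")
```

If all three values come back as None, the toolkit is probably not installed or not exported in the shell that runs pip.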

paomian001 · Answer 15 · Mon May 20 2024 15:49:02 GMT+0800 (China Standard Time)

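If you are not going to use it for training, you can try this:
pip install . --no-deps
Otherwise you need to install the correct cuda version.

OK, thanks. But how do I check the matching cuda version and python version?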

Qingsong Lv · Answer 16 · Mon May 20 2024 15:50:14 GMT+0800 (China Standard Time)

import torch
print(torch.version.cuda)
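
Since the question also asked about the Python version, a slightly fuller sketch could report both in one go. The helper name report_versions is hypothetical, and the torch import is guarded so the snippet still runs where torch is not installed.

```python
import platform


def report_versions():
    """Collect the Python version and, if torch is importable, its CUDA build version."""
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        info["torch"] = None
        info["torch_cuda"] = None
    else:
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # None for CPU-only builds
    return info


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

Note that torch.version.cuda is the CUDA version torch was built against, which is the one the locally installed toolkit should match.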

paomian001 · Answer 17 · Mon May 20 2024 16:09:27 GMT+0800 (China Standard Time)

import torch
print(torch.version.cuda)

Thank you, I'll give it a try.