OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

I have a problem downloading 7B_chinese in imagebind_LLM.

JINAILAB opened this issue

I tried to run python3 get_chinese_llama.py and ran into a couple of problems.
First, there is an error, No module named 'torch._six', in misc.py. I changed the import to from torch import six.
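
(For reference, a minimal sketch of how this import is commonly patched on recent PyTorch, assuming misc.py only needs inf from torch._six, as in the MAE-style utilities it resembles; this is an assumption, not the repo's official fix:)

# assumed original import in misc.py, which fails on newer PyTorch:
# from torch._six import inf
# replacement that works on recent PyTorch versions, where torch.inf is available:
from torch import inf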
Secondly, I hit the following error:
Traceback (most recent call last):
File "/home/model/imagellm/LLaMA-Adapter/imagebind_LLM/tools/get_chinese_llama.py", line 36, in
new_value = (ori_dict[key].float() + delta_dict[key].float()).half()
KeyError: 'rope.freqs'

So I checked the keys in ori_dict and delta_dict and found a difference in the rope.freqs keys.

These are the ori_dict (llama_7b) rope.freqs keys:

layers.0.attention.inner_attention.rope.freqs
layers.1.attention.inner_attention.rope.freqs
layers.2.attention.inner_attention.rope.freqs
layers.3.attention.inner_attention.rope.freqs
layers.4.attention.inner_attention.rope.freqs
layers.5.attention.inner_attention.rope.freqs
layers.6.attention.inner_attention.rope.freqs
layers.7.attention.inner_attention.rope.freqs
layers.8.attention.inner_attention.rope.freqs
layers.9.attention.inner_attention.rope.freqs
layers.10.attention.inner_attention.rope.freqs
layers.11.attention.inner_attention.rope.freqs
layers.12.attention.inner_attention.rope.freqs
layers.13.attention.inner_attention.rope.freqs
layers.14.attention.inner_attention.rope.freqs
layers.15.attention.inner_attention.rope.freqs
layers.16.attention.inner_attention.rope.freqs
layers.17.attention.inner_attention.rope.freqs
layers.18.attention.inner_attention.rope.freqs
layers.19.attention.inner_attention.rope.freqs
layers.20.attention.inner_attention.rope.freqs
layers.21.attention.inner_attention.rope.freqs
layers.22.attention.inner_attention.rope.freqs
layers.23.attention.inner_attention.rope.freqs
layers.24.attention.inner_attention.rope.freqs
layers.25.attention.inner_attention.rope.freqs
layers.26.attention.inner_attention.rope.freqs
layers.27.attention.inner_attention.rope.freqs
layers.28.attention.inner_attention.rope.freqs
layers.29.attention.inner_attention.rope.freqs
layers.30.attention.inner_attention.rope.freqs
layers.31.attention.inner_attention.rope.freqs

And this is the 7B_chinese_delta rope.freqs key:

rope.freqs.
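
For reference, the mismatch can be listed with a quick set difference over the two checkpoints (a minimal sketch reusing the ori_dict and delta_dict variables from the script):

only_in_ori = set(ori_dict) - set(delta_dict)    # keys that have no entry in the delta
only_in_delta = set(delta_dict) - set(ori_dict)  # keys missing from the base model
print(sorted(only_in_ori))
print(sorted(only_in_delta))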

So I changed the code in /imagebind_LLM/tools/get_chinese_llama.py like this. Is this the right fix?

for key in ori_dict:
    if key in delta_dict:
        # key present in both checkpoints: add the Chinese delta to the base weight
        new_value = (ori_dict[key].float() + delta_dict[key].float()).half()
        new_dict[key] = new_value
    else:
        # key (here the rope.freqs buffers) has no delta: copy the base weight as-is
        new_value = ori_dict[key].float()
        new_dict[key] = new_value

Thanks for pointing that out! Actually, we do not need the rope.freqs entries in the checkpoint (they will not be loaded by the model), so we can simply skip these params.
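
A minimal sketch of that suggestion, reusing the same ori_dict / delta_dict / new_dict variables as above (the exact change may differ from what lands in the repo):

for key in ori_dict:
    if 'rope.freqs' in key:
        continue  # rope.freqs is not loaded by the model, so it can be dropped
    new_dict[key] = (ori_dict[key].float() + delta_dict[key].float()).half()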