I have a problem downloading 7B_chinese in imagebind_LLM.
JINAILAB opened this issue · comments
I tried to run python3 get_chinese_llama.py and ran into some problems.
First, there is an error: No module named 'torch._six' in misc.py. I updated the import to from torch import six.
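For what it's worth, torch._six is a private module that was removed in PyTorch 2.0, and torch does not export a six module either, so from torch import six will itself fail. In repos whose misc.py only imports inf from torch._six (common in code derived from the MAE codebase), a standard-library replacement is usually enough. A minimal sketch of that fix (assuming inf is the only thing misc.py pulls from torch._six):

```python
# misc.py originally had (removed in PyTorch >= 2.0):
#   from torch._six import inf
# Drop-in replacement from the standard library:
from math import inf
```

math.inf is the same float infinity value that torch._six.inf aliased, so gradient-clipping code comparing norms against it behaves identically.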
Secondly, I have a problem with this traceback:
Traceback (most recent call last):
File "/home/model/imagellm/LLaMA-Adapter/imagebind_LLM/tools/get_chinese_llama.py", line 36, in
new_value = (ori_dict[key].float() + delta_dict[key].float()).half()
KeyError: 'rope.freqs'
So I checked the keys in ori_dict and delta_dict and found a difference in the rope.freqs keys.
These are the ori_dict (llama_7b) rope.freqs keys:
layers.0.attention.inner_attention.rope.freqs
layers.1.attention.inner_attention.rope.freqs
layers.2.attention.inner_attention.rope.freqs
layers.3.attention.inner_attention.rope.freqs
layers.4.attention.inner_attention.rope.freqs
layers.5.attention.inner_attention.rope.freqs
layers.6.attention.inner_attention.rope.freqs
layers.7.attention.inner_attention.rope.freqs
layers.8.attention.inner_attention.rope.freqs
layers.9.attention.inner_attention.rope.freqs
layers.10.attention.inner_attention.rope.freqs
layers.11.attention.inner_attention.rope.freqs
layers.12.attention.inner_attention.rope.freqs
layers.13.attention.inner_attention.rope.freqs
layers.14.attention.inner_attention.rope.freqs
layers.15.attention.inner_attention.rope.freqs
layers.16.attention.inner_attention.rope.freqs
layers.17.attention.inner_attention.rope.freqs
layers.18.attention.inner_attention.rope.freqs
layers.19.attention.inner_attention.rope.freqs
layers.20.attention.inner_attention.rope.freqs
layers.21.attention.inner_attention.rope.freqs
layers.22.attention.inner_attention.rope.freqs
layers.23.attention.inner_attention.rope.freqs
layers.24.attention.inner_attention.rope.freqs
layers.25.attention.inner_attention.rope.freqs
layers.26.attention.inner_attention.rope.freqs
layers.27.attention.inner_attention.rope.freqs
layers.28.attention.inner_attention.rope.freqs
layers.29.attention.inner_attention.rope.freqs
layers.30.attention.inner_attention.rope.freqs
layers.31.attention.inner_attention.rope.freqs
and this is the 7B_chinese_delta rope.freqs key:
rope.freqs
So I corrected the code in /imagebind_LLM/tools/get_chinese_llama.py like this. Is this code correct?
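The mismatch can also be listed programmatically. A minimal sketch (the helper name rope_keys is mine, not from the repo; the state dicts would come from torch.load(..., map_location="cpu") on each checkpoint):

```python
def rope_keys(state_dict):
    """Return all keys in a checkpoint state dict that mention rope.freqs."""
    return sorted(k for k in state_dict if "rope.freqs" in k)
```

Running this on both checkpoints would show the 32 per-layer keys on the llama_7b side versus the single top-level rope.freqs key on the delta side.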
for key in ori_dict:
    if key in delta_dict:
        new_value = (ori_dict[key].float() + delta_dict[key].float()).half()
        new_dict[key] = new_value
    else:
        new_value = ori_dict[key].float()
        new_dict[key] = new_value
Thanks for pointing that out! Actually, we do not need rope.freqs
in the checkpoint (they will not be loaded by the model), so we can just skip these params.
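A minimal sketch of that skipping logic (merge_delta is a hypothetical helper name, and this assumes every non-rope key in the base checkpoint has a matching entry in the delta):

```python
import torch

def merge_delta(ori_dict, delta_dict):
    """Merge a delta checkpoint into a base one, skipping rope.freqs buffers."""
    new_dict = {}
    for key, value in ori_dict.items():
        if "rope.freqs" in key:
            continue  # precomputed rotary buffers; the model does not load them
        new_dict[key] = (value.float() + delta_dict[key].float()).half()
    return new_dict
```

Dropping the keys entirely (rather than copying them through unmerged) matches the suggestion above, since the model never loads them anyway.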