lyogavin / Anima

33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU

Error with Llama3: ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this look incorrect.

Cangshanqingshi opened this issue

My server can't fetch the model from Hugging Face online, so I downloaded the PyTorch-format weights (not the safetensors version) from Hugging Face to run the model locally. To do that, I changed the model-loading code. The modified code is below:

from airllm import AutoModel

MAX_LENGTH = 128
# could use hugging face model repo id:
# model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")

# or use model's local path...
model = AutoModel.from_pretrained("./llama3_model", layer_shards_saving_path="./")

input_text = [
        'What is the capital of United States?',
        #'I like',
    ]

input_tokens = model.tokenizer(input_text,
    return_tensors="pt", 
    return_attention_mask=False, 
    truncation=True, 
    max_length=MAX_LENGTH, 
    padding=False)
           
generation_output = model.generate(
    input_tokens['input_ids'].cuda(), 
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True)

output = model.tokenizer.decode(generation_output.sequences[0])
print(output)

When I run this code, the model loads successfully, but then a tensor shape error occurs. The error message is below:

found index file...
found_layers:{'model.embed_tokens.': True, 'model.layers.0.': True, 'model.layers.1.': True, 'model.layers.2.': True, 'model.layers.3.': True, 'model.layers.4.': True, 'model.layers.5.': True, 'model.layers.6.': True, 'model.layers.7.': True, 'model.layers.8.': True, 'model.layers.9.': True, 'model.layers.10.': True, 'model.layers.11.': True, 'model.layers.12.': True, 'model.layers.13.': True, 'model.layers.14.': True, 'model.layers.15.': True, 'model.layers.16.': True, 'model.layers.17.': True, 'model.layers.18.': True, 'model.layers.19.': True, 'model.layers.20.': True, 'model.layers.21.': True, 'model.layers.22.': True, 'model.layers.23.': True, 'model.layers.24.': True, 'model.layers.25.': True, 'model.layers.26.': True, 'model.layers.27.': True, 'model.layers.28.': True, 'model.layers.29.': True, 'model.layers.30.': True, 'model.layers.31.': True, 'model.layers.32.': True, 'model.layers.33.': True, 'model.layers.34.': True, 'model.layers.35.': True, 'model.layers.36.': True, 'model.layers.37.': True, 'model.layers.38.': True, 'model.layers.39.': True, 'model.layers.40.': True, 'model.layers.41.': True, 'model.layers.42.': True, 'model.layers.43.': True, 'model.layers.44.': True, 'model.layers.45.': True, 'model.layers.46.': True, 'model.layers.47.': True, 'model.layers.48.': True, 'model.layers.49.': True, 'model.layers.50.': True, 'model.layers.51.': True, 'model.layers.52.': True, 'model.layers.53.': True, 'model.layers.54.': True, 'model.layers.55.': True, 'model.layers.56.': True, 'model.layers.57.': True, 'model.layers.58.': True, 'model.layers.59.': True, 'model.layers.60.': True, 'model.layers.61.': True, 'model.layers.62.': True, 'model.layers.63.': True, 'model.layers.64.': True, 'model.layers.65.': True, 'model.layers.66.': True, 'model.layers.67.': True, 'model.layers.68.': True, 'model.layers.69.': True, 'model.layers.70.': True, 'model.layers.71.': True, 'model.layers.72.': True, 'model.layers.73.': True, 'model.layers.74.': True, 
'model.layers.75.': True, 'model.layers.76.': True, 'model.layers.77.': True, 'model.layers.78.': True, 'model.layers.79.': True, 'model.norm.': True, 'lm_head.': True}
saved layers already found in splitted_model
new version of transfomer, no need to use BetterTransformer, try setting attn impl to sdpa...
either BetterTransformer or attn_implementation='sdpa' is available, creating model directly
new version of transfomer, no need to use BetterTransformer, try setting attn impl to sdpa...
either BetterTransformer or attn_implementation='sdpa' is available, creating model directly
running layers(self.running_device):   1%|          | 1/83 [00:00<00:35,  2.29it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 22
     10 input_text = [
     11         'What is the capital of United States?',
     12         #'I like',
     13     ]
     15 input_tokens = model.tokenizer(input_text,
     16     return_tensors="pt", 
     17     return_attention_mask=False, 
     18     truncation=True, 
     19     max_length=MAX_LENGTH, 
     20     padding=False)
---> 22 generation_output = model.generate(
     23     input_tokens['input_ids'].cuda(), 
     24     max_new_tokens=20,
     25     use_cache=True,
     26     return_dict_in_generate=True)
     28 output = model.tokenizer.decode(generation_output.sequences[0])
     30 print(output)

File /data2/lhy/anaconda3/envs/mm2024-ChinaOpen/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
...
    349     if dtype is None:
    350         # For compatibility with PyTorch load_state_dict which converts state dict dtype to existing dtype in model
    351         value = value.to(old_value.dtype)

ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this look incorrect.

How can I fix this?
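
As a possible starting point for diagnosis (an interpretation, not something confirmed in this thread): the two shapes in the error match what a Llama-style attention layer would have under two different head settings. Assuming the published Llama 3 70B values (hidden size 8192, 64 attention heads, 8 key/value heads), a key or value projection weight is [1024, 8192] with grouped-query attention, and [8192, 8192] if the model is built as though there were one key/value head per query head. The sketch below only does that arithmetic. `attention_weight_shapes` and `shapes_from_config` are hypothetical helper names, not part of airllm, and `shapes_from_config` assumes the local directory contains a standard `config.json`.

```python
import json
from pathlib import Path


def attention_weight_shapes(hidden_size, num_attention_heads, num_key_value_heads=None):
    """Return (out_features, in_features) for the q/k/v projections of a Llama-style layer."""
    # Assumption: when num_key_value_heads is missing, treat the layer as plain
    # multi-head attention (one key/value head per query head).
    if num_key_value_heads is None:
        num_key_value_heads = num_attention_heads
    head_dim = hidden_size // num_attention_heads
    return {
        "q_proj": (num_attention_heads * head_dim, hidden_size),
        "k_proj": (num_key_value_heads * head_dim, hidden_size),
        "v_proj": (num_key_value_heads * head_dim, hidden_size),
    }


def shapes_from_config(model_dir):
    """Hypothetical helper: compute the same shapes from a local config.json."""
    cfg = json.loads((Path(model_dir) / "config.json").read_text())
    return attention_weight_shapes(
        cfg["hidden_size"],
        cfg["num_attention_heads"],
        cfg.get("num_key_value_heads"),  # None if the field is absent
    )


# Grouped-query attention with the assumed Llama 3 70B values.
gqa = attention_weight_shapes(8192, 64, 8)
# Same model size, but built without a key/value head count.
mha = attention_weight_shapes(8192, 64)

print(gqa["k_proj"])  # (1024, 8192)
print(mha["k_proj"])  # (8192, 8192)
```

If `shapes_from_config("./llama3_model")` reports (1024, 8192) for `k_proj` while the error still expects [8192, 8192], that would suggest the checkpoint itself is consistent and the model object is being built with different head settings than the checkpoint declares. That is only one hypothesis to check.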