naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time

embeddings Error

goodhaibin opened this issue · comments

I know that this project uses the Meta-Llama-3-8B model by default, but when I switch to Meta-Llama-3-70B-Instruct, an error occurs during token embedding. It comes from a mismatch in the shape of tok_embeddings.weight between Meta-Llama-3-8B and Meta-Llama-3-70B-Instruct, even though both models have the same vocab_size. Why is this happening?


RuntimeError Traceback (most recent call last)
Cell In[39], line 3
1 embedding_layer = torch.nn.Embedding(vocab_size, dim)
2 print(model["tok_embeddings.weight"].shape)
----> 3 embedding_layer.weight.data.copy_(model["tok_embeddings.weight"])
4 token_embeddings_unnormalized = embedding_layer(tokens).to(torch.bfloat16)
5 token_embeddings_unnormalized.shape

RuntimeError: The size of tensor a (128256) must match the size of tensor b (16032) at non-singleton dimension 0
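The shapes in the traceback point at checkpoint sharding rather than a vocabulary difference: 16032 × 8 = 128256, and the Meta-Llama-3-70B-Instruct weights are distributed as 8 model-parallel shards (consolidated.00.pth through consolidated.07.pth), with tok_embeddings.weight split along the vocabulary dimension so each shard holds only 16032 of the 128256 rows. Below is a minimal sketch of one way to rebuild the full embedding table, assuming that 8-shard layout; the checkpoint path here is hypothetical, and the 70B dimensions (vocab_size 128256, dim 8192) should be verified against the model's params.json.

```python
import torch

# Hypothetical path to the downloaded 70B checkpoint directory; the
# official release ships 8 model-parallel shards, consolidated.00.pth
# through consolidated.07.pth. Adjust to match your local download.
ckpt_dir = "Meta-Llama-3-70B-Instruct/original"
num_shards = 8

# 70B dimensions (assumed from its params.json): verify locally.
vocab_size, dim = 128256, 8192

shards = [
    torch.load(f"{ckpt_dir}/consolidated.{i:02d}.pth", map_location="cpu")
    for i in range(num_shards)
]

# Each shard holds 16032 rows of the embedding table; concatenating
# along dim 0 (the vocab dimension) restores the full 128256 x 8192 matrix.
full_tok_embeddings = torch.cat(
    [shard["tok_embeddings.weight"] for shard in shards], dim=0
)
print(full_tok_embeddings.shape)  # expected: torch.Size([128256, 8192])

embedding_layer = torch.nn.Embedding(vocab_size, dim)
embedding_layer.weight.data.copy_(full_tok_embeddings)
```

Note that the other tensors in the sharded checkpoint are split as well, some along dim 0 and some along dim 1 depending on whether the layer is column- or row-parallel (norm weights are replicated), so the rest of the notebook would need the same per-tensor treatment to run against the 70B weights.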