Missing output and duplicated line need to be corrected
ruirui3364 opened this issue · comments
Ruirui commented
In the "using dot product of complex numbers to rotate a vector" section, the code "freqs_cis.shape" is shown without its output, which may confuse readers.
In the "multi head attention" section, the line "qkv_attention = torch.matmul(qk_per_token_after_masking_after_softmax, v_per_token)" appears twice.
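For the first point, here is a minimal sketch of the kind of output a reader would expect to see after a shape check. It is written in plain Python rather than torch, and the sizes (`num_tokens`, `head_dim`) and `rope_theta` are hypothetical values chosen for illustration, not the notebook's actual ones.

```python
import cmath

# Hypothetical sizes for illustration only; not the notebook's values.
num_tokens = 4
head_dim = 8
rope_theta = 10000.0

# One rotation frequency per pair of dimensions.
num_pairs = head_dim // 2
freqs = [1.0 / (rope_theta ** (2 * i / head_dim)) for i in range(num_pairs)]

# One unit-magnitude complex number per (token position, frequency) pair.
freqs_cis = [[cmath.rect(1.0, pos * f) for f in freqs] for pos in range(num_tokens)]

# Equivalent of a shape check: (number of tokens, number of pairs).
shape = (len(freqs_cis), len(freqs_cis[0]))
print(shape)  # (4, 4)

# Rotating a 2D pair (x, y) is a single complex multiplication.
x, y = 1.0, 0.0
rotated = complex(x, y) * freqs_cis[1][0]
print(rotated.real, rotated.imag)
```

Showing the printed shape next to the code makes it clear that there is one complex rotation per token position and per dimension pair.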
Thank you.