naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Missing output needs to be corrected

ruirui3364 opened this issue

In the "using dot product of complex numbers to rotate a vector" section, the code "freqs_cis.shape" is missing its output, which may confuse readers.
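For reference, a minimal sketch of how that section builds `freqs_cis` and what its shape output would look like. The dimensions (17 tokens, head_dim 128, so 64 rotary frequency pairs) and the rope_theta value of 500000.0 are assumptions for illustration, not necessarily the repo's exact values:

```python
import torch

# Hypothetical dims for illustration: 17 tokens, head_dim 128 -> 64 frequency pairs
seq_len, head_dim = 17, 128

# Rotary frequencies: theta ** -(i / (head_dim/2)); 500000.0 assumed as rope_theta
zero_to_one = torch.arange(head_dim // 2) / (head_dim // 2)
freqs = 1.0 / (500000.0 ** zero_to_one)

# One rotation angle per (token position, frequency pair)
freqs_for_each_token = torch.outer(torch.arange(seq_len), freqs)

# Complex numbers of unit magnitude encoding those rotations
freqs_cis = torch.polar(torch.ones_like(freqs_for_each_token), freqs_for_each_token)

print(freqs_cis.shape)  # torch.Size([17, 64])
```

So the missing output in the notebook would presumably be a `torch.Size` of (number of tokens, head_dim // 2).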

In the "multi head attention" section, the line "qkv_attention = torch.matmul(qk_per_token_after_masking_after_softmax, v_per_token)" is repeated twice.

Thank you.