nanollama with Chinese annotations, for self-study.
The code is adapted from Karpathy's nanoGPT. Only the core functions are included, and Chinese comments have been added to make the code easier to follow.
Ref: https://github.com/karpathy/nanoGPT
Architecture changes relative to nanoGPT:
- GELU --> SwiGLU
- LayerNorm --> RMSNorm
- learned position embedding --> RoPE (rotary position embedding)
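
The three substitutions listed above can be illustrated with a small, dependency-free sketch. This is not the code from this repository (which works on tensors, as nanoGPT does); it uses plain Python lists on a single vector so the arithmetic is easy to read. The function names, the `eps` and `base` defaults, and the choice to rotate adjacent pairs `(x[2i], x[2i+1])` in RoPE are assumptions made for illustration; real implementations differ in how they pair dimensions.

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(x, w_gate, w_up):
    """SwiGLU gating for one input vector x.

    w_gate and w_up are lists of weight rows, one row per hidden unit.
    Output unit j is silu(dot(x, w_gate[j])) * dot(x, w_up[j]).
    (A full feed-forward block would follow this with a down projection.)
    """
    gate = [sum(xi * w for xi, w in zip(x, row)) for row in w_gate]
    up = [sum(xi * w for xi, w in zip(x, row)) for row in w_up]
    return [silu(g) * u for g, u in zip(gate, up)]


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root mean square, then scale per dimension.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def rope(x, pos, base=10000.0):
    """Rotary position embedding for one vector at position `pos`.

    Each adjacent pair (x[i], x[i+1]), i even, is rotated by the angle
    pos * base ** (-i / d), where d = len(x) must be even.
    """
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out


def dot(a, b):
    """Plain dot product, used below to compare rotated vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    # The score depends only on the position offset (2 in both cases),
    # so these two values agree up to floating-point error.
    print(dot(rope(q, 3), rope(k, 1)))
    print(dot(rope(q, 5), rope(k, 3)))
    print(rms_norm(q, [1.0] * 4))
    print(swiglu([1.0, 2.0], [[1.0, 0.0]], [[0.0, 1.0]]))
```

The RoPE demo shows why no separate position-embedding table is needed: because each pair is only rotated, the dot product between a rotated query and a rotated key depends on the difference of their positions rather than on the absolute positions.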