I followed Andrej Karpathy's GPT tutorial: https://youtu.be/kCc8FmEb1nY?si=E5Sp1cNuphxl-LIX
I made minor modifications, mostly following his nanoGPT repo: https://github.com/karpathy/nanoGPT
I commented the code extensively, though the comments may not be helpful to anyone but me. They are mostly my thoughts as I reasoned through what each piece was and why it was there.
I plan to continue with Andrej's resources, such as the checkpoints in nanoGPT and his tokenizer video.