- ConvNetJS, Deep Learning in Your Browser
- TensorFlow Examples
- Code for Hands-On ML with Scikit-Learn, Keras, and TensorFlow
- TensorFlow Tutorials
- Code for ML with PyTorch and Scikit-Learn
- Large Model Parallelism Notebook
- Data Science IPython Notebooks
- Code for Artificial Intelligence: A Modern Approach
- Christopher Olah's Blog
- Code for the Neural Networks and Deep Learning book
- ML Fundamentals lectures
- The Unreasonable Effectiveness of RNNs, Andrej Karpathy
- Minimal and Clean RL examples
- Python NumPy Tutorial, Stanford
- Which GPUs to get for Deep Learning
- The Bitter Lesson
- Deep Learning Tuning Playbook, Google Research
- Micrograd, Andrej Karpathy (a favorite <3)
- Automatic Differentiation, Mark Saroufim
- Machine Learning Tutorials
- What is Softmax? Sebastian Raschka
- Yes you should understand backprop, Andrej Karpathy
- Train with mixed-precision, Nvidia
- The Vanishing Gradient Problem
- Image Kernels
- Distill
- Making Deep Learning go Brrr from First Principles
- Circuits in NNs, OpenAI
- Too much efficiency makes everything worse: overfitting and the strong version of Goodhart's law
- Diffusion Models from Scratch
- Perspectives on Diffusion
- The Matrix Calculus You Need For Deep Learning, Terence Parr and Jeremy Howard
- Attention is Off By One, Evan Miller
- RWKV Explained
- The Hardware Lottery, Sara Hooker
- Recurrent Neural Networks, Stanford CS-230
- Hyena Hierarchy: Towards Larger Convolutional Language Models
- Simplifying S4
- No SRAM Scaling Implies More Expensive CPUs and GPUs - TSMC
- Optimizing LLM Latency, Hamel Husain
- SLURM Survival Guide
- LLM Training, Replit
- RLHF, Chip Huyen
For more, check out the starred repos on my profile ;)