My attempt to demystify neural networks in a journey of understanding LLMs
I am a big dummy. I need to understand things visually and from scratch. That means re-plotting things like basic mathematical functions and getting better at using NumPy, Pandas, PyTorch, Matplotlib, and Jupyter notebooks. The final objective is to understand how the much-revered transformer works.
Start from the basics: understand hyperplanes, gradient descent, and activation functions, then slowly move up to neural networks (even basic stuff like perceptrons). Eventually get to attention and LLMs.
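As a first toy exercise along those lines, here is a minimal sketch of gradient descent on a made-up one-variable function, f(x) = (x - 3)^2. The function, learning rate, and step count are all arbitrary choices for illustration, not part of any particular course or library:

```python
# Toy gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.
# The derivative is f'(x) = 2 * (x - 3); we repeatedly step downhill.
def gradient_descent(lr=0.1, steps=100):
    x = 0.0  # arbitrary starting point
    for _ in range(steps):
        grad = 2 * (x - 3)  # derivative of (x - 3)^2 at the current x
        x -= lr * grad      # move against the gradient
    return x

print(gradient_descent())  # converges very close to 3.0
```

Plotting f(x) alongside the sequence of x values in Matplotlib is a nice way to see the "rolling downhill" picture that the usual gradient-descent diagrams show.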
I am also pursuing a course on this through Udemy. It'll help speed things up.