rehanganapathy / llm-dive

Learning the building blocks of LLMs and how they work.

llm-dive

Learning the building blocks of LLMs and how they work. First, we build a bigram model, which predicts text one character at a time using only the previous character as context. Next, we implement the transformer architecture and train it on the OpenWebText corpus.
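To make the bigram idea concrete, here is a minimal count-based sketch in plain Python. It is not taken from the notebooks in this repository, which may well use a learned (neural) bigram model instead. The function names `train_bigram`, `next_char_probs`, and `generate`, along with the toy corpus, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def next_char_probs(counts, current):
    """Turn the raw follower counts for `current` into probabilities."""
    followers = counts.get(current, {})
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample one character at a time, conditioned only on the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # no observed follower for this character: stop early
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


# Toy corpus, purely illustrative (an assumption, not data from the repo).
corpus = "hello world, hello there"
model = train_bigram(corpus)
print(next_char_probs(model, "h"))
print(generate(model, "h", 20))
```

Because each prediction depends only on the single preceding character, the generated text quickly loses coherence. That limitation is what motivates moving on to a transformer, which can attend to a much longer context.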


Languages

Jupyter Notebook 100.0%