rehanganapathy / llm-dive

Learning the building blocks of LLMs and how they work.

llm-dive

Learning the building blocks of LLMs and how they work. First, we build a bigram model, which predicts text one character at a time from the preceding character. Next, we model the transformer architecture and train it on the OpenWebText corpus.
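
To make the bigram idea concrete, here is a minimal sketch in plain Python. It is not the code from the notebooks: it assumes a count-based model (a neural version would learn the same kind of table as an embedding), and the names `train_bigram`, `next_char_probs`, `generate`, and the toy `corpus` string are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def next_char_probs(counts, current):
    """Turn the raw follower counts for `current` into probabilities."""
    followers = counts.get(current, Counter())
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample one character at a time, conditioned only on the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # this character was never seen with a successor
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


corpus = "hello world, hello there"  # toy stand-in for a real training corpus
model = train_bigram(corpus)
print(next_char_probs(model, "h"))
print(generate(model, "h", 20))
```

Because each prediction depends only on the single previous character, the generated text quickly loses coherence, which is the limitation the transformer's longer context is meant to address.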

Languages

Jupyter Notebook 100.0%