Learning the building blocks of LLMs and how they work. First we build a bigram model, which predicts one character at a time. Next, we implement the transformer architecture and train it on the OpenWebText dataset.
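As a rough illustration of the bigram idea (a minimal counting sketch, not the trained neural version; the tiny corpus string here is a made-up stand-in), each character is predicted purely from the one before it:

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus (assumption: any text works as training data).
text = "hello world, hello there"

# Count how often each character follows each other character.
counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def predict_next(ch):
    """Return the most frequent character observed after `ch`."""
    return counts[ch].most_common(1)[0][0]

print(predict_next("h"))  # -> 'e'
```

A trained bigram language model does the same thing with learned probabilities (e.g. a lookup table of logits optimized with cross-entropy) instead of raw counts, and samples from the distribution rather than always taking the most frequent character.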