Torch and tinygrad implementations of Karpathy's nanoGPT, trained on the Tiny Shakespeare text.
The tinygrad version is much slower than the PyTorch one, probably because I am missing some detail about tinygrad. Even though it uses CUDA, there seems to be room for optimization here.