karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Question about memory usage for play_math

pablogranolabar opened this issue

I've been experimenting with minGPT / play_math to see whether multiplication is possible. I've got a somewhat anemic GTX 1060 with only 6GB of memory. When I try to increase the sequence width, ndigit = 3 works, but anything above that results in a SIGKILL, which I am assuming is GPU OOM. What's weird is that with ndigit = 3 the GPU is occupied with only 705MiB, so why would ndigit = 4 result in OOM?

My goal right now is simple multiplication of 24-bit integers. Any advice on model refinements would be greatly appreciated.

Great project karpathy ;)
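
One possible explanation can be checked with some quick arithmetic. The sketch below assumes that the dataset enumerates every operand pair up front and stores one int64 index per pair (for example through a NumPy permutation). That assumption is not confirmed in this thread, and the helper names `num_pairs` and `index_bytes` are hypothetical. It estimates how much host RAM, not GPU memory, such an index would need as ndigit grows.

```python
# Back-of-the-envelope estimate of the host RAM needed to index every operand pair.
# Assumption (not confirmed here): the dataset enumerates all (10**ndigit)**2
# pairs and keeps one int64 index per pair. Function names are hypothetical.

BYTES_PER_INDEX = 8  # size of one int64


def num_pairs(ndigit: int) -> int:
    """Number of (a, b) operand pairs when each operand has up to ndigit decimal digits."""
    return (10 ** ndigit) ** 2


def index_bytes(ndigit: int) -> int:
    """Bytes needed to hold one int64 index per operand pair."""
    return num_pairs(ndigit) * BYTES_PER_INDEX


if __name__ == "__main__":
    for ndigit in (2, 3, 4, 5):
        mib = index_bytes(ndigit) / 2**20
        print(f"ndigit={ndigit}: {num_pairs(ndigit):,} pairs, ~{mib:,.1f} MiB of indices")

    # 24-bit operands go up to 2**24 - 1 = 16,777,215, which has 8 decimal digits.
    pairs_24bit = (2 ** 24) ** 2
    print(f"24-bit operands: {pairs_24bit:,} pairs")
```

Under that assumption, each extra digit multiplies the number of pairs by 100, so ndigit = 4 would need about 800 MB of host memory for the index array alone, regardless of how little GPU memory the model uses. A SIGKILL usually comes from the operating system's out-of-memory killer rather than from CUDA, which normally surfaces as a Python exception, so checking system RAM may be worthwhile. For 24-bit operands, exhaustive enumeration is out of reach, and sampling operand pairs on the fly would be one alternative.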