karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

How to determine `warmup_tokens` and `final_tokens`?

fgolemo opened this issue

Hey folks,

Thanks a lot for this implementation @karpathy! I was wondering how you got the values in the addition example:

warmup_tokens=1024,
final_tokens=50 * len(train_dataset) * (ndigit + 1),

And how would one estimate these for a different task (e.g. based on vocabulary size, number of epochs, etc.)?

Cheers,
Florian
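
For anyone reading along, here is a rough sketch of one way to read the expression quoted above. It assumes that `final_tokens` is the total number of predicted (non-masked) target tokens seen over the whole run, i.e. epochs × examples × predicted tokens per example, and that the learning rate warms up linearly over `warmup_tokens` and then follows a cosine decay until `final_tokens`. The helper names `estimate_final_tokens` and `lr_multiplier`, the 10% floor, the clamp at the end of the schedule, and the example numbers (9000 training examples, `ndigit = 2`) are illustrative assumptions, not values taken from the repository.

```python
import math


def estimate_final_tokens(epochs, num_examples, predicted_tokens_per_example):
    """Total number of target tokens the model is trained on over the run.

    Assumption: only tokens that contribute to the loss are counted.
    """
    return epochs * num_examples * predicted_tokens_per_example


def lr_multiplier(tokens_seen, warmup_tokens, final_tokens, floor=0.1):
    """Linear warmup followed by cosine decay, keyed on tokens processed.

    Hypothetical helper: the floor and the clamp are choices made for
    this sketch.
    """
    if tokens_seen < warmup_tokens:
        # Linear ramp from 0 up to 1 during warmup.
        return tokens_seen / max(1, warmup_tokens)
    # Fraction of the decay phase completed, clamped so the schedule
    # stays at the floor if training runs past final_tokens.
    progress = (tokens_seen - warmup_tokens) / max(1, final_tokens - warmup_tokens)
    progress = min(1.0, progress)
    return max(floor, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Illustrative numbers only: 50 epochs, 9000 examples, ndigit = 2,
# so ndigit + 1 = 3 predicted tokens per example.
final_tokens = estimate_final_tokens(50, 9000, 2 + 1)
warmup_tokens = 1024

print(final_tokens)
print(lr_multiplier(512, warmup_tokens, final_tokens))
print(lr_multiplier(final_tokens, warmup_tokens, final_tokens))
```

Under these assumptions, adapting to a different task would mean replacing `ndigit + 1` with however many target tokens per example count towards the loss, and picking `warmup_tokens` as a small fraction of the resulting total.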