karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Meaning of "-1 because very last digit doesn't plug back"

vwxyzjn opened this issue

Hi, this is an awesome repository. I was reading through AdditionDataset and noticed that the block size is calculated as follows:

# +1 due to potential carry overflow, but then -1 because very last digit doesn't plug back
self.block_size = ndigit + ndigit + ndigit + 1 - 1

The meaning of "then -1 because very last digit doesn't plug back" wasn't exactly clear to me. Did you mean "but then -1 because the last digit doesn't plug back and needs to be predicted"?
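
To make the arithmetic concrete, here is a minimal sketch of one possible reading of that formula. It assumes each example is rendered as the zero-padded digits of a, b and a+b concatenated, and that the model input is every token except the last while the targets are the same tokens shifted by one. The helper name `encode_example` is hypothetical and not taken from the repository.

```python
def encode_example(a, b, ndigit):
    """Hypothetical helper: render a + b = c as a flat list of digit tokens."""
    c = a + b
    # a and b take ndigit digits each; c takes ndigit + 1 because of a possible carry
    render = f"{a:0{ndigit}d}{b:0{ndigit}d}{c:0{ndigit + 1}d}"
    return [int(ch) for ch in render]


ndigit = 2
tokens = encode_example(85, 50, ndigit)  # digits of "85", "50", "135"

x = tokens[:-1]  # assumed model input: everything except the very last digit
y = tokens[1:]   # assumed targets: the same sequence shifted left by one

# Same expression as in the snippet quoted above
block_size = ndigit + ndigit + ndigit + 1 - 1

print(len(tokens), len(x), block_size)
```

Under these assumptions the full sequence has 3 * ndigit + 1 tokens, but the final digit only ever appears as a target and is never fed back in as input, so the input length (and hence the block size) is one shorter.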