karpathy / llm.c

LLM training in simple, raw C/CUDA

Repository from GitHub: https://github.com/karpathy/llm.c

void tokenizer_init failed

Bing1002 opened this issue

allocated 474 MiB for model parameters
train_gpt2fp32cu: train_gpt2_fp32.cu:1815: void tokenizer_init(Tokenizer*, const char*): Assertion `header[1] == 1' failed.
[1] 2229854 abort (core dumped) ./train_gpt2fp32cu

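Try rebuilding your data files with train_gpt2.py. The tokenizer headers have changed, so a tokenizer file written by an older version of the script no longer passes the version check in tokenizer_init.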
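For anyone who wants to confirm a version mismatch before regenerating, here is a minimal sketch of a non-aborting header check. It assumes the tokenizer file begins with a block of 32-bit integers whose second entry (header[1], the one named in the assertion above) is the format version. The 256-word header length, the helper names, and the status codes are illustrative assumptions, not taken from the repository.

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: the header is 256 32-bit words long. Adjust if your file differs. */
#define HEADER_INTS 256

typedef enum {
    HEADER_OK = 0,
    HEADER_UNREADABLE = 1,       /* missing file or short read */
    HEADER_VERSION_MISMATCH = 2  /* header[1] is not the expected version */
} HeaderStatus;

/* Hypothetical helper: compares header[1] with the version the binary expects. */
HeaderStatus check_header_version(const uint32_t *header, uint32_t expected_version) {
    if (header == NULL) {
        return HEADER_UNREADABLE;
    }
    return header[1] == expected_version ? HEADER_OK : HEADER_VERSION_MISMATCH;
}

/* Hypothetical helper: reads the header from disk and reports a mismatch
   on stderr instead of aborting the way assert() does. */
HeaderStatus check_tokenizer_file(const char *path, uint32_t expected_version) {
    uint32_t header[HEADER_INTS];
    FILE *file = fopen(path, "rb");
    if (file == NULL) {
        return HEADER_UNREADABLE;
    }
    size_t words_read = fread(header, sizeof(uint32_t), HEADER_INTS, file);
    fclose(file);
    if (words_read != HEADER_INTS) {
        return HEADER_UNREADABLE;
    }
    HeaderStatus status = check_header_version(header, expected_version);
    if (status == HEADER_VERSION_MISMATCH) {
        fprintf(stderr, "%s: found header version %u, expected %u; regenerate the file\n",
                path, (unsigned)header[1], (unsigned)expected_version);
    }
    return status;
}
```

Calling check_tokenizer_file with the path to your tokenizer file and the version your build expects (1, going by the assertion in the log) returns HEADER_VERSION_MISMATCH when the file was written in a different format, which is the case where rerunning the Python script should help.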

Hey @Bing1002, if you're no longer facing this problem, feel free to close the issue. Otherwise, try rerunning the Python script and then run your C program again; it should be fine.

Thank you for your response.