Tokenization on-the-fly without slowdown
sagadre opened this issue · comments
Benchmark on-the-fly tokenization and get training with it to be as fast as training on pre-tokenized data
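A minimal sketch of how such a benchmark could be set up, using only the standard library. Everything here is hypothetical and not taken from the repository: `toy_tokenize` is a stand-in for a real tokenizer, and `measure_throughput` only times iteration over the data, with no model step. The idea is to feed the same documents through both paths and compare tokens per second.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one deterministic integer id per whitespace-separated word."""
    return [sum(word.encode("utf-8")) % 50_000 for word in text.split()]


def tokenize_on_the_fly(docs: Iterable[str]) -> Iterator[List[int]]:
    """Tokenize each document lazily, at the moment the consumer asks for it."""
    for doc in docs:
        yield toy_tokenize(doc)


def measure_throughput(batches: Iterable[List[int]]) -> Tuple[int, float]:
    """Consume every batch and return (total tokens seen, elapsed seconds)."""
    start = time.perf_counter()
    total_tokens = 0
    for batch in batches:
        total_tokens += len(batch)
    elapsed = time.perf_counter() - start
    return total_tokens, elapsed


if __name__ == "__main__":
    docs = ["the quick brown fox jumps over the lazy dog"] * 100_000

    # Pre-tokenized path: tokenization happens up front and is not timed.
    pretokenized = [toy_tokenize(doc) for doc in docs]
    n_pre, t_pre = measure_throughput(iter(pretokenized))

    # On-the-fly path: tokenization cost is paid inside the timed loop.
    n_fly, t_fly = measure_throughput(tokenize_on_the_fly(docs))

    print(f"pre-tokenized: {n_pre} tokens in {t_pre:.3f}s")
    print(f"on-the-fly:    {n_fly} tokens in {t_fly:.3f}s")
```

In a real run the timed loop would also include the training step, since the goal is for tokenization to stay hidden behind compute rather than to match raw iteration speed.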
A repository for research on medium-sized language models.