yandex / YaLM-100B

Pretrained language model with 100B parameters

[NL] token

TatianaShavrina opened this issue

What's the [NL] token appearing in generation?
Is it an artifact or a special token?

It's the newline token. You can replace it with \n.

Yes, it's the newline token in our tokenizer. To make this clearer, we have just added a mapping (deb045d) that converts it back to \n after detokenization (tokenization already handled it). Thank you for noticing!
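
If you are on a version from before that commit, or you are post-processing text that was already generated, the replacement suggested above can be sketched roughly as follows. This is only an illustration and not the code from deb045d: the names `NL_TOKEN` and `restore_newlines` are made up for the example, and it assumes the token shows up in the output as the literal string [NL].

```python
# Hypothetical helper, not part of the YaLM-100B repository.
# Assumption: the newline token appears in generated text as the literal "[NL]".
NL_TOKEN = "[NL]"


def restore_newlines(text: str, nl_token: str = NL_TOKEN) -> str:
    """Replace every occurrence of the newline placeholder with a real newline."""
    return text.replace(nl_token, "\n")


generated = "First line[NL]Second line[NL][NL]New paragraph"
print(restore_newlines(generated))
```

A plain string replacement leaves any spaces around the token untouched, so you may want to strip those separately if your output contains them.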