InternLM / InternLM

Official release of InternLM2 7B and 20B base and chat models. 200K context support

Home Page: https://internlm.intern-ai.org.cn/


[Bug] internlm2 emits content with [UNUSED_TOKEN_145] at times

gaord opened this issue

Describe the bug

I am running a quantized internlm2-chat-20b with llama.cpp, using the prompt template as described here. Chatting works very well, but at times the model emits [UNUSED_TOKEN_145].
When I change the stop word from <|im_end|> to "[UNUSED_TOKEN_145]", every AI message gets an extra ending string appended (the <eoh> visible in the screenshot below).
[screenshot: chat responses ending with <eoh>]

BTW, the quantization was done with the workaround of disabling rope scaling.

It looks like there may be a bug in the model's configuration that causes this behavior.
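
For reference, here is a minimal sketch of how I pass both ending strings as stop words when serving the GGUF through llama-cpp-python; this is illustrative only, and the model filename, context size, and prompt are placeholders, not my exact setup:

```python
# Sketch: run the quantized model and stop on either ending form.
# Model path and settings below are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="internlm2-chat-20b-q4_k_m.gguf", n_ctx=4096)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Pass both ending strings as stop words so generation halts whether the model
# emits the new <|im_end|> marker or the legacy [UNUSED_TOKEN_145] surface form.
out = llm(prompt, max_tokens=256, stop=["<|im_end|>", "[UNUSED_TOKEN_145]"])
print(out["choices"][0]["text"])
```

Even with both stop words set, the extra ending string still shows up in the output at times.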

Environment

Mac m2 ultra
pytorch-lightning 2.1.0
torch 2.1.2
torchaudio 2.1.0
torchmetrics 1.2.0
torchvision 0.16.0

Other information

No response

This may be caused by the tokenizer config not being the latest version. Make sure the added_tokens_decoder in your tokenizer_config.json is the same as https://huggingface.co/internlm/internlm2-chat-20b/blob/main/tokenizer_config.json#L15
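
One way to check this locally is a quick script like the following (a sketch; the path is a placeholder for your local model directory):

```python
# Sketch: print the special-token entries from a local tokenizer_config.json
# so they can be compared against the Hugging Face file linked above.
import json

with open("path/to/internlm2-chat-20b/tokenizer_config.json") as f:
    cfg = json.load(f)

for token_id, entry in sorted(cfg.get("added_tokens_decoder", {}).items(),
                              key=lambda kv: int(kv[0])):
    print(token_id, entry["content"])
```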

Please also ensure that the special token id mapping of the tokenizer converted with llama.cpp is consistent with the following:

```json
{
  "<|plugin|>": 92538,
  "<|interpreter|>": 92539,
  "<|action_end|>": 92540,
  "<|action_start|>": 92541,
  "<|im_end|>": 92542,
  "<|im_start|>": 92543
}
```
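
A quick way to double-check the Hugging Face side of this mapping (a sketch, assuming a recent transformers with trust_remote_code enabled for InternLM2):

```python
# Sketch: verify that the Hugging Face tokenizer assigns the ids listed above.
from transformers import AutoTokenizer

expected = {
    "<|plugin|>": 92538,
    "<|interpreter|>": 92539,
    "<|action_end|>": 92540,
    "<|action_start|>": 92541,
    "<|im_end|>": 92542,
    "<|im_start|>": 92543,
}

tok = AutoTokenizer.from_pretrained("internlm/internlm2-chat-20b", trust_remote_code=True)
for token, token_id in expected.items():
    assert tok.convert_tokens_to_ids(token) == token_id, f"{token} id mismatch"
print("special token ids match")
```

If this passes but the GGUF still misbehaves, the mismatch was most likely introduced during the llama.cpp conversion step.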

In addition, the <eoh> ending word in the picture is strange: in both versions of the chat template, before and after the update, <eoh> has never been used. It should be either [UNUSED_TOKEN_145] or <|im_end|>. Is it possible that this <eoh> is set somewhere else, causing the model to predict it through few-shot learning? This is just my guess.
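
If you are assembling the prompt by hand, one option is to let the tokenizer's own chat template build it, so a stray <eoh> cannot come from the prompt itself. A sketch, assuming your tokenizer_config.json ships a chat_template and your transformers version supports apply_chat_template:

```python
# Sketch: build the InternLM2 chat prompt from the tokenizer's chat template
# instead of a hand-written template.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("internlm/internlm2-chat-20b", trust_remote_code=True)
messages = [{"role": "user", "content": "Hello!"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # should contain only <|im_start|>/<|im_end|> markers, never <eoh>
```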

What is your transformers version?
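
You can print it with, for example:

```python
# Report the library versions in the environment being used.
import torch
import transformers

print(transformers.__version__, torch.__version__)
```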

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 7 days if the stale label is not removed or if there is no further response.

This issue is closed because it has been stale for 7 days. Please open a new issue if you have similar issues or you have any new updates now.