liyucheng09 / Selective_Context

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.


How to work with Chinese?

SmileSmith opened this issue

When I select Chinese content, the result looks like raw bytes, for example \x8a

Are you using the gpt2 model? It might be a tokenizer issue. I will fix it soon.

Now you can specify the language you're working with. Choose zh to switch the base model from gpt2 to uer/gpt2-chinese-cluecorpussmall. Bug fixed.
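
For anyone wondering why the gpt2 base model can produce output like \x8a: GPT-2's tokenizer is a byte-level BPE, so one Chinese character (three bytes in UTF-8) can be spread over several tokens. The sketch below uses only the Python standard library to show what happens when such byte pieces are displayed one at a time. The split points are made up for illustration and are not the real gpt2 token boundaries, and it is an assumption that this is exactly what caused the output reported in this issue.

```python
# Illustration only: the split points below are hypothetical,
# not the actual token boundaries produced by the gpt2 tokenizer.
text = "中文"
raw = text.encode("utf-8")  # each of these characters is 3 bytes in UTF-8

# Pretend a byte-level tokenizer cut the 6 bytes into three 2-byte pieces,
# so no piece holds a complete character.
pieces = [raw[:2], raw[2:4], raw[4:]]


def decode_piece(piece: bytes) -> str:
    """Decode a piece as UTF-8, falling back to its byte repr if it is invalid."""
    try:
        return piece.decode("utf-8")
    except UnicodeDecodeError:
        return repr(piece)


# Shown piece by piece, the text degrades into byte escapes.
shown = [decode_piece(p) for p in pieces]

# Joined back together before decoding, the original text is recovered.
rejoined = b"".join(pieces).decode("utf-8")

print(shown)
print(rejoined)
```

A tokenizer whose vocabulary contains whole Chinese characters avoids this kind of split, which fits with the fix above of switching the base model for zh.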