ckiplab / ckip-transformers

CKIP Transformers

Home Page: https://ckip-transformers.readthedocs.io

Speed up tokenization.

emfomy opened this issue

HuggingFace's tokenizer can also return, for each token, its indices (character offsets) in the original text.
We could rewrite the tokenization step to use this feature instead of tokenizing character by character.

Use the tokenizer without calling tokenize (convert each character to its ID directly).
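A minimal sketch of that approach, using a toy vocabulary in place of a real tokenizer: each character is looked up directly and unknown characters fall back to the unknown-token ID. The function `chars_to_ids` and the vocabulary `toy_vocab` are hypothetical. With a real HuggingFace tokenizer the analogous call would presumably be `tokenizer.convert_tokens_to_ids(list(text))`; whether v0.2.0 does exactly this is not stated in this thread.

```python
from typing import Dict, List


def chars_to_ids(text: str, vocab: Dict[str, int], unk_token: str = "[UNK]") -> List[int]:
    """Look up each character in the vocabulary, skipping the tokenize step."""
    unk_id = vocab[unk_token]
    # One ID per character, so positions line up with the input text.
    return [vocab.get(ch, unk_id) for ch in text]


# Toy vocabulary (an assumption for illustration only).
toy_vocab = {"[UNK]": 0, "我": 1, "愛": 2, "你": 3}

ids = chars_to_ids("我愛他", toy_vocab)
```

Because the output has exactly one ID per input character, no extra alignment step is needed to relate predictions to the original text.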

Implemented in v0.2.0.