ChineseTokenizers
A Chinese tokenizer built on the Hugging Face Tokenizers library.
Additional features
- Jieba Pre-tokenizer
- ChineseWordPiece Model (based on Yuan-1.0)
Examples
Yuan Preprocessor
RAYON_NUM_THREADS=48 TOKENIZERS_PARALLELISM=1 cargo run --release --example yuan
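The two environment variables in the command above control parallelism: RAYON_NUM_THREADS sets the size of Rayon's global thread pool, and TOKENIZERS_PARALLELISM toggles parallel processing in Tokenizers. As a rough illustration of how such values can be interpreted, here is a minimal std-only sketch. The helper names `parse_thread_count` and `parse_flag` are hypothetical and not part of this crate, and the set of "off" spellings is an assumption rather than the exact rule used by either library.

```rust
use std::env;

/// Hypothetical helper: parse a thread-count value such as RAYON_NUM_THREADS.
/// Returns None when the value is missing, not a number, or zero,
/// in which case a library would fall back to its own default.
fn parse_thread_count(raw: Option<&str>) -> Option<usize> {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
}

/// Hypothetical helper: interpret an on/off flag such as TOKENIZERS_PARALLELISM.
/// Assumption: common "off" spellings disable the flag, anything else enables it,
/// and an unset variable leaves parallelism enabled.
fn parse_flag(raw: Option<&str>) -> bool {
    match raw {
        Some(v) => !matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "" | "0" | "off" | "false" | "f" | "no" | "n"
        ),
        None => true,
    }
}

fn main() {
    // Read the same variables that the example command sets.
    let threads = parse_thread_count(env::var("RAYON_NUM_THREADS").ok().as_deref());
    let parallel = parse_flag(env::var("TOKENIZERS_PARALLELISM").ok().as_deref());
    println!("threads = {:?}, parallelism = {}", threads, parallel);
}
```

With the settings from the command above, this sketch would report 48 threads with parallelism enabled. Pick a thread count that matches the cores available on your machine; 48 is only the value used in the example.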
Apache License 2.0