howl-anderson / MicroTokenizer

A lightweight, full-featured Chinese tokenizer that helps students understand how tokenizers work. MicroTokenizer is designed for educational and research purposes. It provides a practical, hands-on approach to understanding NLP concepts, featuring multiple tokenization algorithms and customizable models. Ideal for students, researchers, and NLP enthusiasts.

Home Page: https://nlp.xiaoquankong.ai

Repository from GitHub: https://github.com/howl-anderson/MicroTokenizer
