Jian Yuan's repositories
IKAnalyzer
An open-source word breaker with Lucene support.
996.ICU
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
ClickHouse
ClickHouse® is a free analytics DBMS for big data
elasticsearch-analysis-ik
The IK Analysis plugin integrates the Lucene IK analyzer into Elasticsearch and supports customized dictionaries.
elasticsearch-analysis-jieba
The plugin includes the `jieba` analyzer, `jieba` tokenizer, and `jieba` token filter, and offers two modes. The `index` mode is used when indexing a document; the `search` mode is used when running a search.
elasticsearch-hadoop
Elasticsearch real-time search and analytics natively integrated with Hadoop
HanLP
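Chinese language processing package: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, keyword extraction, automatic summarization, phrase extraction, pinyin, Simplified/Traditional Chinese conversion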
kkndme_tianya
The famous Tianya forum thread by kkndme on housing prices
learn-machine
IOL test code
ltp
Language Technology Platform
mlcsseg
A big supplement pack of Solr tokenizers, including IK, ANSJ, filters, and dynamic dictionary loading
snownlp
Python library for processing Chinese text
starrocks
StarRocks is a next-gen sub-second MPP database for full analysis scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Formerly known as DorisDB.
word
Distributed Chinese word segmentation component in Java - word segmentation