There are 97 repositories under the chinese-nlp topic.
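:orange_book: Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.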
Large Scale Chinese Corpus for NLP
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large language model)
A curated list of resources for Chinese NLP
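Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.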
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
Datasets and SOTA results for every field of Chinese NLP
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
:four_leaf_clover: Another Chinese chatbot implemented in PyTorch, which is a sub-module of an intelligent work-order processing robot. 👩‍🔧
An open-source Chinese-English educational chat model from ICALK, East China Normal University (general-purpose base model, GPU deployment, data cleaning). With thanks to: LLaMA, MOSS, BELLE, Ziya, vLLM
Models for spaCy that support Chinese
Rime Cantonese input schema (Cantonese romanization input scheme)
WeChat official account corpus
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Chinese NLP corpus of Chinese science fiction: about 4675 Chinese science fiction novels, usable as an NLP corpus, text corpus, or text database.
OCR for Chinese addresses in photographed documents, implemented using CTPN+CTC+Address Correction.
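A Chinese chatbot trained on 100,000 dialogue pairs with an attention mechanism; it generates a meaningful reply to most general questions. The trained model has been uploaded and can be run directly.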
NLP for humans. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
Chinese question-answer corpus from the PTT Gossiping board