KunWangR's starred repositories
nlp_xiaojiang
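Natural language processing (NLP): Xiaojiang bot (retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification, entity extraction (NER, bert+bilstm+crf), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, and calls through keras-http-service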
search_server
A Chinese and pinyin search-term service built on a trie
SearchTrie
Trie (implements simple prefix matching)
Pinyin2Hanzi
Pinyin to Chinese characters, a pinyin input method engine, pin yin -> 拼音
MAX-Chinese-Phonetic-Similarity-Estimator
Estimate the phonetic distance between Chinese words and get similar sounding candidate words.
pretrained-models
Open Language Pre-trained Model Zoo
SentenceSimilarity
The enhanced RCNN model used for sentence similarity classification
TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
awesome-bert
BERT NLP papers, applications, and GitHub resources, including the newest XLNet; papers and GitHub projects related to BERT and XLNet
wsdm_cup_2020_solution
First place solution of WSDM CUP 2020, pairwise-bert, lightgbm
SequentialEventExtration
Sequential event experiment based on travel notes crawled from XieCheng (Ctrip): collection of 500,000 Ctrip travel notes and construction of a sequential event graph.
Pinyin2Chinese
Self-implemented Pinyin2Chinese demo using algorithms including a Trie and an HMM: a simple demo of pinyin segmentation and pinyin-to-Chinese conversion based on a hidden Markov model and a trie.
pinyin2hanzi
End-to-end translation of Chinese phonetics to characters using bi-directional RNN (LSTM/GRU)
CLUEPretrainedModels
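A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity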
Task-Oriented-Dialogue-Research-Progress-Survey
A survey of datasets and methods for task-oriented dialogue, including recent datasets and SOTA leaderboards.
Chatbot_CN
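A chatbot for the finance and judicial domains (with some chit-chat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph; the front-end display is integrated with Django, and RESTful interfaces for NLP and KG have been wrapped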
LexiconAugmentedNER
Rejects complicated operations for incorporating a lexicon into Chinese NER.
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
keras_to_tensorflow
General code to convert a trained Keras model into an inference TensorFlow model