KunWangR's starred repositories

nlp_xiaojiang

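Natural language processing (NLP): the Xiaojiang chatbot (retrieval-based chit-chat), BERT sentence embeddings and similarity, XLNet sentence embeddings and similarity, text classification, entity extraction (NER, BERT+BiLSTM+CRF), data augmentation, synonym and paraphrase generation, sentence main-part extraction, Chinese short-text similarity, text feature engineering, and calling a keras-http-service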

Language: Python | License: MIT | Stargazers: 1515 | Issues: 0

rank_bm25

A Collection of BM25 Algorithms in Python

Language: Python | License: Apache-2.0 | Stargazers: 918 | Issues: 0

search_server

A search-term service for Chinese and pinyin, built on a trie

Language: Python | Stargazers: 1 | Issues: 0

SearchTrie

Trie (a simple prefix-matching implementation)

Language: Java | License: MIT | Stargazers: 7 | Issues: 0

pinyin4py

Converts Chinese characters to pinyin

Language: Python | Stargazers: 42 | Issues: 0

Pinyin2Hanzi

Converts pinyin to Chinese characters; a pinyin input method engine (pin yin -> 拼音)

Language: Python | Stargazers: 580 | Issues: 0

MAX-Chinese-Phonetic-Similarity-Estimator

Estimate the phonetic distance between Chinese words and get similar sounding candidate words.

Language: Python | License: Apache-2.0 | Stargazers: 34 | Issues: 0

faiss

A library for efficient similarity search and clustering of dense vectors.

Language: C++ | License: MIT | Stargazers: 29438 | Issues: 0

pretrained-models

Open Language Pre-trained Model Zoo

License: Apache-2.0 | Stargazers: 984 | Issues: 0

SentenceSimilarity

The enhanced RCNN model used for sentence similarity classification

Language: Python | Stargazers: 43 | Issues: 0

fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Language: Python | License: Apache-2.0 | Stargazers: 3046 | Issues: 0

TextBrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

Language: Python | License: Apache-2.0 | Stargazers: 1568 | Issues: 0

awesome-bert

BERT and XLNet NLP papers, applications, and GitHub resources, including the newest XLNet work

Stargazers: 1841 | Issues: 0

Language: Python | Stargazers: 278 | Issues: 0

text2vec

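text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.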

Language: Python | License: Apache-2.0 | Stargazers: 4259 | Issues: 0

wsdm_cup_2020_solution

First place solution of WSDM CUP 2020, pairwise-bert, lightgbm

Language: Python | Stargazers: 89 | Issues: 0

SequentialEventExtration

Sequential event experiment based on travel notes crawled from XieCheng (Ctrip): collection of 500,000 Ctrip travel notes and construction of a sequential event graph.

Language: Python | Stargazers: 174 | Issues: 0

Pinyin2Chinese

A self-implemented Pinyin2Chinese demo using a trie and a hidden Markov model (HMM): a simple demonstration of pinyin segmentation and pinyin-to-Chinese conversion.

Language: Python | Stargazers: 82 | Issues: 0

MPyWE

Morpheme, Pinyin Enhanced Word Embedding

Language: Python | Stargazers: 3 | Issues: 0

pinyin2hanzi

End-to-end translation of Chinese phonetics to characters using bi-directional RNN (LSTM/GRU)

Language: Python | Stargazers: 27 | Issues: 0

CLUEPretrainedModels

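A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity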

Language: Python | Stargazers: 793 | Issues: 0

Task-Oriented-Dialogue-Research-Progress-Survey

A survey of datasets and methods for task-oriented dialogue, including recent datasets and SOTA leaderboards.

Stargazers: 1240 | Issues: 0

Chatbot_CN

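A chatbot for the finance and legal domains (with some chit-chat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph. The front-end display is integrated with Django, and RESTful interfaces for the NLP and KG components have been wrapped.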

License: Apache-2.0 | Stargazers: 1273 | Issues: 0

LexiconAugmentedNER

Reject complicated operations for incorporating lexicon for Chinese NER.

Language: Python | Stargazers: 432 | Issues: 0

IRGAN

IRGAN: GAN for IR, SIGIR 2017, Thesis Introduction

Language: Jupyter Notebook | Stargazers: 9 | Issues: 0

IRGAN

IRGAN for QA

Language: Python | Stargazers: 1 | Issues: 0

Language: Python | Stargazers: 8 | Issues: 0

Jay_KG

A knowledge-graph question-answering system for information about Jay Chou's songs

Language: Python | Stargazers: 133 | Issues: 0

Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python | Stargazers: 2990 | Issues: 0

keras_to_tensorflow

General code to convert a trained Keras model into a TensorFlow inference model

Language: Python | License: MIT | Stargazers: 1666 | Issues: 0