Peng Cheng's repositories
NameEntityRecognition
Extract address
aho-corasick
Java implementation of Aho-Corasick, optimized for ASCII, with Unicode support.
ansj_fast_lda
Java implementation of LDA
cnn-text-classification-tf
Convolutional Neural Network for Text Classification in TensorFlow
CNN-Text-Pairs-Classification
Sentence-level text pair classification (similarity modeling) based on CNN.
deep-siamese-text-similarity
TensorFlow-based implementation of a deep Siamese LSTM network to capture phrase/sentence similarity using character/word embeddings
dgk_lost_conv
dgk_lost_conv: Chinese conversation corpus
Dialog_Corpus
Datasets for training Chinese and English chatbot systems
DocumentClassification
This code implements a simple CNN model for document classification with TensorFlow.
insuranceqa-corpus-zh
Open data in the insurance domain for machine learning tasks
multi-class-text-classification-cnn-rnn
Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and word embeddings on TensorFlow.
sif-java
Implementation of ICLR 2017 "sentence embedding by Smooth Inverse Frequency weighting scheme" in Java.
SkewJoin
Hadoop join optimization for skewed data
text_classification
All kinds of text classification models and more with deep learning
vector-search-plugin
Elasticsearch plugin for fast nearest-neighbour search over vectors (similar in use to FAISS)