Berlin_hsin's repositories
classification-and-cluster
Python implementations of KMeans, KNN, and hierarchical clustering.
PriorityTrie
PriorityTrie
conceptnet5
Code for building ConceptNet from raw data.
HanLP
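Natural language processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, keyword extraction, new word discovery, phrase extraction, automatic summarization, text classification, pinyin and Simplified/Traditional Chinese conversion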
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
PyTorchText
1st place solution for the Zhihu Machine Learning Challenge. Implementation of various text-classification models. (First-place solution for the Zhihu Kanshan Cup)
sketchat
An online discussion room project.
spark
Mirror of Apache Spark
topical_word_embeddings
Demo code for topical word embeddings
unicodeSymbol
Maintains a list of symbols found in Chinese text that should be replaced during text mining.
Vigen-re_python
A simple encoder and decoder for the Vigenère cipher
xgoogle
Python library for Google services (Google Search, Google Sets, Google Translate, sponsored links)