Chongchen Chen's repositories
brat
brat rapid annotation tool (brat) - for all your textual annotation needs
captcha_crack
Cracker for character-selection CAPTCHAs; tested on NetEase and Geetest, with a crack rate of 99%
Chinese-Dependency-Treebank-with-Ellipsis
An Ellipsis-aware Chinese Dependency Treebank for Web Text
darknet
Convolutional Neural Networks
fastText
Library for fast text representation and classification.
libgen
Automatic Crystal bindings generator
libpostal
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
miscellaneous_spiders
Miscellaneous spiders
mordecai
Full text geoparsing as a Python library
nlp-datasets
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
poplar-trie
C++17 library of associative arrays with string keys based on a dynamic path-decomposed trie
pysrilm
An extremely simple Python wrapper for the SRI Language Modeling toolkit
QBASHER
Inverted file indexing and retrieval optimized for short texts. Supports auto-suggest and query segment classification.
simhash-py
Simhash and near-duplicate detection
snorkel
A system for quickly generating training data with weak supervision
sub_hash
Fast computation of substring hash values
text_io
Converts binary I/O into text I/O
tommyds
A C library of hashtables and tries designed to store objects with high performance
tre
The approximate regex matching library and agrep command line tool.
TurboPFor
Fastest Integer Compression
X-Fast-Trie
X Fast Trie, a kind of data structure. Trie = Try and Die:)