danny's repositories
Algorithm_Interview_Notes-Chinese
2018/2019/campus recruiting/spring recruiting/autumn recruiting/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes
awesome-public-datasets
A topic-centric list of HQ open datasets.
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
awesome-receipt-data-extraction
A curated list (with summaries) of awesome research publications on the topic of data extraction from photos of receipts.
caffe_ocr
An experimental project for researching mainstream OCR algorithms; it currently implements the CNN+BLSTM+CTC architecture.
char-rnn-tensorflow
Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using TensorFlow
Chinese-Names-Corpus
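Chinese names corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Can be used for Chinese word segmentation and person-name entity recognition.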
Chinese-PreTrained-XLNet
Pre-trained Chinese XLNet model
DocBank
DocBank: A Benchmark Dataset for Document Layout Analysis
fastText
Library for fast text representation and classification.
featuretools
Automated feature engineering
GNNPapers
Must-read papers on graph neural networks (GNN)
LaTeX_OCR
:gem: Mathematical formula recognition
LaTeX_OCR_PRO
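:art: Enhanced mathematical formula recognition: handwritten and printed formulas in Chinese and English, with support for elementary symbolic derivation (data structures based on the LaTeX abstract syntax tree)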
Macropodus
Macropodus, a natural language processing (NLP) toolkit based on the Albert+BiLSTM+CRF deep learning architecture: Chinese word segmentation (CWS), part-of-speech tagging (POS), named entity recognition (NER), new word discovery (Find), keyword extraction (Keyword), text summarization (Summarize), text similarity (Sim), scientific calculator (Calculate), conversion between Chinese numerals and Arabic or Roman numerals (Chi2num), Traditional/Simplified Chinese conversion, and pinyin conversion.
rnnlm
Recurrent Neural Network Language Modeling (RNNLM) Toolkit
state-of-the-art-result-for-machine-learning-problems
This repository provides state of the art (SoTA) results for all machine learning problems. We do our best to keep this repository up to date. If you find that a problem's SoTA result is out of date or missing, please raise an issue or submit the Google form (with this information: research paper name, dataset, metric, source code and year). We will fix it immediately.
tesseract
Tesseract Open Source OCR Engine (main repository)
TimeSformer-pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
transferlearning-tutorial
LaTeX source of the book 《迁移学习简明手册》 (A Concise Handbook of Transfer Learning)
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
tsfresh
Automatic extraction of relevant features from time series
word-rnn-tensorflow
Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.
word2vec_commented
Commented (but unaltered) version of original word2vec C implementation.