Zhengxiang Wang's repositories
Notes-for-Stanford-CS224N-NLP-with-Deep-Learning
Notes for Stanford CS224N: Natural Language Processing with Deep Learning.
ChineseNLPCorpus
Chinese natural language processing datasets, material for everyday experiments. Additions via pull requests are welcome.
Chinese-Synonyms
A large, high-quality corpus of Chinese synonyms.
gender-predictor
Predicting the gender of given Chinese names (93% to 99% test set accuracy).
ChineseNgrams
Mandarin Chinese n-gram counts from large-scale corpora.
Chinese-fixed-phrases-idioms
A large corpus of Chinese fixed phrases and idioms scraped from a reputable educational website (30,310 instances).
HELPtk
Historical English Language Processing Toolkit: An efficient toolkit and a general framework for processing Early Modern and Modern English in XML, and much more. With just a few lines of code, it can tokenize, normalize, and annotate a typical XML corpus of a few million tokens in a few minutes. It is also easy to adapt.
linguistic-knowledge-in-DA-for-NLP
Source code, data, and results for my paper "Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching".
dl-nlp-using-paddlenlp
Notes on PaddleNLP, a state-of-the-art deep-learning-based NLP toolkit.
GSchoolarAnalyzer
Auto-aggregating academic profiles of researchers on Google Scholar.
text-classification-explained
Building and training deep learning models for text classification tasks from scratch using paddle, PyTorch, and TensorFlow.
BASE_SLN_Pause_Project
This repository stores Python scripts created for the BASE silent pause project.
text-matching-explained
Building and training deep learning models for text matching classification tasks from scratch using paddle, PyTorch, and TensorFlow.
rnn-seq2seq-learning
RNN seq2seq models learning transductions and alignments
rnn-transduction
Using RNNs in modelling transduction tasks. Customized training and inference pipelines are provided.
competitions_logs
My competition records, mostly related to NLP.
d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 200 universities.
detect-influence-campaigns
Clustering Document Parts: Detecting and Characterizing Influence Campaigns from Documents
hands-on-gradients-derivation-for-ml-dl-loss-func
Hands-on gradient derivations for common supervised machine learning and deep learning loss functions.
Lstar_Python
Python implementation of the L* algorithm by Angluin (1987).
Misc_Notes
Miscellaneous study notes from the past that are in good shape.
PyTorch_Tutorial
Simple PyTorch Tutorial for a guest lecture I gave. Suitable for beginners.
re-implementation-of-subreg-deeplearning
Re-implementations (code & results) of the paper "Subregular Complexity and Deep Learning", which disagree with the original observations.
rnn-seq2seq-transduction
Using RNN seq2seq models in modelling transduction tasks. Customized training and inference pipelines are provided.
text-augmentation-techniques
Common approaches to text augmentation, from random text-editing perturbations and back translation to model-based transformations.
word_guesser_for_wordle-like_word_games
A simple and easy-to-use program to help you play Wordle-like word games.