ner embedding classification similarity keras tensorflow lda gensim fasttext svm bert elmo word2vec crf attention

nlp journey

All implemented in TensorFlow 2.0, codes

1. Basics

2. Books

Handbook of Graphical Models. online
Deep Learning. online
Neural Networks and Deep Learning. online
Speech and Language Processing. online

3. Papers

01) Transformer papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. paper
GPT-2: Language Models are Unsupervised Multitask Learners. paper
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. paper
XLNet: Generalized Autoregressive Pretraining for Language Understanding. paper
RoBERTa: A Robustly Optimized BERT Pretraining Approach. paper
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. paper
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. paper
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. paper
GPT-3: Language Models are Few-Shot Learners. paper

02) Models

LSTM (Long Short-Term Memory). paper
Sequence to Sequence Learning with Neural Networks. paper
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. paper
Residual Network (Deep Residual Learning for Image Recognition). paper
Dropout (Improving neural networks by preventing co-adaptation of feature detectors). paper
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. paper

03) Summaries

An overview of gradient descent optimization algorithms. paper
Analysis Methods in Neural Language Processing: A Survey. paper
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. paper
A Gentle Introduction to Deep Learning for Graphs. paper
A Survey on Deep Learning for Named Entity Recognition. paper
More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction. paper
Deep Learning Based Text Classification: A Comprehensive Review. paper
Pre-trained Models for Natural Language Processing: A Survey. paper
A Survey on Contextual Embeddings. paper
A Survey on Knowledge Graphs: Representation, Acquisition and Applications. paper
Knowledge Graphs. paper

04) Pre-training

A Neural Probabilistic Language Model. paper
word2vec Parameter Learning Explained. paper
Language Models are Unsupervised Multitask Learners. paper
An Empirical Study of Smoothing Techniques for Language Modeling. paper
Efficient Estimation of Word Representations in Vector Space. paper
Distributed Representations of Sentences and Documents. paper
Enriching Word Vectors with Subword Information (FastText). paper
GloVe: Global Vectors for Word Representation. online
ELMo (Deep contextualized word representations). paper
Pre-Training with Whole Word Masking for Chinese BERT. paper

05) Classification

Bag of Tricks for Efficient Text Classification (FastText). paper
Convolutional Neural Networks for Sentence Classification. paper
Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. paper

06) Text generation

A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. paper
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. paper

07) Text Similarity

Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. paper
Learning Text Similarity with Siamese Recurrent Networks. paper
A Deep Architecture for Matching Short Texts. paper

08) QA

A Question-Focused Multi-Factor Attention Network for Question Answering. paper
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. paper
A Knowledge-Grounded Neural Conversation Model. paper
Neural Generative Question Answering. paper
Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. paper
Modeling Multi-turn Conversation with Deep Utterance Aggregation. paper
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. paper
Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. paper

09) NMT

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. paper
Neural Machine Translation by Jointly Learning to Align and Translate. paper
Transformer (Attention Is All You Need). paper

10) Summarization

Get To The Point: Summarization with Pointer-Generator Networks. paper
Deep Recurrent Generative Decoder for Abstractive Text Summarization. paper

11) Relation extraction

Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. paper
Neural Relation Extraction with Multi-lingual Attention. paper
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. paper
End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. paper

12) Large Language Models

Training language models to follow instructions with human feedback. paper
LLaMA: Open and Efficient Foundation Language Models. paper

4. Articles

How to Learn Natural Language Processing (Comprehensive Edition). url
Transformers from Scratch. url
The Illustrated Transformer. url
Attention-based-model. url
Modern Deep Learning Techniques Applied to Natural Language Processing. url
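Unbelievable! LSTM and GRU Have Never Been Explained So Clearly (Animations + Video). url
From Language Models to Seq2Seq: Transformer Is Like a Play, It All Depends on the Mask. url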
Applying word2vec to Recommenders and Advertising. url
The Complete 2019 NLP Roundup: Papers, Blogs, Tutorials, and Engineering Progress. url

5. Github

CLUE. github
transformers. github
HanLP. github
ML-For-Beginners. github

6. Blog

About

Documents, papers and codes related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.

https://github.com/msgi/nlp-journey

Apache License 2.0

Languages

Language: Python 100.0%