Must-Read NLP Papers (up to 2020)

This repository collects important NLP papers and well-explained materials that everyone working in the field should know about and read.

I have also implemented several state-of-the-art NLP models. You can find them in my repositories:

  • Neural Network Language Model (NNLM)
  • Attention Is All You Need (Transformer)

Highlights of this repo:

  • NLP: Pretrained Language Models, Machine Translation, Text Summarization
  • CV: Image-to-image Translation
  • Learning Algorithm: Meta Learning

Index

Overview

  • Yongjun Hong, et al. How Generative Adversarial Networks and Their Variants Work: An Overview. ACM 2019. [ACM]

  • Samuel L. Smith, et al. Don't Decay the Learning Rate, Increase the Batch Size. ICLR 2018. [ICLR]

Clustering & Word Embeddings

  • Peter F Brown, et al. Class-Based n-gram Models of Natural Language. 1992. [ACL Anthology]

  • Tomas Mikolov, et al. Efficient Estimation of Word Representations in Vector Space. 2013. [ArXiv]

  • Tomas Mikolov, et al. Distributed Representations of Words and Phrases and their Compositionality. NIPS 2013. [ArXiv]

  • Quoc V. Le and Tomas Mikolov. Distributed Representations of Sentences and Documents. 2014. [ArXiv]

  • Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation. 2014. [ACL Anthology]

  • Piotr Bojanowski, et al. Enriching Word Vectors with Subword Information. 2017. [ACL Anthology]

Cross-lingual Learning

  • Junjie Hu, et al. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization. 2020. [ArXiv]

Evaluation Metric

  • Kishore Papineni, et al. BLEU: a Method for Automatic Evaluation of Machine Translation. ACL 2002. [CiteSeer]

  • Chin-Yew Lin. ROUGE: A Package for Automatic Evaluation of Summaries. ACL 2004. [ACL Anthology]

Event Recognition

  • Amosse Edouard. Event Detection and Analysis On Short Text Messages. 2018. [ResearchGate]

  • Deepayan Chakrabarti and Kunal Punera. Event Summarization Using Tweets. ICWSM 2011. [ResearchGate]

  • Maria Vargas-Vera and David Celjuska. Event Recognition on News Stories and Semi-Automatic Population of an Ontology. Web Intelligence 2004. [ResearchGate]

Gated Recurrent Unit

  • Junyoung Chung, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. CoRR 2014. [ArXiv]

Image Captioning

  • Steven J. Rennie, et al. Self-critical Sequence Training for Image Captioning. CVPR 2017. [ArXiv]

Image Recognition

  • Alexey Dosovitskiy, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR 2021. [ICLR]

Image-to-Image Translation

  • Jun-Yan Zhu, et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV 2017. [ArXiv]

  • Yunjey Choi et al. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. CVPR 2018. [ArXiv]

  • Taesung Park, et al. Contrastive Learning for Unpaired Image-to-Image Translation. ECCV 2020. [ArXiv]

Language Modeling

  • Yoshua Bengio, et al. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 2003. [ACM DL]

  • Rafal Jozefowicz, et al. Exploring the Limits of Language Modeling. 2016. [ArXiv]

  • Matthew Peters, et al. Semi-supervised sequence tagging with bidirectional language models. ACL 2017. [ArXiv]

  • Matthew Peters, et al. Deep contextualized word representations. NAACL 2018. [ArXiv]

  • Jacob Devlin, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018. [ArXiv]

  • Jeremy Howard and Sebastian Ruder. Universal Language Model Fine-tuning for Text Classification. ACL 2018. [ArXiv]

  • Alec Radford, et al. Improving Language Understanding by Generative Pre-Training. 2018. [OpenAI]

  • Alec Radford, et al. Language Models are Unsupervised Multitask Learners. 2019. [OpenAI]

  • Zhenzhong Lan, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. ICLR 2020. [OpenReview]

  • Zihang Dai, et al. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. ACL 2019. [ArXiv]

  • Zhilin Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding. NIPS 2019. [ArXiv]

  • Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. 2019. [ArXiv]

  • Nikita Kitaev, et al. Reformer: The Efficient Transformer. ICLR 2020. [ArXiv]

  • Kevin Clark, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ICLR 2020. [ArXiv]

  • Tom B. Brown, et al. Language Models are Few-Shot Learners. 2020. [ArXiv]

  • Louis Martin, et al. CamemBERT: a Tasty French Language Model. ACL 2020. [ArXiv]

Machine Translation

  • Dzmitry Bahdanau, et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015. [ArXiv]

  • Minh-Thang Luong, et al. Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015. [ArXiv]

  • Denny Britz, et al. Massive Exploration of Neural Machine Translation Architectures. ACL 2017. [ArXiv]

  • Yun Chen, et al. A Teacher-Student Framework for Zero-Resource Neural Machine Translation. ACL 2017. [ArXiv]

  • Ashish Vaswani, et al. Attention Is All You Need. 2017. [ArXiv]

  • Guillaume Lample and Alexis Conneau. Cross-lingual Language Model Pretraining. 2019. [ArXiv]

  • Alexis Conneau et al. Unsupervised Cross-lingual Representation Learning at Scale. ACL 2020. [ArXiv]

  • Christos Baziotis et al. Language Model Prior for Low-Resource Neural Machine Translation. EMNLP 2020. [ArXiv]

Meta Learning

  • Chelsea Finn, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. [ArXiv]

  • Sachin Ravi and Hugo Larochelle. Optimization as a Model for Few-Shot Learning. ICLR 2017. [OpenReview]

  • Andrei A. Rusu, et al. Meta-Learning with Latent Embedding Optimization. ICLR 2019. [ArXiv]

  • Aravind Rajeswaran, et al. Meta-Learning with Implicit Gradients. NIPS 2019. [ArXiv]

Multi-Task Learning

  • Victor Sanh, et al. A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks. AAAI 2019. [ArXiv]

Named Entity Recognition

  • Guillaume Lample, et al. Neural Architectures for Named Entity Recognition. ACL 2016. [ArXiv]

  • Xuezhe Ma and Eduard Hovy. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. ACL 2016. [ArXiv]

  • Matthew Peters, et al. Semi-Supervised Sequence Tagging With Bidirectional Language Models. ACL 2017. [ArXiv]

  • Kevin Clark, et al. Semi-Supervised Sequence Modeling with Cross-View Training. EMNLP 2018. [ArXiv]

  • Matthew Peters, et al. Deep Contextualized Word Representations. NAACL 2018. [ArXiv]

  • Abbas Ghaddar and Philippe Langlais. Robust Lexical Features for Improved Neural Network Named-Entity Recognition. COLING 2018. [ACL Anthology]

  • Alan Akbik, et al. Contextual String Embeddings for Sequence Labeling. COLING 2018. [ResearchGate]

  • Alexei Baevski, et al. Cloze-driven Pretraining of Self-attention Networks. 2019. [ArXiv]

Probabilistic Graphical Models

  • John Lafferty, Andrew McCallum, and Fernando C.N. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML 2001. [ACM DL]

Reinforcement Learning

  • Kristopher De Asis, et al. Multi-Step Reinforcement Learning: A Unifying Algorithm. AAAI 2018. [ArXiv]

Sentence Compression

  • Thibault Fevry and Jason Phang. Unsupervised Sentence Compression using Denoising Auto-Encoders. CoNLL 2018. [ACL Anthology]

Sequence Models

  • Ilya Sutskever, et al. Sequence to Sequence Learning with Neural Networks. 2014. [ArXiv]

Text Classification

  • Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification. EMNLP 2014. [ArXiv]

  • Xiang Zhang, et al. Character-Level Convolutional Networks For Text Classification. NIPS 2015. [ArXiv]

  • Yoon Kim, et al. Character-Aware Neural Language Models. AAAI 2016. [ArXiv]

  • Zichao Yang, et al. Hierarchical Attention Networks for Document Classification. NAACL 2016. [ACL Anthology]

  • Alon Jacovi, et al. Understanding Convolutional Neural Networks for Text Classification. EMNLP 2018. [ACL Anthology]

Text Generation

  • Lantao Yu, et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. AAAI 2017. [ArXiv]

  • William Fedus, et al. MaskGAN: Better Text Generation via Filling in the______. ICLR 2018. [ArXiv]

  • Weili Nie, et al. RelGAN: Relational Generative Adversarial Networks for Text Generation. ICLR 2019. [ICLR]

  • Kaitao Song, et al. MASS: Masked Sequence to Sequence Pre-Training for Language Generation. ICML 2019. [ArXiv]

  • Mike Lewis, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. ACL 2020. [ArXiv]

Text Style Transfer

  • Zichao Yang, et al. Unsupervised Text Style Transfer using Language Models as Discriminators. NIPS 2018. [ArXiv]

  • Sandeep Subramanian, et al. Multiple-Attribute Text Style Transfer. ICLR 2019. [ArXiv]

Text Summarization

  • Romain Paulus, et al. A Deep Reinforced Model for Abstractive Summarization. ICLR 2018. [ArXiv]

  • Angela Fan, et al. Controllable Abstractive Summarization. ACL 2018. [ArXiv]

  • Yaushian Wang and Hung-Yi Lee. Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. EMNLP 2018. [ACL Anthology]

  • Peter J. Liu, et al. SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders. 2019. [ArXiv]

  • Christos Baziotis, et al. SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression. NAACL 2019. [ArXiv]

  • Jingqing Zhang, et al. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. ICML 2020. [ArXiv]
