xwzhong / papernote

Paper notes, including personal comments, introductions, code, etc.

recommendation:

  • 2018.07 Deep Content-User Embedding Model for Music Recommendation [arxiv] [note]
  • 2017.08 Deep & Cross Network for Ad Click Predictions [arxiv] [note]
  • 2017.03 DeepFM: A Factorization-Machine based Neural Network for CTR Prediction [arxiv] [note]
  • 2016.09 Deep Neural Networks for YouTube Recommendations [research] [note]
  • 2016.06 Wide & Deep Learning for Recommender Systems [arxiv] [note]
  • 2010.12 Factorization Machines [ieee] [note] (model equation sketched after this list)
  • 1998.08 Implicit Feedback for Recommender Systems [aaai] [note]
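
For quick reference on the Factorization Machines entry above, the second-order FM model is usually written as a global bias, linear weights, and factorized pairwise interactions, and the pairwise term can be rearranged so that scoring costs O(kn) instead of O(kn²):

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle\, x_i x_j,
\qquad
\sum_{i<j} \langle \mathbf{v}_i, \mathbf{v}_j \rangle\, x_i x_j
  = \tfrac{1}{2} \sum_{f=1}^{k} \Big[ \Big(\sum_{i} v_{i,f}\, x_i\Big)^{2} - \sum_{i} v_{i,f}^{2}\, x_i^{2} \Big]
```

The same factorized interaction term is the FM component reused by DeepFM above.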

QA: retrieval-based (leaderboard):

  • 2018.12 The Design and Implementation of XiaoIce, an Empathetic Social Chatbot [arxiv] [note]
  • 2018.06 Modeling Multi-turn Conversation with Deep Utterance Aggregation [arxiv] [note]
  • 2017.11 A Survey on Dialogue Systems: Recent Advances and New Frontiers [arxiv] [note]
  • 2017.05 IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models [arxiv] [note]
  • 2017.02 Bilateral Multi-Perspective Matching for Natural Language Sentences [arxiv] [note]
  • 2016.12 A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots [arxiv] [note]
  • 2016.11 A Compare-Aggregate Model for Matching Text Sequences [arxiv] [note]
  • 2016.10 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks [semanticscholar] [note]
  • 2016.02 Attentive Pooling Networks [arxiv] [note]
  • 2015.11 LSTM-based Deep Learning Models For Non-factoid Answer Selection [arxiv] [note]

chatbot: generation-based:

  • 2018.04 Chat More: Deepening and Widening the Chatting Topic via A Deep Model [paper] [note]
  • 2018.01 From Eliza to XiaoIce: Challenges and Opportunities with Social Chatbots [arxiv] [translation]
  • 2017.11 Neural Response Generation with Dynamic Vocabularies [arxiv] [note]
  • 2017.11 MOJITALK: Generating Emotional Responses [arxiv] [note]
  • 2017.07 AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine [aclweb] [note]
  • 2017.04 Emotional Conversation Generation with Internal and External Memory [arxiv] [note]
  • 2017.03 Learning Discourse-level Diversity for Neural Dialog Models using CVAE [arxiv] [note]
  • 2017.02 A Knowledge-Grounded Neural Conversation Model [arxiv] [note]
  • 2017.01 Generating Long and Diverse Responses with Neural Conversation Models [arxiv] [note]
  • 2016.07 Sequence to Backward and Forward Sequence [arxiv] [note]
  • 2016.06 Topic Aware Neural Response Generation [arxiv] [note]
  • 2016.06 Deep Reinforcement Learning for Dialogue Generation [arxiv] [note]
  • 2015.03 Neural Responding Machine for Short-Text Conversation [arxiv] [note]

text generation:

  • 2018.06 Topic-to-Essay Generation with Neural Networks [paper] [note]
  • 2016.10 Chinese Poetry Generation with Planning based Neural Network [arxiv] [note]
  • 2016.03 Incorporating Copying Mechanism in Sequence-to-Sequence Learning [arxiv] [note]

text classification:

  • 2019.05 How to Fine-Tune BERT for Text Classification? [arxiv] [note]
  • 2018.06 SGM: Sequence Generation Model for Multi-Label Classification [arxiv] [note]
  • 2018.04 ETH-DS3Lab at SemEval-2018 Task 7: ... Relation Classification and Extraction [arxiv] [note]
  • 2017.08 Using millions of emoji occurrences to learn any-domain representations for ... [aclweb] [note]
  • 2016.xx Attention-based LSTM for Aspect-level Sentiment Classification [aclweb] [note]
  • 2016.07 Bag of Tricks for Efficient Text Classification (fastText) [arxiv] [note] (usage sketch after this list)
  • 2016.06 Hierarchical Attention Networks for Document Classification [aclweb] [note]
  • 2016.03 Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks [arxiv] [note]
  • 2015.07 Classifying Relations by Ranking with Convolutional Neural Networks [aclweb] [note]
  • 2014.08 Convolutional Neural Networks for Sentence Classification [aclweb] [note]
  • 2012.07 Baselines and Bigrams: Simple, Good Sentiment and Topic Classification [aclweb] [note]
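
As flagged on the fastText entry above, a minimal sketch of the "bag of tricks" supervised classifier using the official fasttext Python package; the file names and hyperparameter values are placeholders for illustration, not taken from the notes:

```python
import fasttext

# train.txt / valid.txt are hypothetical files in fastText's supervised format,
# one example per line, e.g. "__label__positive this movie was great"
model = fasttext.train_supervised(
    input="train.txt",
    lr=0.5,          # learning rate
    epoch=5,         # passes over the training data
    wordNgrams=2,    # word bigram features, one of the paper's key "tricks"
)

print(model.test("valid.txt"))                 # (n_examples, precision@1, recall@1)
print(model.predict("this movie was great"))   # predicted label(s) and probabilities
```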

word embedding:

  • 2018.12 On the Dimensionality of Word Embedding [arxiv] [note]
  • 2018.09 Uncovering divergent linguistic information in word embeddings ... [arxiv] [note]
  • 2018.02 Deep contextualized word representations (ELMo) [arxiv] [note]
  • 2017.12 Advances in Pre-Training Distributed Word Representations [arxiv] [note]
  • 2017.07 A Simple Approach to Learn Polysemous Word Embeddings [arxiv] [note]
  • 2017.07 Mimicking Word Embeddings using Subword RNNs [arxiv] [note]
  • 2016.07 Enriching Word Vectors with Subword Information [arxiv] [note]
  • 2013.01 Linguistic Regularities in Continuous Space Word Representations [aclweb] [note] (analogy sketch below)
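
A small sketch of the vector-offset analogy test from the Linguistic Regularities entry above; the embedding matrix `emb` and the `word2id` lookup are assumed to be loaded from some pretrained word2vec/GloVe file and are not part of this repo:

```python
import numpy as np

def nearest_id(query, emb, exclude):
    # cosine similarity of the query against every row of the embedding matrix
    sims = emb @ query / (np.linalg.norm(emb, axis=1) * np.linalg.norm(query))
    for idx in np.argsort(-sims):
        if idx not in exclude:
            return int(idx)

def analogy(a, b, c, emb, word2id):
    """Return d such that a : b ~ c : d, e.g. man : king ~ woman : queen."""
    v = emb[word2id[b]] - emb[word2id[a]] + emb[word2id[c]]
    idx = nearest_id(v, emb, {word2id[a], word2id[b], word2id[c]})
    id2word = {i: w for w, i in word2id.items()}
    return id2word[idx]
```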

sentence embedding:

  • 2018.09 Semi-Supervised Sequence Modeling with Cross-View Training [arxiv] [note]
  • 2018.05 Baseline Needs More Love: On Simple Word-Embedding-Based Models and ... [arxiv] [note]
  • 2018.04 Learning Semantic Textual Similarity from Conversations [arxiv] [note]
  • 2018.03 An Efficient Framework for Learning Sentence Representations [arxiv] [note]
  • 2017.05 Supervised Learning of Universal Sentence Representations from NLI Data [arxiv] [note]
  • 2016.11 A Simple But Tough to Beat Baseline for Sentence Embeddings [openreview] [note] (SIF sketch after this list)
  • 2016.05 Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention [arxiv] [note]
  • 2016.02 Learning Distributed Representations of Sentences from Unlabelled Data [arxiv] [note]
  • 2015.12 Learning Semantic Similarity for Very Short Texts [arxiv] [note]
  • 2015.11 Order-Embeddings of Images and Language [arxiv] [note]
  • 2014.05 Distributed Representations of Sentences and Documents [arxiv] [note]
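
As flagged on the "Simple But Tough to Beat Baseline" entry above, a rough numpy sketch of its SIF scheme: weight each word vector by a/(a + p(w)), average, then remove the projection onto the first singular vector. `emb`, `word2id`, and the unigram probabilities `p_w` are assumed inputs, not provided here:

```python
import numpy as np

def sif_embeddings(sentences, emb, word2id, p_w, a=1e-3):
    """sentences: list of token lists; emb: (V, d) word vectors; p_w: word -> unigram prob."""
    vecs = []
    for sent in sentences:
        tokens = [w for w in sent if w in word2id]
        if not tokens:
            vecs.append(np.zeros(emb.shape[1]))
            continue
        weights = np.array([a / (a + p_w[w]) for w in tokens])   # rarer words weigh more
        rows = emb[[word2id[w] for w in tokens]]
        vecs.append((weights[:, None] * rows).mean(axis=0))
    X = np.vstack(vecs)
    u = np.linalg.svd(X, full_matrices=False)[2][0]   # first right-singular vector
    return X - np.outer(X @ u, u)                     # common-component removal
```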

user embedding:

  • 2017.05 Quantifying Mental Health from Social Media with Neural User Embeddings [arxiv] [note]

regularization & normalization:

  • 2018.08 Dropout is a special case of the stochastic delta rule: faster and more accurate deep learning [arxiv] [note]
  • 2018.05 How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) [arxiv] [note]
  • 2017.02 Batch Renormalization [arxiv] [note]
  • 2016.07 Layer Normalization [arxiv] [note]
  • 2016.05 Adversarial Training Methods for Semi-Supervised Text Classification [arxiv] [note]
  • 2016.03 Recurrent Batch Normalization [arxiv] [note]
  • 2016.02 Weight Normalization [arxiv] [note]
  • 2015.10 Batch Normalized Recurrent Neural Networks [arxiv] [note]
  • 2015.07 Distributional Smoothing with Virtual Adversarial Training [arxiv] [note]
  • 2015.02 Batch Normalization [arxiv] [note] (normalization equations below)
  • 2014.12 Explaining and Harnessing Adversarial Examples [arxiv] [note]
  • 2013.06 Regularization of Neural Networks using DropConnect [paper] [note]
  • 2009.06 Curriculum Learning [collobert] [note]
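
For quick reference on the several batch-normalization entries above, BN standardizes each activation over a mini-batch of size m and then rescales it with learned parameters γ and β:

```latex
\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2, \qquad
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \qquad
y_i = \gamma\,\hat{x}_i + \beta
```

The Layer Normalization entry applies the same standardization across a layer's units for each example, rather than across the batch.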

neural network & language model:

  • 2019.01 Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks [arxiv] [note]
  • 2018.03 Targeted Dropout [openreview] [note]
  • 2017.11 Attentive Language Models [aclweb] [note]
  • 2017.04 Contextual Bidirectional Long Short-Term Memory Recurrent Neural Network Language Models [aclweb] [note]
  • 2017.04 Learning to Generate Reviews and Discovering Sentiment [arxiv] [note]
  • 2017.04 Exploring Sparsity in Recurrent Neural Networks [arxiv] [note]
  • 2017.02 Deep Nets Don't Learn Via Memorization [openreview] [note]
  • 2017.01 Dialog Context Language Modeling with Recurrent Neural Networks [arxiv] [note]
  • 2016.11 Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling [arxiv] [note]
  • 2016.11 Understanding Deep Learning Requires Rethinking Generalization [arxiv] [note]
  • 2016.09 An overview of gradient descent optimization algorithms [arxiv] [note]
  • 2016.09 Pointer Sentinel Mixture Models [arxiv] [note]
  • 2016.08 Using the Output Embedding to Improve Language Models [arxiv] [note]
  • 2016.03 Recurrent Dropout without Memory Loss [arxiv] [note]
  • 2015.11 Adding Gradient Noise Improves Learning for Very Deep Networks [arxiv] [note]
  • 2015.11 Semi-supervised Sequence Learning [arxiv] [note]
  • 2015.06 Visualizing and Understanding Recurrent Networks [arxiv] [note]
  • 2015.xx Calculus on Computational Graphs: Backpropagation [github] [note]
  • 2014.12 Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling [arxiv] [note]
  • 2014.09 Recurrent Neural Network Regularization [arxiv] [note]
  • 2013.12 How to Construct Deep Recurrent Neural Networks [arxiv] [note]
  • 2010.xx Understanding the difficulty of training deep feedforward neural networks [imag] [note]
  • 2010.xx Stacked Denoising Autoencoders [paper] [note]
  • 2008.07 A Unified Architecture for Natural Language Processing [collobert] [note]

pretrained model & transformer:

  • 2019.09 ALBERT: A Lite BERT for Self-supervised Learning of Language Representations [arxiv] [note]
  • 2019.07 RoBERTa: A Robustly Optimized BERT Pretraining Approach [arxiv] [note]
  • 2019.04 ERNIE: Enhanced Representation through Knowledge Integration [arxiv] [note]
  • 2018.10 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [arxiv] [note]
  • 2018.06 Improving Language Understanding by Generative Pre-Training [amazonaws] [note]
  • 2018.03 Universal Sentence Encoder [arxiv] [note]
  • 2017.06 Attention is All You Need [arxiv] [note] (scaled dot-product attention equation below)
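
As flagged on the Attention is All You Need entry, the scaled dot-product attention at the core of the Transformer (and of the BERT-family models above) is:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```

Multi-head attention concatenates several such attention outputs computed from separately projected Q, K, V.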

seq2seq:

  • 2018.07 Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding [arxiv] [translation]
  • 2018.07 Fluency Boost Learning and Inference for Neural Grammatical Error Correction [aclweb] [note]
  • 2017.04 Get To The Point: Summarization with Pointer-Generator Networks [arxiv] [note]
  • 2017.04 Learning to Skim Text [arxiv] [note]
  • 2015.06 Pointer Networks [arxiv] [note]
  • 2015.06 Skip-Thought Vectors [arxiv] [note]
  • 2014.12 Grammar as a Foreign Language [arxiv] [note]
  • 2014.12 On Using Very Large Target Vocabulary for Neural Machine Translation [arxiv] [note]
  • 2014.09 Neural Machine Translation by Jointly Learning to Align and Translate [arxiv] [note]
  • 2014.09 Sequence to Sequence Learning with Neural Networks [arxiv] [note]

multi-task learning:

  • 2019.01 Multi-Task Deep Neural Networks for Natural Language Understanding [arxiv] [note]
  • 2018.08 Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts [acm] [note]
  • 2016.12 Overcoming catastrophic forgetting in neural networks [arxiv] [note]

sequence labeling:

  • 2018.05 Chinese NER Using Lattice LSTM [arxiv] [note]
  • 2018.03 Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss [arxiv] [note]
  • 2017.04 Semi-supervised Multitask Learning for Sequence Labeling [arxiv] [note]
  • 2016.03 Neural Architectures for Named Entity Recognition [arxiv] [note]
  • 2016.xx Neural Architectures for Fine-grained Entity Type Classification [aclweb] [note]

contrastive learning:

  • 2020.02 A Simple Framework for Contrastive Learning of Visual Representations [arxiv] [note]

others:

  • 2017.06 A simple neural network module for relational reasoning [arxiv] [note]
  • 2016.11 Word or Characters, Fine-grained Gating For Reading Comprehension [arxiv] [note]
  • 2016.08 Neural Machine Translation of Rare Words with Subword Units (BPE) [aclweb] [note] (merge-loop sketch after this list)
  • 2005.08 Personalizing Search via Automated Analysis of Interests and Activities [microsoft] [note]
