Neural language notes
Simple notes on papers about neural language learning from arXiv, ACL, EMNLP, and NAACL, plus some machine/deep learning from ICLR, ICML, and NIPS. These notes are inspired by Denny Britz's notes. There is also a dataset list for neural language research [link].
Conference Papers & Groups
- [NAACL16], [ACL16], [EMNLP16], [NIPS16], [ICML16], [ICLR16] [DLSC16, note]
- [DeepMind], [GoogleBrain], [FAIR] [AI2], [MSR]
- [CMU] [Stanford] [Berkeley] [Montreal] [UW]
2016-11
- LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING [arxiv]
- NEWSQA: A MACHINE COMPREHENSION DATASET [arxiv]
- Context-aware Natural Language Generation with Recurrent Neural Networks [arxiv]
- LEARNING FEATURES OF MUSIC FROM SCRATCH [arxiv data]
- Grammar Argumented LSTM Neural Networks with Note-Level Encoding for Music Composition [arxiv]
- PIXELVAE: A LATENT VARIABLE MODEL FOR NATURAL IMAGES [arxiv]
- VARIATIONAL LOSSY AUTOENCODER [arxiv]
- Generative Deep Neural Networks for Dialogue: A Short Review [arxiv]
- Variational Graph Auto-Encoders [arxiv]
- MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES [arxiv]
- Neural Machine Translation with Reconstruction [arxiv]
- TOPICRNN: A RECURRENT NEURAL NETWORK WITH LONG-RANGE SEMANTIC DEPENDENCY [arxiv]
- REFERENCE-AWARE LANGUAGE MODELS [arxiv]
- A JOINT MANY-TASK MODEL: GROWING A NEURAL NETWORK FOR MULTIPLE NLP TASKS [arxiv]
- Ordinal Common-sense Inference [arxiv]
- Dual Learning for Machine Translation [arxiv]
- Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision [arxiv]
- What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams [coling]
2016-10
- Cross-Modal Scene Networks [arxiv]
- IMPROVING SAMPLING FROM GENERATIVE AUTOENCODERS WITH MARKOV CHAINS [arxiv](https://arxiv.org/pdf/1610.09296v2.pdf)
- Towards a continuous modeling of natural language domains [arxiv]
- Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification [arxiv]
- Socratic Learning [arxiv]
- Professor Forcing: A New Algorithm for Training Recurrent Networks [arxiv]
- A Paradigm for Situated and Goal-Driven Language Learning [arxiv]
- A Theme-Rewriting Approach for Generating Algebra Word Problems [arxiv]
- Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016 [arxiv]
- Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation [arxiv]
- Cross-Sentence Inference for Process Knowledge [emnlp]
- Learning to Translate in Real-time with Neural Machine Translation [arxiv]
- Recurrent Neural Network Grammars [arxiv]
- Connecting Generative Adversarial Networks and Actor-Critic Methods [arxiv]
- Semantic Parsing with Semi-Supervised Sequential Autoencoders [arxiv]
- Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding [arxiv]
- A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs [arxiv]
2016-09
- SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [arxiv]
- ReasoNet: Learning to Stop Reading in Machine Comprehension [arxiv]
- SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
- NEURAL PHOTO EDITING WITH INTROSPECTIVE ADVERSARIAL NETWORKS [arxiv]
- Language as a Latent Variable: Discrete Generative Models for Sentence Compression [arxiv]
- Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems [arxiv]
- Unsupervised Neural Hidden Markov Models [arxiv]
- Creating Causal Embeddings for Question Answering with Minimal Supervision [arxiv]
- Generating Videos with Scene Dynamics [arxiv]
- On the Similarities Between Native, Non-native and Translated Texts [arxiv]
- Energy-based Generative Adversarial Network [arxiv]
- Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks [arxiv]
- Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [arxiv]
- WAVENET: A GENERATIVE MODEL FOR RAW AUDIO [arxiv]
- Multimodal Attention for Neural Machine Translation [arxiv]
- Neural Machine Translation with Supervised Attention [arxiv]
- Formalizing Neurath's Ship: Approximate Algorithms for Online Causal Learning [arxiv]
- Factored Neural Machine Translation [arxiv]
- Discrete Variational Autoencoders [arxiv]
- Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads [arxiv]
- lamtram: A toolkit for language and translation modeling using neural networks [github]
- C++ neural network library [github]
- End-to-End Reinforcement Learning of Dialogue Agents for Information Access [arxiv]
- Citation Classification for Behavioral Analysis of a Scientific Field [arxiv]
- Reward Augmented Maximum Likelihood for Neural Structured Prediction [arxiv]
- All Fingers are not Equal: Intensity of References in Scientific Articles [emnlp]
- WAVENET: A GENERATIVE MODEL FOR RAW AUDIO [paper blog]
- Hierarchical Multiscale Recurrent Neural Networks [arxiv]
2016-08
- A Context-aware Natural Language Generator for Dialogue Systems [sigdial]
- Investigation Into The Effectiveness Of Long Short Term Memory Networks For Stock Price Prediction [arxiv]
- HIERARCHICAL ATTENTION MODEL FOR IMPROVED MACHINE COMPREHENSION OF SPOKEN CONTENT [arxiv]
- Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks [arxiv code]
- Progressive Neural Networks [arxiv]
- Neural Variational Inference for Text Processing [arxiv]
- Generative Adversarial Text to Image Synthesis [arxiv code]
- Sequential Neural Models with Stochastic Layers [nips]
- Deep Learning without Poor Local Minima [nips]
- Actor-critic versus direct policy search: a comparison based on sample complexity [arxiv]
- Policy Networks with Two-Stage Training for Dialogue Systems [arxiv]
- Pointing the Unknown Words [acl]
- An Incremental Parser for Abstract Meaning Representation [arxiv]
- Topic Sensitive Neural Headline Generation [arxiv]
- Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine [interspeech]
- Face2Face: Real-time Face Capture and Reenactment of RGB Videos [demo]
- Image-Space Modal Bases for Plausible Manipulation of Objects in Video [demo]
- Decoupled neural interfaces using synthetic gradients [arxiv blog]
- Full Resolution Image Compression with Recurrent Neural Networks [arxiv]
- Who did What: A Large-Scale Person-Centered Cloze Dataset [arxiv data]
- Pixel Recurrent Neural Networks [arxiv]
- Mollifying Networks [arxiv]
- Variational Information Maximizing Exploration [arxiv]
- Does Multimodality Help Human and Machine for Translation and Image Captioning [arxiv]
- Learning values across many orders of magnitude [arxiv]
- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [arxiv]
- Architectural Complexity Measures of Recurrent Neural Networks [arxiv]
- Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge [arxiv]
- Canonical Correlation Inference for Mapping Abstract Scenes to Text [arxiv]
- Temporal Attention Model for Neural Machine Translation [arxiv]
- Bi-directional Attention with Agreement for Dependency Parsing [arxiv]
- Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change [arxiv]
- Recurrent Highway Networks [arxiv]
- Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond [arxiv]
- WIKIREADING: A Novel Large-scale Language Understanding Task over Wikipedia [arxiv]
- Larger-Context Language Modelling with Recurrent Neural Network [acl16]
- Learning Online Alignments with Continuous Rewards Policy Gradient [arxiv]
- Issues in evaluating semantic spaces using word analogies [acl16]
- Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks [arxiv]
- Control of Memory, Active Perception, and Action in Minecraft [icml16]
- Dueling Network Architectures for Deep Reinforcement Learning [arxiv]
- Human-level control through deep reinforcement learning [nature]
- Reinforcement Learning in Multi-Party Trading Dialog [arxiv]
- Large-scale Simple Question Answering with Memory Network [arxiv]
- On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems [arxiv]
- A New Method to Visualize Deep Neural Networks [arxiv]
- Dreaming of names with RBMs [blog]
- Synthesizing Compound Words for Machine Translation [acl16]
- Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [arxiv]
- Learning to Transduce with Unbounded Memory [arxiv]
- Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets [nips15]
- LEARNING LONGER MEMORY IN RECURRENT NEURAL NETWORKS [iclr15]
- Attention-based Multimodal Neural Machine Translation [acl16]
- A Two-stage Approach for Extending Event Detection to New Types via Neural Networks [acl16]
- Learning text representation using recurrent convolutional neural network with highway layers [arxiv]
- Training Very Deep Networks [arxiv]
- SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
- Counter-fitting Word Vectors to Linguistic Constraints [naacl16]
- NEURAL PROGRAMMER: INDUCING LATENT PROGRAMS WITH GRADIENT DESCENT [iclr16]
- Supervised Attentions for Neural Machine Translation [arxiv]
- A Neural Knowledge Language Model [arxiv]
- Recurrent Models of Visual Attention [cvpr]
- XGBoost: A Scalable Tree Boosting System [arxiv]
- A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories [naacl]
- Deep Learning Trends @ ICLR 2016 [blog]
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [arxiv]
- VISUALIZING AND UNDERSTANDING RECURRENT NETWORKS [iclr]
- Net2Net: ACCELERATING LEARNING VIA KNOWLEDGE TRANSFER [iclr]
- A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
- A Recurrent Latent Variable Model for Sequential Data [arxiv]
- ORDER-EMBEDDINGS OF IMAGES AND LANGUAGE [iclr]
- Neural Module Networks [arxiv]
- Learning to Compose Neural Networks for Question Answering [acl]
- Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units [arxiv]
2016-07
- Constructing a Natural Language Inference Dataset using Generative Neural Networks [arxiv]
- ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD [arxiv]
- An Actor-Critic Algorithm for Sequence Prediction [arxiv]
- Enriching Word Vectors with Subword Information [arxiv]
- each word is represented as a bag of character n-grams in skip-gram
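  - The sub-bullet above can be sketched in code. Below is a minimal, hypothetical Python helper (not the fastText implementation) showing how a word becomes a bag of character n-grams with `<` and `>` boundary markers, as described in the paper; the word vector is then the sum of the vectors of these n-grams, so rare and unseen words still get representations.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the bag of character n-grams for a word, plus the word
    itself, using '<' and '>' as boundary markers (fastText-style)."""
    token = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(token) - n + 1):
            grams.add(token[i:i + n])
    grams.add(token)  # the full word is also kept as its own feature
    return grams

# For n = 3 only, "where" yields: <wh, whe, her, ere, re>, <where>
print(sorted(char_ngrams("where", 3, 3)))
```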
- Neural Machine Translation with Recurrent Attention Modeling [arxiv]
- The Role of Discourse Units in Near-Extractive Summarization [arxiv]
- Bag of Tricks for Efficient Text Classification [arxiv]
- Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks [arxiv]
- Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [arxiv]
- STransE: a novel embedding model of entities and relationships in knowledge bases [naacl16]
- Layer Normalization [arxiv]
- Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering [arxiv]
- Imitation Learning with Recurrent Neural Networks [arxiv]
- Neural Name Translation Improves Neural Machine Translation [arxiv]
- Query-Regression Networks for Machine Comprehension [arxiv]
- Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
- Separating Answers from Queries for Neural Reading Comprehension [arxiv]
- Recurrent Highway Networks [arxiv]
- Charagram: Embedding Words and Sentences via Character n-grams [arxiv]
- Syntax-based Attention Model for Natural Language Inference [arxiv]
- Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge [arxiv]
- Neural Sentence Ordering [arxiv code]
- Distilling Word Embeddings: An Encoding Approach [arxiv]
- Target-Side Context for Discriminative Models in Statistical Machine Translation [arxiv]
- Domain Adaptation for Neural Networks by Parameter Augmentation [arxiv]
- Towards Abstraction from Extraction: Multiple Timescale Gated Recurrent Unit for Summarization [arxiv]
- Attention-over-Attention Neural Networks for Reading Comprehension [arxiv]
- Neural Tree Indexers for Text Understanding [arxiv]
- Generating Images Part by Part with Composite Generative Adversarial Networks [arxiv]
2016-06
- Neural Summarization by Extracting Sentences and Words [arxiv]
- Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks [arxiv]
- Sequence-Level Knowledge Distillation [arxiv]
- Text Understanding with the Attention Sum Reader Network [arxiv]
- Query-Regression Networks for Machine Comprehension [arxiv]
- A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation [arxiv]
- Smart Reply: Automated Response Suggestion for Email [arxiv]
- Minimum Risk Training for Neural Machine Translation [arxiv]
- Compression of Neural Machine Translation Models via Pruning [arxiv]
- Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
- Dialog state tracking, a machine reading approach using a memory-enhanced neural network [arxiv]
- Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context [arxiv]
- Learning Generative ConvNet with Continuous Latent Factors by Alternating Back-Propagation [arxiv]
- Topic Augmented Neural Response Generation with a Joint Attention Mechanism [arxiv]
- STransE: a novel embedding model of entities and relationships in knowledge bases [arxiv]
- Functional Distributional Semantics [arxiv]
- The LAMBADA dataset: Word prediction requiring a broad discourse context [arxiv]
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning [link]
- Visualizing Dynamics: from t-SNE to SEMI-MDPs [arxiv]
- Algorithmic Composition of Melodies with Deep Recurrent Neural Networks [arxiv]
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [arxiv]
- Deep Reinforcement Learning for Dialogue Generation [arxiv]
- Key-Value Memory Networks for Directly Reading Documents [arxiv]
- The Word Entropy of Natural Languages [arxiv]
- Semantic Parsing to Probabilistic Programs for Situated Question Answering [arxiv]
- Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language [arxiv]
- Inferring Logical Forms From Denotations [arxiv]
- some notes from NAACL'16 Deep Learning panel discussion
- Jacob Eisenstein made an observation, "In NLP, the things we do well on are things where context doesn't matter."
- Rationalizing Neural Predictions [arxiv]
- DeepMath - Deep Sequence Models for Premise Selection [arxiv]
- A Fast Unified Model for Parsing and Sentence Understanding [arxiv]
- A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
- Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [arxiv]
- Sequence-to-Sequence Learning as Beam-Search Optimization [arxiv]
- Tables as Semi-structured Knowledge for Question Answering [arxiv]
- Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [arxiv]
- Dynamic Memory Networks for Visual and Textual Question Answering [arxiv]
- Generation from Abstract Meaning Representation using Tree Transducers [naacl16](http://homes.cs.washington.edu/~nasmith/papers/flanigan+dyer+smith+carbonell.naacl16.pdf)
- Iterative Alternating Neural Attention for Machine Reading [arxiv]
- Vector-based Models of Semantic Composition [arxiv]
- Generating Natural Language Inference Chains [arxiv]
- Learning to Compose Neural Networks for Question Answering [arxiv]
- A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
- Data Recombination for Neural Semantic Parsing [arxiv]
- Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data [arxiv]
- Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads [arxiv]
- A Diversity-Promoting Objective Function for Neural Conversation Models [arxiv]
- Neural Associative Memory for Dual-Sequence Modeling [arxiv]
- Simple Question Answering by Attentive Convolutional Neural Network [arxiv]
- Neural Network-Based Abstract Generation for Opinions and Arguments [arxiv]
- A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task [arxiv]
- Generating Natural Questions About an Image [arxiv]
- Continuously Learning Neural Dialogue Management [arxiv]
- A Persona-Based Neural Conversation Model [arxiv]
- A Decomposable Attention Model for Natural Language Inference [arxiv]
  - uses attention matrices to decompose the problem into subproblems that can be solved separately
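  - The attend step in the sub-bullet above can be sketched as follows. This is a minimal NumPy sketch with made-up toy embeddings (the real model scores pairs through a learned feed-forward function F, omitted here): one pairwise attention matrix soft-aligns every premise word with a hypothesis phrase and vice versa, splitting the sentence-pair problem into per-word comparison subproblems.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical random word vectors: premise a (4 words), hypothesis b (5 words).
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))
b = rng.normal(size=(5, 8))

# One attention matrix of pairwise scores (dot products here, F(a_i)·F(b_j) in the paper).
scores = a @ b.T                        # shape (4, 5)
beta = softmax(scores, axis=1) @ b      # (4, 8): hypothesis phrase aligned to each premise word
alpha = softmax(scores, axis=0).T @ a   # (5, 8): premise phrase aligned to each hypothesis word
```

Each (word, aligned-phrase) pair is then compared independently, which is what makes the subproblems separable.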
- Memory-enhanced Decoder for Neural Machine Translation [arxiv]
- Incorporating Discrete Translation Lexicons into Neural Machine [arxiv]
- Can neural machine translation do simultaneous translation? [arxiv]
- Language to Logical Form with Neural Attention [arxiv]
- Generalizing and Hybridizing Count-based and Neural Language Models [arxiv]
2016-05
- Variational Neural Machine Translation [arxiv]
- Deep Generative Models with Stick-Breaking Priors [arxiv]
- A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
- One-shot Learning with Memory-Augmented Neural Networks [arxiv]
- Residual Networks are Exponential Ensembles of Relatively Shallow Networks [arxiv code]
- Modelling Interaction of Sentence Pair with coupled-LSTMs [arxiv]
- Functional Hashing for Compressing Neural Networks [arxiv]
- Combining Recurrent and Convolutional Neural Networks for Relation Classification [arxiv]
- Learning End-to-End Goal-Oriented Dialog [arxiv]
- BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings [arxiv]
- Encode, Review, and Decode: Reviewer Module for Caption Generation [arxiv]
- Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey [arxiv]
- A Convolutional Attention Network for Extreme Summarization of Source Code [arxiv]
- Data Recombination for Neural Semantic Parsing
- Inferring Logical Forms from Denotations
- How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
- Learning to Generate with Memory [arxiv]
- Attention Correctness in Neural Image Captioning [arxiv]
- Contextual LSTM (CLSTM) models for Large scale NLP tasks [arxiv]
- Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations [arxiv]
- Generative Adversarial Text to Image Synthesis [arxiv]
- Query-Efficient Imitation Learning for End-to-End Autonomous Driving [arxiv]
- Hierarchical Memory Networks [arxiv]
- Recurrent Neural Network for Text Classification with Multi-Task Learning [arxiv]
- Rationale-Augmented Convolutional Neural Networks for Text Classification [arxiv]
- Joint Event Extraction via Recurrent Neural Networks [paper]
- Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model [paper]
- Natural Language Semantics and Computability [arxiv]
- Natural Language Inference by Tree-Based Convolution and Heuristic Matching [arxiv]
- Generating Sentences from a Continuous Space [arxiv]
- Vocabulary Manipulation for Neural Machine Translation [arxiv]
- Chained Predictions Using Convolutional Neural Networks [arxiv]
- Modeling Rich Contexts for Sentiment Classification with LSTM [arxiv]
- Incorporating Selectional Preferences in Multi-hop Relation Extraction [naacl16]
- Word Ordering Without Syntax [arxiv]
- Compositional Sentence Representation from Character within Large Context Text [arxiv]
- Abstractive Sentence Summarization with Attentive Recurrent Neural Networks [arxiv]
- Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 [github]
2016-04
- Towards Conceptual Compression [arxiv]
- Teaching natural language to computers [arxiv]
- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
- How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
- Revisiting Semi-Supervised Learning with Graph Embeddings
- Neural Summarization by Extracting Sentences and Words
- Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
- LSTM-BASED DEEP LEARNING MODELS FOR NON-FACTOID ANSWER SELECTION
- Generating Visual Explanations
- A Compositional Approach to Language Modeling [arxiv]
- Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems [arxiv]
- Building Machines That Learn and Think Like People [arxiv]
- A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories [arxiv]
- Revisiting Summarization Evaluation for Scientific Articles [arxiv]
- Reasoning About Pragmatics with Neural Listeners and Speakers [arxiv]
- Character-Level Question Answering with Attention [arxiv]
- Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks [arxiv]
- Recurrent Neural Network Grammars [arxiv]
2016-03
- Neural Programmer: Inducing Latent Programs with Gradient Descent [arxiv]
- Adversarial Autoencoders
- Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition
- Net2Net: Accelerating Learning via Knowledge Transfer
- Neural Programmer: Inducing Latent Programs with Gradient Descent
- A Neural Conversational Model
- Neural Language Correction with Character-Based Attention [arxiv]
- Modeling Relational Information in Question-Answer Pairs with Convolutional Neural Networks [arxiv]
- Building Machines That Learn and Think Like People [arxiv]
- LARGER-CONTEXT LANGUAGE MODELLING WITH RECURRENT NEURAL NETWORK [arxiv]
- A Diversity-Promoting Objective Function for Neural Conversation Model [arxiv]
- Hierarchical Attention Networks for Document Classification [arxiv]
- Visual Storytelling [arxiv]
- Using Sentence-Level LSTM Language Models for Script Inference [arxiv]
- ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs [arxiv]
- Character-Level Question Answering with Attention [arxiv]
- Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond [arxiv]
- Sentence Compression by Deletion with LSTMs [link]
- A Simple Way to Initialize Recurrent Networks of Rectified Linear Units [arxiv]
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning [arxiv]
- Nonextensive information theoretical machine [arxiv]
- What we write about when we write about causality: Features of causal statements across large-scale social discourse [arxiv]
- Question Answering via Integer Programming over Semi-Structured Knowledge [arxiv]
- Dialog-based Language Learning [arxiv]
- Bridging LSTM Architecture and the Neural Dynamics during Reading [arxiv]
- Neural Generative Question Answering [arxiv]
- Recurrent Memory Networks for Language Modeling [arxiv]
- Colorful Image Colorization [paper] [code] [note]
TODO
- votes for papers (e.g., 👍)
- automatic crawler for citation and search counts (e.g., cite+51, tweets+42, search+523 ) like this