dykang / neurallanguage-notes

summaries and notes on neural language learning papers (deprecated)

Neural language notes

Simple notes on papers about neural language learning from arXiv, ACL, EMNLP, and NAACL, plus some machine/deep learning papers from ICLR, ICML, and NIPS. These notes are inspired by Denny Britz's notes. There is also a dataset for neural language research [link].

Conference Papers & Groups

2016-11

  • LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING [arxiv]
  • NEWSQA: A MACHINE COMPREHENSION DATASET [arxiv]
  • Context-aware Natural Language Generation with Recurrent Neural Networks [arxiv]
  • LEARNING FEATURES OF MUSIC FROM SCRATCH [arxiv data]
  • Grammar Argumented LSTM Neural Networks with Note-Level Encoding for Music Composition [arxiv]
  • PIXELVAE: A LATENT VARIABLE MODEL FOR NATURAL IMAGES [arxiv]
  • VARIATIONAL LOSSY AUTOENCODER [arxiv]
  • Generative Deep Neural Networks for Dialogue: A Short Review [arxiv]
  • Variational Graph Auto-Encoders [arxiv]
  • MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES [arxiv]
  • Neural Machine Translation with Reconstruction [arxiv]
  • TOPICRNN: A RECURRENT NEURAL NETWORK WITH LONG-RANGE SEMANTIC DEPENDENCY [arxiv]
  • REFERENCE-AWARE LANGUAGE MODELS [arxiv]
  • A JOINT MANY-TASK MODEL: GROWING A NEURAL NETWORK FOR MULTIPLE NLP TASKS [arxiv]
  • Ordinal Common-sense Inference [arxiv]
  • Dual Learning for Machine Translation [arxiv]
  • Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision [arxiv]
  • What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams [coling]

2016-10

  • Cross-Modal Scene Networks [arxiv]
  • IMPROVING SAMPLING FROM GENERATIVE AUTOENCODERS WITH MARKOV CHAINS [arxiv](https://arxiv.org/pdf/1610.09296v2.pdf)
  • Towards a continuous modeling of natural language domains [arxiv]
  • Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification [arxiv]
  • Socratic Learning [arxiv]
  • Professor Forcing: A New Algorithm for Training Recurrent Networks [arxiv]
  • A Paradigm for Situated and Goal-Driven Language Learning [arxiv]
  • A Theme-Rewriting Approach for Generating Algebra Word Problems [arxiv]
  • Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016 [arxiv]
  • Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation [arxiv]
  • Cross-Sentence Inference for Process Knowledge [emnlp]
  • Learning to Translate in Real-time with Neural Machine Translation [arxiv]
  • Recurrent Neural Network Grammars [arxiv]
  • Connecting Generative Adversarial Networks and Actor-Critic Methods [arxiv]
  • Semantic Parsing with Semi-Supervised Sequential Autoencoders [arxiv]
  • Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding [arxiv]
  • A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs [arxiv]

2016-09

  • SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [arxiv]
  • ReasoNet: Learning to Stop Reading in Machine Comprehension [arxiv]
  • SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
  • NEURAL PHOTO EDITING WITH INTROSPECTIVE ADVERSARIAL NETWORKS [arxiv]
  • Language as a Latent Variable: Discrete Generative Models for Sentence Compression [arxiv]
  • Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems [arxiv]
  • Unsupervised Neural Hidden Markov Models [arxiv]
  • Creating Causal Embeddings for Question Answering with Minimal Supervision [arxiv]
  • Generating Videos with Scene Dynamics [arxiv]
  • On the Similarities Between Native, Non-native and Translated Texts [arxiv]
  • Energy-based Generative Adversarial Network [arxiv]
  • Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks [arxiv]
  • Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [arxiv]
  • WAVENET: A GENERATIVE MODEL FOR RAW AUDIO [arxiv]
  • Multimodal Attention for Neural Machine Translation [arxiv]
  • Neural Machine Translation with Supervised Attention [arxiv]
  • Formalizing Neurath's Ship: Approximate Algorithms for Online Causal Learning [arxiv]
  • Factored Neural Machine Translation [arxiv]
  • Discrete Variational Autoencoders [arxiv]
  • Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads [arxiv]
  • lamtram: A toolkit for language and translation modeling using neural networks [github]
  • C++ neural network library [github]
  • End-to-End Reinforcement Learning of Dialogue Agents for Information Access [arxiv]
  • Citation Classification for Behavioral Analysis of a Scientific Field [arxiv]
  • Reward Augmented Maximum Likelihood for Neural Structured Prediction [arxiv]
  • All Fingers are not Equal: Intensity of References in Scientific Articles [emnlp]
  • Hierarchical Multiscale Recurrent Neural Networks [arxiv]

2016-08

  • A Context-aware Natural Language Generator for Dialogue Systems [sigdial]
  • Investigation Into The Effectiveness Of Long Short Term Memory Networks For Stock Price Prediction [arxiv]
  • HIERARCHICAL ATTENTION MODEL FOR IMPROVED MACHINE COMPREHENSION OF SPOKEN CONTENT [arxiv]
  • Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks [arxiv code]
  • Progressive Neural Networks [arxiv]
  • Neural Variational Inference for Text Processing [arxiv]
  • Generative Adversarial Text to Image Synthesis [arxiv code]
  • Sequential Neural Models with Stochastic Layers [nips]
  • Deep Learning without Poor Local Minima [nips]
  • Actor-critic versus direct policy search: a comparison based on sample complexity [arxiv]
  • Policy Networks with Two-Stage Training for Dialogue Systems [arxiv]
  • Pointing the Unknown Words [acl]
  • An Incremental Parser for Abstract Meaning Representation [arxiv]
  • Topic Sensitive Neural Headline Generation [arxiv]
  • Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine [interspeech]
  • Face2Face: Real-time Face Capture and Reenactment of RGB Videos [demo]
  • Image-Space Modal Bases for Plausible Manipulation of Objects in Video [demo]
  • Decoupled neural interfaces using synthetic gradients [arxiv blog]
  • Full Resolution Image Compression with Recurrent Neural Networks [arxiv]
  • Who did What: A Large-Scale Person-Centered Cloze Dataset [arxiv data]
  • Pixel Recurrent Neural Networks [arxiv]
  • Mollifying Networks [arxiv]
  • Variational Information Maximizing Exploration [arxiv]
  • Does Multimodality Help Human and Machine for Translation and Image Captioning [arxiv]
  • Learning values across many orders of magnitude [arxiv]
  • Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [arxiv]
  • Architectural Complexity Measures of Recurrent Neural Network [arxiv]
  • Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge [arxiv]
  • Canonical Correlation Inference for Mapping Abstract Scenes to Text [arxiv]
  • Temporal Attention Model for Neural Machine Translation [arxiv]
  • Bi-directional Attention with Agreement for Dependency Parsing [arxiv]
  • Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change [arxiv]
  • Recurrent Highway Networks [arxiv]
  • Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond [arxiv]
  • WIKIREADING: A Novel Large-scale Language Understanding Task over Wikipedia [arxiv]
  • Larger-Context Language Modelling with Recurrent Neural Network [acl16]
  • Learning Online Alignments with Continuous Rewards Policy Gradient [arxiv]
  • Issues in evaluating semantic spaces using word analogies [acl16]
  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks [arxiv]
  • Control of Memory, Active Perception, and Action in Minecraft [icml16]
  • Dueling Network Architectures for Deep Reinforcement Learning [arxiv]
  • Human-level control through deep reinforcement learning [nature]
  • Reinforcement Learning in Multi-Party Trading Dialog [arxiv]
  • Large-scale Simple Question Answering with Memory Network [arxiv]
  • On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems [arxiv]
  • A New Method to Visualize Deep Neural Networks [arxiv]
  • Dreaming of names with RBMs [blog]
  • Synthesizing Compound Words for Machine Translation [acl16]
  • Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [arxiv]
  • Learning to Transduce with Unbounded Memory [arxiv]
  • Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets [nips15]
  • LEARNING LONGER MEMORY IN RECURRENT NEURAL NETWORKS [iclr15]
  • Attention-based Multimodal Neural Machine Translation [acl16]
  • A Two-stage Approach for Extending Event Detection to New Types via Neural Networks [acl16]
  • Learning text representation using recurrent convolutional neural network with highway layers [arxiv]
  • Training Very Deep Networks [arxiv]
  • SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
  • Counter-fitting Word Vectors to Linguistic Constraints [naacl16]
  • NEURAL PROGRAMMER: INDUCING LATENT PROGRAMS WITH GRADIENT DESCENT [iclr16]
  • Supervised Attentions for Neural Machine Translation [arxiv]
  • A Neural Knowledge Language Model [arxiv]
  • Recurrent Models of Visual Attention [cvpr]
  • XGBoost: A Scalable Tree Boosting System [arxiv]
  • A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories [naacl]
  • Deep Learning Trends @ ICLR 2016 [blog]
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [arxiv]
  • VISUALIZING AND UNDERSTANDING RECURRENT NETWORKS [iclr]
  • Net2Net: ACCELERATING LEARNING VIA KNOWLEDGE TRANSFER [iclr]
  • A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
  • A Recurrent Latent Variable Model for Sequential Data [arxiv]
  • ORDER-EMBEDDINGS OF IMAGES AND LANGUAGE [iclr]
  • Neural Module Networks [arxiv]
  • Learning to Compose Neural Networks for Question Answering [acl]
  • Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units [arxiv]

2016-07

  • Constructing a Natural Language Inference Dataset using Generative Neural Networks [arxiv]
  • ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD [arxiv]
  • An Actor-Critic Algorithm for Sequence Prediction [arxiv]
  • Enriching Word Vectors with Subword Information [arxiv]
  • note: each word is represented as a bag of character n-grams in the skip-gram model (see the sketch after this list)
  • Neural Machine Translation with Recurrent Attention Modeling [arxiv]
  • The Role of Discourse Units in Near-Extractive Summarization [arxiv]
  • Bag of Tricks for Efficient Text Classification [arxiv]
  • Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks [arxiv]
  • Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [arxiv]
  • STransE: a novel embedding model of entities and relationships in knowledge bases [naacl16]
  • Layer Normalization [arxiv]
  • Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering [arxiv]
  • Imitation Learning with Recurrent Neural Networks [arxiv]
  • Neural Name Translation Improves Neural Machine Translation [arxiv]
  • Query-Regression Networks for Machine Comprehension [arxiv]
  • Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
  • Separating Answers from Queries for Neural Reading Comprehension [arxiv]
  • Recurrent Highway Networks [arxiv]
  • Charagram: Embedding Words and Sentences via Character n-grams [arxiv]
  • Syntax-based Attention Model for Natural Language Inference [arxiv]
  • Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge [arxiv]
  • Neural Sentence Ordering [arxiv code]
  • Distilling Word Embeddings: An Encoding Approach [arxiv]
  • Target-Side Context for Discriminative Models in Statistical Machine Translation [arxiv]
  • Domain Adaptation for Neural Networks by Parameter Augmentation [arxiv]
  • Towards Abstraction from Extraction: Multiple Timescale Gated Recurrent Unit for Summarization [arxiv]
  • Attention-over-Attention Neural Networks for Reading Comprehension [arxiv]
  • Neural Tree Indexers for Text Understanding [arxiv]
  • Generating Images Part by Part with Composite Generative Adversarial Networks [arxiv]
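
For the subword note under "Enriching Word Vectors with Subword Information" above, here is a minimal sketch of the bag-of-character-n-grams idea, assuming an illustrative embedding size and a small hash-bucket table standing in for the paper's learned parameters:

```python
# Minimal sketch of the fastText subword idea: a word's vector is the sum
# of vectors for its character n-grams. Dimensions, bucket count, and the
# random table are illustrative stand-ins for learned parameters.
import numpy as np

DIM = 100          # embedding dimension (illustrative)
BUCKETS = 2 ** 16  # hash buckets for n-gram vectors (fastText uses far more)
rng = np.random.default_rng(0)
ngram_table = rng.normal(scale=0.1, size=(BUCKETS, DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in boundary markers < and >."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """Sum the vectors of a word's character n-grams; unseen words still
    get a vector because they share n-grams with seen words."""
    # Python's built-in string hash is process-salted; fine for a sketch.
    idxs = [hash(g) % BUCKETS for g in char_ngrams(word)]
    return ngram_table[idxs].sum(axis=0)

print(char_ngrams("where", 3, 4))  # ['<wh', 'whe', 'her', 'ere', 're>', '<whe', ...]
print(word_vector("where").shape)  # (100,)
```

In the paper these n-gram vectors are trained with the skip-gram objective; the sketch only shows how the representation is assembled.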

2016-06

  • Neural Summarization by Extracting Sentences and Words [arxiv]
  • Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks [arxiv]
  • Sequence-Level Knowledge Distillation [arxiv]
  • Text Understanding with the Attention Sum Reader Network [arxiv]
  • Query-Regression Networks for Machine Comprehension [arxiv]
  • A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation [arxiv]
  • Smart Reply: Automated Response Suggestion for Email [arxiv]
  • Minimum Risk Training for Neural Machine Translation [arxiv]
  • Compression of Neural Machine Translation Models via Pruning [arxiv]
  • Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
  • Dialog state tracking, a machine reading approach using a memory-enhanced neural network [arxiv]
  • Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context [arxiv]
  • Learning Generative ConvNet with Continuous Latent Factors by Alternating Back-Propagation [arxiv]
  • Topic Augmented Neural Response Generation with a Joint Attention Mechanism [arxiv]
  • STransE: a novel embedding model of entities and relationships in knowledge bases [arxiv]
  • Functional Distributional Semantics [arxiv]
  • The LAMBADA dataset: Word prediction requiring a broad discourse context [arxiv]
  • DenseCap: Fully Convolutional Localization Networks for Dense Captioning [link]
  • Visualizing Dynamics: from t-SNE to SEMI-MDPs [arxiv]
  • Algorithmic Composition of Melodies with Deep Recurrent Neural Networks [arxiv]
  • InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [arxiv]
  • Deep Reinforcement Learning for Dialogue Generation [arxiv]
  • Key-Value Memory Networks for Directly Reading Documents [arxiv]
  • The Word Entropy of Natural Languages [arxiv]
  • Semantic Parsing to Probabilistic Programs for Situated Question Answering [arxiv]
  • Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language [arxiv]
  • Inferring Logical Forms From Denotations [arxiv]
  • Some notes from the NAACL'16 Deep Learning panel discussion
  • Jacob Eisenstein observed: "In NLP, the things we do well on are things where context doesn't matter."
  • Rationalizing Neural Predictions [arxiv]
  • DeepMath - Deep Sequence Models for Premise Selection [arxiv]
  • A Fast Unified Model for Parsing and Sentence Understanding [arxiv]
  • A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
  • Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [arxiv]
  • Sequence-to-Sequence Learning as Beam-Search Optimization [arxiv]
  • Tables as Semi-structured Knowledge for Question Answering [arxiv]
  • Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [arxiv]
  • Dynamic Memory Networks for Visual and Textual Question Answering [arxiv]
  • Generation from Abstract Meaning Representation using Tree Transducers [paper](http://homes.cs.washington.edu/~nasmith/papers/flanigan+dyer+smith+carbonell.naacl16.pdf)
  • Iterative Alternating Neural Attention for Machine Reading [arxiv]
  • Vector-based Models of Semantic Composition [arxiv]
  • Generating Natural Language Inference Chains [arxiv]
  • Learning to Compose Neural Networks for Question Answering [arxiv]
  • A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
  • Data Recombination for Neural Semantic Parsing [arxiv]
  • Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data [arxiv]
  • Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads [arxiv]
  • A Diversity-Promoting Objective Function for Neural Conversation Models [arxiv]
  • Neural Associative Memory for Dual-Sequence Modeling [arxiv]
  • Simple Question Answering by Attentive Convolutional Neural Network [arxiv]
  • Neural Network-Based Abstract Generation for Opinions and Arguments [arxiv]
  • A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task [arxiv]
  • Generating Natural Questions About an Image [arxiv]
  • Continuously Learning Neural Dialogue Management [arxiv]
  • A Persona-Based Neural Conversation Model [arxiv]
  • A Decomposable Attention Model for Natural Language Inference [arxiv]
  • note: attention matrices decompose the problem into subproblems that can be solved separately (see the sketch after this list)
  • Memory-enhanced Decoder for Neural Machine Translation [arxiv]
  • Incorporating Discrete Translation Lexicons into Neural Machine [arxiv]
  • Can neural machine translation do simultaneous translation? [arxiv]
  • Language to Logical Form with Neural Attention [arxiv]
  • Generalizing and Hybridizing Count-based and Neural Language Models [arxiv]
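
For the decomposable-attention note under "A Decomposable Attention Model for Natural Language Inference" above, a minimal numpy sketch of the attend step, with random embeddings standing in for learned ones and the paper's feed-forward networks omitted:

```python
# Sketch of the attend step from the decomposable attention model: an
# attention matrix softly aligns two sentences so that each
# (token, aligned subphrase) pair can be compared independently.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 64))  # premise: 5 tokens, 64-dim embeddings
b = rng.normal(size=(7, 64))  # hypothesis: 7 tokens

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

e = a @ b.T                       # attention matrix of pairwise scores, shape (5, 7)
beta = softmax(e, axis=1) @ b     # hypothesis subphrase aligned to each premise token
alpha = softmax(e, axis=0).T @ a  # premise subphrase aligned to each hypothesis token

# Each pair is now a separate subproblem, e.g. for a small feed-forward net.
pairs_a = np.concatenate([a, beta], axis=1)   # (5, 128)
pairs_b = np.concatenate([b, alpha], axis=1)  # (7, 128)
print(pairs_a.shape, pairs_b.shape)
```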

2016-05

  • Variational Neural Machine Translation [arxiv]
  • Deep Generative Models with Stick-Breaking Priors [arxiv]
  • A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
  • One-shot Learning with Memory-Augmented Neural Networks [arxiv]
  • Residual Networks are Exponential Ensembles of Relatively Shallow Networks [arxiv code]
  • Modelling Interaction of Sentence Pair with coupled-LSTMs [arxiv]
  • Functional Hashing for Compressing Neural Networks [arxiv]
  • Combining Recurrent and Convolutional Neural Networks for Relation Classification [arxiv]
  • Learning End-to-End Goal-Oriented Dialog [arxiv]
  • BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings [arxiv]
  • Encode, Review, and Decode: Reviewer Module for Caption Generation [arxiv]
  • Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey [arxiv]
  • A Convolutional Attention Network for Extreme Summarization of Source Code [arxiv]
  • Data Recombination for Neural Semantic Parsing
  • Inferring Logical Forms from Denotations
  • How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
  • Learning to Generate with Memory [arxiv]
  • Attention Correctness in Neural Image Captioning [arxiv]
  • Contextual LSTM (CLSTM) models for Large scale NLP tasks [arxiv]
  • Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations [arxiv]
  • Generative Adversarial Text to Image Synthesis [arxiv]
  • Query-Efficient Imitation Learning for End-to-End Autonomous Driving [arxiv]
  • Hierarchical Memory Networks [arxiv]
  • Recurrent Neural Network for Text Classification with Multi-Task Learning [arxiv]
  • Rationale-Augmented Convolutional Neural Networks for Text Classification [arxiv]
  • Joint Event Extraction via Recurrent Neural Networks [paper]
  • Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model [paper]
  • Natural Language Semantics and Computability [arxiv]
  • Natural Language Inference by Tree-Based Convolution and Heuristic Matching [arxiv]
  • Generating Sentences from a Continuous Space [arxiv]
  • Vocabulary Manipulation for Neural Machine Translation [arxiv]
  • Chained Predictions Using Convolutional Neural Networks [arxiv]
  • Modeling Rich Contexts for Sentiment Classification with LSTM [arxiv]
  • Incorporating Selectional Preferences in Multi-hop Relation Extraction [naacl16]
  • Word Ordering Without Syntax [arxiv]
  • Compositional Sentence Representation from Character within Large Context Text [arxiv]
  • Abstractive Sentence Summarization with Attentive Recurrent Neural Networks [arxiv]
  • Mixed Incremental Cross-Entropy REINFORCE (MIXER), ICLR 2016 [github]

2016-04

  • Towards Conceptual Compression [arxiv]
  • Teaching natural language to computers [arxiv]
  • Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
  • How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
  • Revisiting Semi-Supervised Learning with Graph Embeddings
  • Neural Summarization by Extracting Sentences and Words
  • Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
  • LSTM-BASED DEEP LEARNING MODELS FOR NON-FACTOID ANSWER SELECTION
  • Generating Visual Explanations
  • A Compositional Approach to Language Modeling [arxiv]
  • Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems [arxiv]
  • Building Machines That Learn and Think Like People [arxiv]
  • A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories [arxiv]
  • Revisiting Summarization Evaluation for Scientific Articles [arxiv]
  • Reasoning About Pragmatics with Neural Listeners and Speakers [arxiv]
  • Character-Level Question Answering with Attention [arxiv]
  • Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks [arxiv]
  • Recurrent Neural Network Grammars [arxiv]

2016-03

  • Neural Programmer: Inducing Latent Programs with Gradient Descent [arxiv]
  • Adversarial Autoencoders
  • Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition
  • Net2Net: Accelerating Learning via Knowledge Transfer
  • A Neural Conversational Model
  • Neural Language Correction with Character-Based Attention [arxiv]
  • Modeling Relational Information in Question-Answer Pairs with Convolutional Neural Networks [arxiv]
  • Building Machines That Learn and Think Like People [arxiv]
  • LARGER-CONTEXT LANGUAGE MODELLING WITH RECURRENT NEURAL NETWORK [arxiv]
  • A Diversity-Promoting Objective Function for Neural Conversation Model [arxiv]
  • Hierarchical Attention Networks for Document Classification [arxiv]
  • Visual Storytelling [arxiv]
  • Using Sentence-Level LSTM Language Models for Script Inference [arxiv]
  • ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs [arxiv]
  • Character-Level Question Answering with Attention [arxiv]
  • Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond [arxiv]
  • Sentence Compression by Deletion with LSTMs [link]
  • A Simple Way to Initialize Recurrent Networks of Rectified Linear Units [arxiv]
  • DenseCap: Fully Convolutional Localization Networks for Dense Captioning [arxiv]
  • Nonextensive information theoretical machine [arxiv]
  • What we write about when we write about causality: Features of causal statements across large-scale social discourse [arxiv]
  • Question Answering via Integer Programming over Semi-Structured Knowledge [arxiv]
  • Dialog-based Language Learning [arxiv]
  • Bridging LSTM Architecture and the Neural Dynamics during Reading [arxiv]
  • Neural Generative Question Answering [arxiv]
  • Recurrent Memory Networks for Language Modeling [arxiv]
  • Colorful Image Colorization [paper] [code] [note]

TODO

  • votes for papers (e.g., 👍)
  • automatic crawler for citation and search counts (e.g., cite+51, tweets+42, search+523) like this (see the sketch below)
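
A hypothetical sketch of that crawler, using the public Semantic Scholar Graph API; the endpoint, parameters, and response fields here are assumptions about that API, and error handling and rate limiting are omitted:

```python
# Hypothetical citation-count lookup for the TODO above. The Semantic
# Scholar endpoint and fields are assumptions, not part of this repo.
import requests

API = "https://api.semanticscholar.org/graph/v1/paper/search"  # assumed endpoint

def citation_count(title):
    """Search a paper by title and return its citation count, or None."""
    resp = requests.get(API, params={
        "query": title,
        "fields": "title,citationCount",
        "limit": 1,
    })
    resp.raise_for_status()
    hits = resp.json().get("data", [])
    return hits[0]["citationCount"] if hits else None

print(citation_count("Bag of Tricks for Efficient Text Classification"))
```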
