- Learning to Speak and Act in a Fantasy Text Adventure Game [arXiv] [notes]
- Improving Robustness of Machine Translation with Synthetic Noise [arXiv] [notes]
- Short-term meaning shift: an exploratory distributional analysis [arXiv] [notes]
- Deep contextualized word representations [arXiv] [notes]
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [arXiv] [notes]
- Dissecting Contextual Word Embeddings: Architecture and Representation [arXiv] [notes]
- Linguistic Knowledge and Transferability of Contextual Representations [arXiv] [notes]
- Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis [arXiv] [notes]
- To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks [arXiv] [notes]
- What do you learn from context? Probing for sentence structure in contextualized word representations [OpenReview] [notes]
- Learning from Dialogue after Deployment: Feed Yourself, Chatbot! [arXiv] [notes]
- Cross-lingual Language Model Pretraining [arXiv] [notes]