- Skip N-grams and Ranking Functions for Predicting Script Events
- A Structured Self-Attentive Sentence Embedding
- Attention Is All You Need
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Summaries and notes on papers I have read