magorshunov / transformers_course

Lesson 1: Introduction to attention and language models (1h)

  • 1.1 A brief history of NLP (15 min)
  • 1.2 Paying attention with attention (15 min)
  • 1.3 Encoder-decoder architectures (15 min)
  • 1.4 How language models look at text (15 min)

Lesson 2: How transformers use attention to process text (1h)

  • 2.1 Introduction to transformers (10 min)
  • 2.2 Scaled dot-product attention (30 min; sketched in code after this list)
  • 2.3 Multi-headed attention (20 min)
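
A minimal sketch of the scaled dot-product attention covered in 2.2, written in PyTorch with a single head and no learned projections; the function name and tensor shapes are illustrative, not the course's exact implementation.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k); mask: 0 where attention is not allowed
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # query-key similarities, scaled
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)   # each row sums to 1 over the keys
    return weights @ v                    # weighted sum of the value vectors

q = k = v = torch.randn(1, 5, 64)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 5, 64])
```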

Lesson 3: Transfer Learning (45m)

  • 3.1 Introduction to Transfer Learning (15 min)
  • 3.2 Introduction to PyTorch (15 min)
  • 3.3 Fine-tuning transformers with PyTorch (15 min)
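
A minimal fine-tuning sketch in the spirit of 3.3, assuming the Hugging Face transformers library; the checkpoint name, toy data, and hyperparameters are placeholders rather than the course's exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy batch: two sentences with binary sentiment labels
texts = ["great movie", "terrible movie"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
optimizer.zero_grad()
outputs = model(**batch, labels=labels)   # forward pass also computes the loss
outputs.loss.backward()                   # backpropagate through the whole network
optimizer.step()                          # one gradient update
```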

Lesson 4: Natural Language Understanding with BERT (1h)

  • 4.1 Introduction to BERT (15 min)
  • 4.2 Encoders need only apply: BERT’s architecture (15 min)
  • 4.3 WordPiece tokenization (15 min; see the tokenizer sketch after this list)
  • 4.4 The many embeddings of BERT (15 min)
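
A quick look at WordPiece tokenization (4.3), assuming the transformers library and the bert-base-uncased vocabulary; the sample sentence is arbitrary.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokens = tokenizer.tokenize("Transformers make tokenization painless")
print(tokens)                                   # rarer words split into '##'-prefixed subword pieces
print(tokenizer.convert_tokens_to_ids(tokens))  # the vocabulary ids BERT actually sees
```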

Lesson 5: Pre-training and fine-tuning BERT (45m)

  • 5.1 The Masked Language Modeling Task (15 min; see the fill-mask sketch after this list)
  • 5.2 The Next Sentence Prediction Task (15 min)
  • 5.3 Fine-tuning BERT to solve NLP tasks (15 min)
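
A small sketch of the masked language modeling objective from 5.1, using the Hugging Face fill-mask pipeline with a pre-trained BERT checkpoint (the model choice is illustrative).

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
# BERT predicts a distribution over the vocabulary for the [MASK] position
for pred in fill_mask("The capital of France is [MASK]."):
    print(f'{pred["token_str"]:>10}  {pred["score"]:.3f}')
```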

Lesson 6: Hands-on BERT (1h 15m)

  • 6.1 Flavors of BERT (15 min)
  • 6.2 BERT for sequence classification (20 min)
  • 6.3 BERT for token classification (20 min; see the pipeline sketch after this list)
  • 6.4 BERT for question answering (20 min)
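
Off-the-shelf inference sketches for 6.3 and 6.4, assuming the transformers pipeline API; since no model is named, default checkpoints are downloaded from the Hugging Face Hub, which may differ from the ones used in the lessons.

```python
from transformers import pipeline

# Token classification (named-entity recognition), grouping subwords into entities
ner = pipeline("token-classification", aggregation_strategy="simple")
print(ner("Hugging Face is based in New York City"))

# Extractive question answering: the answer is a span copied from the context
qa = pipeline("question-answering")
print(qa(question="What does the course cover?",
         context="This course covers attention, BERT, GPT, and T5."))
```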

Lesson 7: Natural Language Generation with GPT (1h 15m)

  • 7.1 Introduction to the GPT family (10 min)
  • 7.2 Decoders need only apply: GPT’s architecture (15 min)
  • 7.3 Masked multi-headed attention (15 min; see the causal-mask sketch after this list)
  • 7.4 Pre-training GPT (10 min)
  • 7.5 Few-shot learning (10 min)
  • 7.6 Multi-task learning (10 min)
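
A tiny illustration of the causal mask behind masked multi-headed attention (7.3); the sequence length is arbitrary.

```python
import torch

seq_len = 5
# Lower-triangular matrix: 1 = may attend, 0 = blocked (future position)
causal_mask = torch.tril(torch.ones(seq_len, seq_len))
print(causal_mask)
# Passed as `mask` to the attention sketch from Lesson 2, this sets every future
# position's score to -inf before the softmax, so each token attends only to
# itself and the tokens before it.
```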

Lesson 8: Hands-on GPT (1h)

  • 8.1 Off-the-shelf GPT results using few-shot learning (20 min)
  • 8.2 GPT for style completion (20 min)
  • 8.3 GPT for code dictation (20 min)
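
An off-the-shelf generation sketch in the spirit of Lesson 8 (prompted with code, as in 8.3), assuming the small public gpt2 checkpoint as a stand-in for the GPT family; the prompt and sampling settings are placeholders.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "# Python function that returns the nth Fibonacci number\ndef fib(n):"
out = generator(prompt, max_new_tokens=40, do_sample=True)  # sample a continuation
print(out[0]["generated_text"])
```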

Lesson 9: Further applications of BERT + GPT (1h)

  • 9.1 Siamese BERT-networks for semantic search (30 min; see the sketch after this list)
  • 9.2 Teaching GPT multiple tasks at once with prompt engineering (30 min)
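
A semantic-search sketch for 9.1, assuming the sentence-transformers package (which implements Siamese BERT-style encoders); the model name and documents are illustrative.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["How do I reset my password?",
        "Our store is open 9am to 5pm.",
        "Refunds are processed within 7 days."]
query = "I forgot my login credentials"

doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)[0]   # cosine similarity to each document
print(docs[int(scores.argmax())])              # best semantic match, not a keyword match
```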

Lesson 10: T5: back to basics (35m)

  • 10.1 Encoders and decoders welcome: T5’s architecture (15 min)
  • 10.2 Cross-attention (20 min)

Lesson 11: Hands-on T5 (50m)

  • 11.1 Off-the-shelf results with T5 (20 min)
  • 11.2 Using T5 for abstractive summarization (30 min)
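
An abstractive-summarization sketch for 11.2, assuming the t5-small checkpoint; T5 is trained with a "summarize:" task prefix, which the summarization pipeline handles, and the input text and length limits here are placeholders.

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-small")
article = ("The transformer architecture replaced recurrence with attention, "
           "letting models process whole sequences in parallel and making "
           "large-scale pre-training on unlabeled text practical.")
print(summarizer(article, max_length=25, min_length=5)[0]["summary_text"])
```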

Lesson 12: The vision transformer (1h)

  • 12.1 Introduction to the Vision Transformer (ViT) (15 min)
  • 12.2 Combining ViT and GPT to caption images (15 min; see the captioning sketch after this list)
  • 12.3 Fine-tuning an image captioning system (30 min)
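
An inference-only sketch for 12.2, assuming a publicly shared ViT-encoder + GPT-2-decoder checkpoint on the Hugging Face Hub; the model name and image path are placeholders, and fine-tuning (12.3) is not shown.

```python
from transformers import pipeline

# The ViT encoder embeds image patches; the GPT-2 decoder generates the caption
captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
print(captioner("photo.jpg"))  # e.g. [{'generated_text': 'a dog laying on a bed'}]
```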

Lesson 13: Deploying Transformer models (1h)

  • 13.1 Introduction to MLOps (20 min)
  • 13.2 Sharing our models on HuggingFace (15 min)
  • 13.3 Deploying a fine-tuned BERT model using FastAPI (25 min)
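
A deployment sketch for 13.3, assuming FastAPI plus a fine-tuned checkpoint saved locally; the path, route, and module name are placeholders.

```python
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()
# Placeholder path to a checkpoint saved earlier with save_pretrained()
classifier = pipeline("text-classification", model="./my-finetuned-bert")

@app.get("/predict")
def predict(text: str):
    # Returns e.g. {"label": "POSITIVE", "score": 0.98}
    return classifier(text)[0]

# Serve locally with: uvicorn main:app --reload
```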

Lesson 14: Using Massively Large Language Models (1h)

  • 14.1 Modern Large Language Models (20 min)
  • 14.2 GPT-3 + ChatGPT (15 min)
  • 14.3 Other LLMs + Semantic Search with OpenAI Embeddings (25 min)
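
An embeddings sketch for 14.3, assuming the openai Python package's v1-style client and the text-embedding-ada-002 model; the course material may use a different client version or model name.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.embeddings.create(model="text-embedding-ada-002",
                                input="Where can I find the course notebooks?")
vector = resp.data[0].embedding   # dense vector usable for semantic search
print(len(vector))                # ada-002 embeddings have 1,536 dimensions
```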

Languages

Jupyter Notebook 100.0% · Python 0.0% · Dockerfile 0.0%