The course is devoted to modern generative models (mostly as applied to computer vision).
We will study the following types of generative models:
- autoregressive models,
- latent variable models,
- normalizing flow models,
- adversarial models,
- diffusion models.
Special attention is paid to the properties of various classes of generative models, their interrelationships, theoretical prerequisites and methods of quality assessment.
The aim of the course is to introduce the student to widely used advanced methods of deep learning.
The course is accompanied by practical assignments that help you understand the principles behind the models considered.
- telegram: @roman_isachenko
- e-mail: roman.isachenko@phystech.edu
# | Date | Description | Slides | Video |
---|---|---|---|---|
1 | September, 6 | Lecture 1: Logistics. Generative models overview and motivation. Problem statement. Divergence minimization framework. Autoregressive modelling. | slides | video |
2 | September, 13 | Seminar 1: Introduction. Maximum likelihood estimation. Histograms. Kernel density estimation (KDE). | notebook | video |
3 | September, 20 | Lecture 2: Autoregressive models (WaveNet, PixelCNN). Bayesian Framework. Latent Variable Models (LVM). Variational lower bound (ELBO). | slides | video |
4 | September, 27 | Seminar 2: MADE theory and practice. PixelCNN implementation hints. Gaussian MADE. | notebook | video |
5 | October, 4 | Lecture 3: EM-algorithm, amortized inference. ELBO gradients, reparametrization trick. Variational Autoencoder (VAE). | slides | video |
6 | October, 11 | Seminar 3: Latent Variable Models. Gaussian Mixture Model (GMM). GMM and MLE. ELBO and EM-algorithm. GMM via EM-algorithm. | notebook | video |
7 | October, 18 | Lecture 4: VAE limitations. Posterior collapse and decoder weakening. Tighter ELBO (IWAE). Normalizing flows prerequisites. | slides | video |
8 | October, 25 | Seminar 4: VAE implementation hints. IWAE theory. | notebook | video |
9 | November, 1 | Lecture 5: Normalizing Flow (NF) intuition and definition. Forward and reverse KL divergence for NF. Linear flows. | slides | video |
10 | November, 8 | Seminar 5: Flows. Planar flows. Forward KL vs Reverse KL. Planar flows via Forward KL and Reverse KL. | notebook planar_flow_practice autograd_jacobian | video |
11 | November, 15 | Lecture 6: Autoregressive flows (Gaussian AR NF/inverse Gaussian AR NF). Coupling layer (RealNVP). NF as VAE model. | slides | video |
12 | November, 22 | Seminar 6: RealNVP implementation hints. Integer Discrete Flows. | notebook_part1 notebook_part2 | video |
13 | November, 29 | Lecture 7: Discrete data vs continuous model. Model discretization (PixelCNN++). Data dequantization: uniform and variational (Flow++). ELBO surgery and optimal VAE prior. Flow-based VAE prior. | slides | video |
14 | December, 6 | Seminar 7: Discretization of continuous distribution (MADE++). Aggregated posterior distribution in VAE. VAE with learnable prior. | notebook_part1 notebook_part2 | video |
15 | February, 7 | Lecture 8: Flow-based VAE posterior vs flow-based VAE prior. Likelihood-free learning. GAN optimality theorem. | slides | video |
16 | February, 14 | Seminar 8: Glow implementation. Vanilla GAN in 1D coding. | VanillaGAN_todo VanillaGAN_done Glow | video |
17 | February, 21 | Lecture 9: Vanishing gradients and mode collapse, KL vs JS divergences. Adversarial Variational Bayes. Wasserstein distance. Wasserstein GAN (WGAN). | slides | video |
18 | February, 28 | Seminar 9: KL vs JS divergences. Mode collapse. Vanilla GAN on multimodal 1D and 2D data. Wasserstein distance theory. | notebook WGAN_theory | video |
19 | March, 7 | Lecture 10: WGAN with gradient penalty (WGAN-GP). Spectral Normalization GAN (SNGAN). f-divergence minimization. GAN evaluation. | slides | video |
20 | March, 14 | Seminar 10: WGANs on multimodal 2D data. GANs zoo. Evolution of GANs. StyleGAN implementation. | notebook_todo notebook_done GANs_evolution StyleGAN | video |
21 | March, 21 | Lecture 11: GAN evaluation (Inception score, FID, Precision-Recall, truncation trick). Discrete VAE latent representations. | slides | video |
22 | March, 28 | Seminar 11: StyleGAN coding and assessing. Unpaired I2I translation. CycleGAN: discussion and coding. | notebook_todo notebook_done | video |
23 | April, 4 | Lecture 12: Vector quantization, straight-through gradient estimation (VQ-VAE). Gumbel-softmax trick (DALL-E). Neural ODE. | slides | video |
24 | April, 11 | Seminar 12: Beyond GANs: Neural Optimal Transport: theory and practice. VQ-VAE implementation hints. | notebook NOT_theory; NOT seminar by Alex Korotin: notebook, solutions | video |
25 | April, 18 | Lecture 13: Adjoint method. Continuous-in-time NF (FFJORD, Hutchinson's trace estimator). Kolmogorov-Fokker-Planck equation and Langevin dynamics. SDE basics. | slides | video |
26 | April, 25 | Seminar 13: CNF theory. Langevin Dynamics. Energy-based Models. | notebook | video |
27 | May, 2 | Lecture 14: Score matching. Noise conditioned score network (NCSN). Gaussian diffusion process. | slides | video |
29 | May, 16 | Lecture 15: Denoising diffusion probabilistic model (DDPM): objective, link to VAE and score matching. | slides | video |
 | May, 23 | Oral exam | | |
- 6 homework assignments, 13 points each = 78 points
- cozy oral exam = 26 points
- maximum points: 78 + 26 = 104 points
- probability theory + statistics
- machine learning + basics of deep learning
- Python + basics of one of the DL frameworks (PyTorch/TensorFlow/etc.)