The course is devoted to modern generative models (mostly as applied to computer vision).
We will study the following types of generative models:
- autoregressive models,
- latent variable models,
- normalizing flow models,
- adversarial models,
- diffusion models.
Special attention is paid to the properties of the different classes of generative models, their interrelationships, theoretical foundations, and methods of quality assessment.
The aim of the course is to introduce students to widely used advanced deep learning methods.
The course is accompanied by practical assignments that help you understand the principles behind the models covered.

Contact:
- Telegram: @roman_isachenko
- e-mail: roman.isachenko@phystech.edu

| # | Date | Description | Slides | Video |
|---|---|---|---|---|
| 1 | September 5 | Lecture 1: Logistics. Generative models overview and motivation. Problem statement. Divergence minimization framework. Autoregressive models (PixelCNN). | slides | video |
| | | Seminar 1: Introduction. Maximum likelihood estimation. Histograms. Kernel density estimation (KDE). | notebook | video |
| 2 | September 12 | Lecture 2: Bayesian Framework. Latent Variable Models (LVM). Variational lower bound (ELBO). EM-algorithm, amortized inference. | slides | video |
| | | Seminar 2: PixelCNN coding for MNIST and Binarized MNIST. | notebook, notebook_solved | video |
| 3 | September 19 | Lecture 3: ELBO gradients, reparametrization trick. Variational Autoencoder (VAE). VAE limitations. Tighter ELBO (IWAE). | slides | video |
| | | Seminar 3: Latent Variable Models. Gaussian Mixture Model (GMM). GMM and MLE. ELBO and EM-algorithm. GMM via EM-algorithm. | notebook | video |
| 4 | September 26 | Lecture 4: Normalizing Flow (NF) intuition and definition. Forward and reverse KL divergence for NF. Linear NF. Gaussian autoregressive NF. | slides | video |
| | | Seminar 4: Variational EM algorithm for GMM. VAE: implementation hints + vanilla 2D VAE coding. | notebook, notebook_solved | video |
| 5 | October 3 | Lecture 5: Coupling layer (RealNVP). NF as VAE model. Discrete data vs continuous model. Model discretization (PixelCNN++). Data dequantization: uniform and variational (Flow++). | slides | video |
| | | Seminar 5: VAE: posterior collapse, KL-annealing, free-bits. Normalizing flows: basics, planar flows, forward and backward KL for planar flows. | posterior_collapse | video |
| 6 | October 10 | Lecture 6: ELBO surgery and optimal VAE prior. NF-based VAE prior. Discrete VAE latent representations. Vector quantization, straight-through gradient estimation (VQ-VAE). | slides | video |
| | | Seminar 6: Planar Flow (coding), RealNVP. | planar_flow.ipynb, real_nvp_notes.ipynb | video |
| 7 | October 17 | Lecture 7: Gumbel-softmax trick (DALL-E). Likelihood-free learning. GAN optimality theorem. | slides | video |
| | | Seminar 7: Glow. | Glow | video |
| 8 | October 24 | Lecture 8: Wasserstein distance. Wasserstein GAN (WGAN). WGAN with gradient penalty (WGAN-GP). Spectral Normalization GAN (SNGAN). | slides | video |
| | | Seminar 8: Vanilla GAN in 1D coding. KL vs JS divergences. Mode collapse. Non-saturating GAN. | part_1, part_2 | video |
| 9 | October 31 | Lecture 9: f-divergence minimization. GAN evaluation. Inception score, FID, Precision-Recall, truncation trick. | slides | video |
| | | Seminar 9: WGANs on multimodal 2D data. GAN zoo and evolution of GANs. StyleGAN coding. | notebook, GANs_evolution, StyleGAN | video |
| 10 | November 14 | Lecture 10: Neural ODE. Adjoint method. Continuous-in-time NF (FFJORD, Hutchinson's trace estimator). | slides | video |
| | | Seminar 10: StyleGAN: final discussion. Energy-Based models. | notebook | video |
| 11 | November 21 | Lecture 11: Gaussian diffusion process. Gaussian diffusion model as VAE, derivation of ELBO. | slides | video |
| | | Seminar 11: Gaussian diffusion process basics. | notes.pdf | video |
| 12 | November 28 | Lecture 12: Denoising diffusion probabilistic model (DDPM): reparametrization and overview. Kolmogorov-Fokker-Planck equation and Langevin dynamics. SDE basics. | slides | video |
| | | Seminar 12: Fast samplers: iDDPM and DDIM. | notes.pdf | video |
| 13 | December 5 | Lecture 13: Score matching: implicit/sliced score matching, denoising score matching. Noise Conditioned Score Network (NCSN). DDPM vs NCSN. | slides | video |
| | | Seminar 13: Noise Conditioned Score Network. | notebook | video |
| 14 | December 12 | Lecture 14: Variance Preserving and Variance Exploding SDEs. Model guidance: classifier guidance, classifier-free guidance. | slides | video |

Grading:
- 6 homework assignments, 13 points each = 78 points
- cozy oral exam = 26 points
- maximum points: 78 + 26 = 104 points

Prerequisites:
- probability theory + statistics
- machine learning + basics of deep learning
- Python + basics of one of the DL frameworks (PyTorch/TensorFlow/etc.)
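
For a rough sense of the expected level, here is a minimal sketch of two topics listed for Seminar 1: maximum likelihood estimation and kernel density estimation. It is not taken from the course materials. The function names `gaussian_mle` and `kde_gaussian_kernel`, the synthetic data, and the bandwidth value are assumptions made for illustration only.

```python
import numpy as np


def gaussian_mle(x):
    """Maximum likelihood estimates of the mean and std of a 1D Gaussian."""
    mu = x.mean()
    # The MLE of the variance divides by n, not by n - 1.
    sigma = np.sqrt(((x - mu) ** 2).mean())
    return mu, sigma


def kde_gaussian_kernel(x, grid, bandwidth):
    """Kernel density estimate with a Gaussian kernel, evaluated on `grid`."""
    # Pairwise scaled distances between grid points and samples.
    z = (grid[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernels over the samples and rescale by the bandwidth.
    return kernel.mean(axis=1) / bandwidth


# Synthetic data (assumed for the example): samples from N(1, 2^2).
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=2.0, size=2000)

mu_hat, sigma_hat = gaussian_mle(samples)

grid = np.linspace(-10.0, 12.0, 1101)
density = kde_gaussian_kernel(samples, grid, bandwidth=0.3)
```

If reading this feels comfortable (NumPy broadcasting, the 1/n variance estimate, the role of the bandwidth), the programming and statistics prerequisites are probably covered.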