AMEERAZAM08 / mindiffusion

Repository of lessons exploring image diffusion models, focused on understanding and education.

Zero-to-Hero - Diffusion Models

Introduction

This series is heavily inspired by Andrej Karpathy's Zero to Hero series of videos. Well, actually, we are straight-up copying that series, because the videos are so good. Seriously, if you haven't followed his videos, go do that now - lots of great stuff in there!

Each lesson contains an explanatory video that walks you through the lesson and the code, a Colab notebook that corresponds to the video material, and a pointer to the runnable code on GitHub. All of the code is designed to run on a minimal GPU. We test everything on T4 instances, since that is what Colab provides at the free tier, and they are cheap to run on AWS as standalone instances. In theory, each lesson should be runnable on any GPU with 8GB or more of memory, as they are all designed to be trained in real time on minimal hardware, so that we can really dive into the code.

Each lesson is in its own subdirectory, and we have ordered the lessons in historical order (from oldest to latest) so that it's easy to trace the development of the research and see the historical progress of this space.

Since every lesson is meant to be trained in real time with minimal cost, most of the lessons are restricted to training on the MNIST dataset, simply because it is quick to train and easy to visualize.

Requirements for All Lessons

All lessons are built using PyTorch and written in Python 3. To set up an environment to run all of the lessons, we suggest using conda or venv:

> python3 -m venv mindiffusion_env
> source mindiffusion_env/bin/activate
> pip install --upgrade pip
> pip install -r requirements.txt

All lessons are designed to be run in the lesson directory, not the root of the repository.
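
After installing the requirements, a quick check along these lines can confirm that PyTorch sees a GPU in the size range the lessons target. This is only a sketch and is not part of the repository: the function names are hypothetical, and the threshold reads the "8GB" figure from the introduction as 8 * 10^9 bytes, which is an assumption.

```python
"""Sanity check for the lesson environment (illustrative, not part of the repo)."""

# Assumption: "8GB" from the introduction is read as 8 * 10^9 bytes.
MIN_GPU_BYTES = 8 * 10**9


def meets_minimum(total_bytes: int, minimum: int = MIN_GPU_BYTES) -> bool:
    """Return True if a device with `total_bytes` of memory meets the minimum."""
    return total_bytes >= minimum


def describe_environment() -> str:
    """Summarize whether PyTorch and a large-enough CUDA GPU are visible."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed; run: pip install -r requirements.txt"

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} found, but no CUDA GPU is visible"

    # Inspect the first CUDA device only; multi-GPU machines are not handled here.
    props = torch.cuda.get_device_properties(0)
    status = "ok" if meets_minimum(props.total_memory) else "below 8 GB"
    size_gb = props.total_memory / 1e9
    return f"PyTorch {torch.__version__}, {props.name}, {size_gb:.1f} GB ({status})"


if __name__ == "__main__":
    print(describe_environment())
```

Running the script inside the activated environment prints a one-line summary. It only reports what it finds and does not change any settings.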

Table of Lessons

| Lesson | Date | Name | Title | Video | Colab | Code |
|---|---|---|---|---|---|---|
| 1 | | | Introduction to Diffusion Models | | colab | |
| 2 | March 2015 | DPM | Deep Unsupervised Learning using Nonequilibrium Thermodynamics | | colab | code |
| 3 | July 2019 | NCSN | Generative Modeling by Estimating Gradients of the Data Distribution | | | code |
| 4 | June 2020 | NCSNv2 | Improved Techniques for Training Score-Based Generative Models | | | code |
| 5 | June 2020 | DDPM | Denoising Diffusion Probabilistic Models | | | code |
| 5a | | | DDPM with Dropout | | | code |
| 5b | | | Interpolation in Latent Space | | | code |
| 5c | | | Adding Control - Basic Class Conditioning with Cross-Attention | | | code |
| 5d | | | Adding Control - Extended Class Conditioning | | | code |
| 5e | | | Adding Control - Text-to-Image | | | code |
| 6 | October 2020 | DDIM | Denoising Diffusion Implicit Models | | | code |
| 7 | November 2020 | Score SDE | Score-Based Generative Modeling through Stochastic Differential Equations | | | code |
| 8 | February 2021 | DALL-E | Zero-Shot Text-to-Image Generation | | | code |
| 9 | February 2021 | IDDPM | Improved Denoising Diffusion Probabilistic Models | | | code |
| 10 | April 2021 | SR3 | Image Super-Resolution via Iterative Refinement | | | code |
| 11 | May 2021 | Guided Diffusion | Diffusion Models Beat GANs on Image Synthesis | | | code |
| 12 | May 2021 | CDM | Cascaded Diffusion Models for High Fidelity Image Generation | | | code |
| 13 | December 2021 | Latent Diffusion | High-Resolution Image Synthesis with Latent Diffusion Models | | | code |
| 13a | | | Stable Diffusion v1 | | | |
| 13b | | | Stable Diffusion v2 | | | |
| 14 | December 2021 | CFG | Classifier-Free Diffusion Guidance | | | code |
| 15 | December 2021 | GLIDE | GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | | | code |
| 16 | February 2022 | | Progressive Distillation for Fast Sampling of Diffusion Models | | | |
| 17 | April 2022 | DALL-E 2 | Hierarchical Text-Conditional Image Generation with CLIP Latents | | | code |
| 18 | May 2022 | Imagen | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | | | code |
| 19 | October 2022 | | Flow Matching for Generative Modeling | | | |
| 20 | October 2022 | ERNIE-ViLG 2.0 | ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts | | | |
| 21 | December 2022 | DiT | Scalable Diffusion Models with Transformers | | | code |
| 22 | January 2023 | Simple Diffusion | Simple diffusion: End-to-end diffusion for high resolution images | | | |
| 23 | February 2023 | ControlNet | Adding Conditional Control to Text-to-Image Diffusion Models | | | |
| 24 | May 2023 | RAPHAEL | RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | | | |
| 25 | June 2023 | Wuerstchen | Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models | | | |
| 26 | July 2023 | SDXL | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | | | |
| 27 | September 2023 | PixArt-α | PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | | | code |
| 28 | October 2023 | DALL-E 3 | Improving Image Generation with Better Captions | | | |
| 29 | January 2024 | PIXART-δ | PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | | | |
| 30 | March 2024 | Stable Diffusion 3 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | | | |
| 31 | March 2024 | PixArt-Σ | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | | | |


Languages

Python 99.0%, Cuda 0.9%, C++ 0.1%