
(IA)^3 for Stable Diffusion

Parameter-efficient fine-tuning of Stable Diffusion using (IA)^3.

YouTube Video Explanation

A video explanation of this project is available on YouTube.

Example

Before fine-tuning vs. after fine-tuning (image comparison):

The prompt is "donald trump", and the model was fine-tuned on the pokemon-blip-captions dataset for 25 epochs.

Description

Based on the (IA)^3 paper: Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning (Liu et al., 2022).

Implemented in diffusers using an attention processor in attention.py.
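
A minimal sketch of what such a processor can look like (assuming the older diffusers attention-processor interface; the class and parameter names here are illustrative, not the exact code in attention.py):

import torch
import torch.nn as nn

class IA3AttnProcessor(nn.Module):
    # Learns one rescaling vector each for the key and value activations,
    # following the (IA)^3 recipe. Initialized to ones so the base model's
    # behavior is unchanged at the start of training.
    def __init__(self, hidden_size):
        super().__init__()
        self.k_scale = nn.Parameter(torch.ones(hidden_size))
        self.v_scale = nn.Parameter(torch.ones(hidden_size))

    def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
        # Standard diffusers attention, with the (IA)^3 scales applied
        # element-wise to the key/value activations.
        context = hidden_states if encoder_hidden_states is None else encoder_hidden_states

        query = attn.head_to_batch_dim(attn.to_q(hidden_states))
        key = attn.head_to_batch_dim(attn.to_k(context) * self.k_scale)    # (IA)^3 rescaling
        value = attn.head_to_batch_dim(attn.to_v(context) * self.v_scale)  # (IA)^3 rescaling

        attention_probs = attn.get_attention_scores(query, key, attention_mask)
        hidden_states = attn.batch_to_head_dim(torch.bmm(attention_probs, value))

        hidden_states = attn.to_out[0](hidden_states)  # output projection
        hidden_states = attn.to_out[1](hidden_states)  # dropout
        return hidden_states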

Comparison to full fine-tuning

(IA)^3 has trade-offs similar to LoRA's when compared to full fine-tuning.

One major difference from LoRA is that (IA)^3 uses far fewer parameters. In general, it will most likely be faster and smaller, but less expressive.

  • Faster training
  • Smaller file size (~222 KB for Stable Diffusion 1.5 when learn_biases=False, about twice as much otherwise)
  • Can be swapped in and out of the base model during inference
  • Can be loaded into fine-tuned models that have the same architecture
  • Can be merged with the weights of the base model (see the sketch after this list)
    • Only possible when learn_biases=False without changing the architecture
    • Not currently implemented in this repo
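
As a rough illustration of why merging works: (IA)^3 computes scale ⊙ (Wx), which equals (diag(scale) W) x, so the learned vector can be folded into the rows of W. Below is a minimal sketch assuming a plain linear projection; merge_ia3_scale is a hypothetical helper, not code from this repo.

import torch

def merge_ia3_scale(weight: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # (IA)^3 applies scale * (W @ x) = (diag(scale) @ W) @ x,
    # so the learned vector folds into the rows of W.
    # A learned additive bias (learn_biases=True) cannot be folded into a
    # bias-free layer without adding a bias term, i.e. changing the architecture.
    # weight: (out_features, in_features), scale: (out_features,)
    return weight * scale.unsqueeze(1)

# Quick check that the merged weight reproduces the scaled output:
W = torch.randn(4, 3)
l = torch.rand(4) + 0.5  # stand-in for a learned (IA)^3 vector
x = torch.randn(3)
assert torch.allclose(l * (W @ x), merge_ia3_scale(W, l) @ x, atol=1e-6)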

Installation

First create an environment and install PyTorch.

Then install the pip dependencies:

pip install -r requirements.txt

Currently, bitsandbytes only supports Linux, so fine-tuning on Windows requires more VRAM.
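
The extra VRAM comes from the optimizer states: bitsandbytes provides an 8-bit AdamW, and without it training falls back to a full-precision optimizer. A minimal sketch of that fallback (the actual logic in train.py may differ):

import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the (IA)^3 parameters being trained

try:
    import bitsandbytes as bnb  # 8-bit optimizer states (Linux only)
    optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-3)
except ImportError:
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # uses more VRAM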

Training

The training script is train.py, based on the text-to-image example training script from diffusers.

Currently you can change the parameters by editing the variables at the top of the file and running the script:

python train.py
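
For illustration, the variables at the top of train.py might look like the following (hypothetical names and values; check the file itself for the real ones):

model_name = "runwayml/stable-diffusion-v1-5"      # base model to fine-tune
dataset_name = "lambdalabs/pokemon-blip-captions"  # dataset used in the example above
learn_biases = False  # True roughly doubles the saved file size
num_epochs = 25
learning_rate = 1e-3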

Inference

The inference script, infer.py, loads the learned (IA)^3 parameters and generates images.

Currently you can change the parameters by editing the variables at the top of the file and running the script:

python infer.py
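
A rough outline of the inference flow using the diffusers pipeline API (a sketch, not the exact contents of infer.py; the processor-installation step and checkpoint name are assumptions):

import torch
from diffusers import StableDiffusionPipeline

# Load the base model; the (IA)^3 parameters are applied on top of it.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# infer.py would install the (IA)^3 attention processors and load the learned
# vectors at this point, e.g. via pipe.unet.set_attn_processor(...) and a
# saved checkpoint (file name here is hypothetical, e.g. "ia3_weights.pt").

image = pipe("donald trump").images[0]
image.save("output.png")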


License

MIT License

