SusungHong / Self-Attention-Guidance

The implementation of the paper "Improving Sample Quality of Diffusion Models Using Self-Attention Guidance" (ICCV 2023)


Alternative implementation in Refiners

deltheil opened this issue

We are building Refiners, an open source, PyTorch-based framework made to easily train and run adapters on top of foundational models. Just wanted to let you know that Self-Attention Guidance is now natively supported for Stable Diffusion 1.5 and XL!

Demo:

  1. Follow these install steps
  2. Run the code snippet below:
import torch

from refiners.foundationals.latent_diffusion import StableDiffusion_1
from refiners.fluxion.utils import manual_seed


device = "cuda"

# Load the pre-converted SD 1.5 weights (see the install steps above)
sd15 = StableDiffusion_1(device=device, dtype=torch.float16)
sd15.clip_text_encoder.load_from_safetensors("clip_text.safetensors")
sd15.lda.load_from_safetensors("lda.safetensors")
sd15.unet.load_from_safetensors("unet.safetensors")

with torch.no_grad():
    prompt = "a cute cat, detailed high-quality professional image"
    negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"

    clip_text_embedding = sd15.compute_clip_text_embedding(text=prompt, negative_text=negative_prompt)

    # Turn on Self-Attention Guidance (in addition to Classifier-Free Guidance, always on)
    sd15.set_self_attention_guidance(enable=True, scale=0.75)

    manual_seed(2)
    # Initial noise in latent space (64x64 latents -> a 512x512 output image)
    x = torch.randn(1, 4, 64, 64, device=device, dtype=torch.float16)

    # Denoising loop: each call performs one diffusion step with CFG (and SAG when enabled)
    for step in sd15.steps:
        x = sd15(
            x,
            step=step,
            clip_text_embedding=clip_text_embedding,
            condition_scale=7.5,
        )
    # Decode the final latents into an image
    predicted_image = sd15.lda.decode_latents(x)

predicted_image.save("output.png")
print("done: see output.png")

With Self-Attention Guidance (scale=0.75):

[image: output_after]

Comparison before / after:

[video: diff.mov]

Feedback welcome!