lluisgomez / prompt-to-prompt-with-sdxl

An implementation of the Prompt-to-Prompt paper for the SDXL architecture

Prompt-to-prompt (P2P) with SDXL

An implementation of Prompt-to-Prompt for the SDXL architecture.

What is Prompt-to-prompt (P2P)?

P2P is an image-editing technique that uses the self- and cross-attention already present in the diffusion process, so it can make local and global edits without relying on external tools.

It works by generating two images at the same time: the original image, and an edited image whose prompt has been modified. For example, "a pink bear" and "a pink dragon". During diffusion, it injects the attention maps of "bear" into those of "dragon", which in turn preserves the style of the original image while replacing "bear" with "dragon".
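The injection idea can be illustrated with a toy sketch. This is not the code in this repository: attention maps are plain nested lists here (rows are image patches, columns are prompt tokens), and the names `inject_attention` and `replace_fraction` are hypothetical. It assumes that the original branch's maps are reused for an early fraction of the denoising steps, after which the edited branch uses its own maps.

```python
from typing import List

# Toy attention map: one row per image patch, one column per prompt token.
AttentionMap = List[List[float]]


def inject_attention(
    source_map: AttentionMap,
    edited_map: AttentionMap,
    step: int,
    num_steps: int,
    replace_fraction: float = 0.5,  # hypothetical parameter name
) -> AttentionMap:
    """Return the attention map the edited branch uses at this step.

    For the first `replace_fraction` of the steps, the edited branch
    reuses the original branch's map; afterwards it keeps its own.
    """
    last_injected_step = int(replace_fraction * num_steps)
    chosen = source_map if step < last_injected_step else edited_map
    return [row[:] for row in chosen]  # copy so callers cannot alias inputs


# "a pink bear" (source) versus "a pink dragon" (edited), two patches each.
bear_map = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
dragon_map = [[0.2, 0.2, 0.6], [0.1, 0.5, 0.4]]

early = inject_attention(bear_map, dragon_map, step=2, num_steps=10)
late = inject_attention(bear_map, dragon_map, step=8, num_steps=10)
```

With these toy values, `early` is a copy of `bear_map` and `late` is a copy of `dragon_map`. A larger `replace_fraction` keeps the edited image closer to the original, at the cost of a weaker edit.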

For more information, I highly recommend checking out the original project page and paper of this work (linked below).

Why use this implementation?

While Stable Diffusion is no longer state-of-the-art, it remains a popular base model for ongoing research due to its well-established implementations. P2P is frequently cited and remains a significant foundation for later work. To make sure good research is not left behind, it's worth updating its infrastructure.

I hope this implementation encourages curious minds to explore the extent of P2P's utility as diffusion models continue to scale.

What can I do with P2P?

P2P has three main operations. I'm including some examples below, but the official resources explain them in depth.

Replace

The replace operation swaps the effect of one token with a new token.

Original: a pink bear riding a bicycle on the beach
Edited: a pink dragon riding a bicycle on the beach

Refine

The refine operation adds an effect to an existing token, for example, an adjective.

Original: a chocolate cake
Edited: a confetti chocolate cake
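One way to picture refine is as a token alignment between the two prompts: tokens shared by both prompts take their attention from the original image, while newly added tokens keep their own. The sketch below is an illustration under simplifying assumptions, not this repository's code. It splits prompts on whitespace instead of using the model's tokenizer, works on a single attention row, and the helper names `align_tokens` and `refine_row` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def align_tokens(source: List[str], edited: List[str]) -> List[Optional[int]]:
    """For each edited token, give the index of the matching source token,
    or None if the token is new in the edited prompt."""
    mapping: List[Optional[int]] = [None] * len(edited)
    matcher = SequenceMatcher(None, source, edited, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.b + offset] = block.a + offset
    return mapping


def refine_row(
    source_row: List[float],
    edited_row: List[float],
    mapping: List[Optional[int]],
) -> List[float]:
    """Build one attention row for the edited prompt: shared tokens reuse
    the source weights, new tokens keep the edited weights."""
    return [
        edited_row[j] if src is None else source_row[src]
        for j, src in enumerate(mapping)
    ]


source_tokens = "a chocolate cake".split()
edited_tokens = "a confetti chocolate cake".split()
mapping = align_tokens(source_tokens, edited_tokens)  # [0, None, 1, 2]

# Toy attention weights of one image patch over each prompt's tokens.
row = refine_row([0.1, 0.6, 0.3], [0.1, 0.4, 0.3, 0.2], mapping)
# row == [0.1, 0.4, 0.6, 0.3]
```

Only "confetti" keeps its own attention weight, so the cake from the original image is preserved while the new adjective is free to change its appearance.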

Reweight

The reweight operation generates the same image, but amplifies or attenuates the effect of a target token in the prompt. Below is an example of an attenuation of "blue" in "a blue dog".

Original: a blue dog
Attenuated: a (less) blue dog
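Reweighting can be sketched as scaling one token's column in the cross-attention map. The example below is a toy illustration with plain lists, not this repository's code. The function name `reweight_token` and the `scale` parameter are hypothetical, and the sketch does not renormalize the rows afterwards.

```python
from typing import List

# Toy attention map: one row per image patch, one column per prompt token.
AttentionMap = List[List[float]]


def reweight_token(
    attention: AttentionMap, token_index: int, scale: float
) -> AttentionMap:
    """Scale the attention every patch pays to one token.

    scale < 1 attenuates the token's effect, scale > 1 amplifies it.
    All other tokens are left unchanged.
    """
    return [
        [w * scale if j == token_index else w for j, w in enumerate(row)]
        for row in attention
    ]


tokens = "a blue dog".split()
blue = tokens.index("blue")  # column 1

attention = [[0.2, 0.5, 0.3], [0.1, 0.25, 0.65]]
less_blue = reweight_token(attention, blue, scale=0.5)
# less_blue == [[0.2, 0.25, 0.3], [0.1, 0.125, 0.65]]
```

The prompt itself stays the same in both branches. Only the strength of the target token's attention differs.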

Credit

The original Prompt-to-Prompt project and the great researchers who worked on it: Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, Daniel Cohen-Or.

This code builds on Hugging Face's community pipeline for Prompt-to-Prompt (the Stable Diffusion implementation), contributed by UmerHA.

Languages

Jupyter Notebook 99.1%, Python 0.9%