xaddwell / Stable-Edit

Text-based real image editing with stable diffusion models

Stable-Edit

This is the PyTorch implementation of the image editing framework described in:

Stable-Edit: Text-based real image editing with stable diffusion models

We address consistent image editing by inverting both the image and the text embedding.

1. Model Structure

Figure: Overview of the proposed stable editing framework.

Specifically, given the input image $X$ and the target text $Y$, we first learn the inverted text embedding $e_{opt}$ of image $X$. Then we combine the target text embedding $e_{tgt}$ produced by the text encoder with the inverted text embedding $e_{opt}$ via cross-attention. Next, starting from the image latents $h_{in}$ that the VAE produces for image $X$, we learn the inverted image latents by following the DDIM scheduler. Finally, we run the standard text-to-image generation process.
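The image-latent inversion step can be illustrated with a minimal sketch of deterministic DDIM inversion (eta = 0). This is not the repository's code. It rests on several assumptions: `toy_eps` is a hypothetical stand-in for the text-conditioned UNet noise predictor, latents are plain Python lists rather than tensors, and the linear beta schedule and step count are illustrative values. All helper names are made up for this example.

```python
import math

def make_alphas_cumprod(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Return [1.0, a_1, ..., a_N]: cumulative products of (1 - beta_t).

    Index 0 (value 1.0) corresponds to the clean latent h_in.
    The linear beta schedule is an assumed, illustrative choice.
    """
    alphas, running = [1.0], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alphas.append(running)
    return alphas

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update from noise level a_from to noise level a_to."""
    out = []
    for xi, ei in zip(x, eps):
        # Predict the clean latent, then re-noise it to the target level.
        x0 = (xi - math.sqrt(1.0 - a_from) * ei) / math.sqrt(a_from)
        out.append(math.sqrt(a_to) * x0 + math.sqrt(1.0 - a_to) * ei)
    return out

def toy_eps(x, t):
    """Hypothetical noise predictor; a real setup would call the UNet here."""
    return [0.1 * math.tanh(xi) for xi in x]

def ddim_invert(h_in, eps_model, alphas):
    """Walk the clean latents h_in forward to the noisiest level."""
    x = list(h_in)
    for t in range(len(alphas) - 1):
        x = ddim_move(x, eps_model(x, t), alphas[t], alphas[t + 1])
    return x

def ddim_sample(x_T, eps_model, alphas):
    """Walk inverted latents back to the clean level (standard DDIM sampling)."""
    x = list(x_T)
    for t in range(len(alphas) - 1, 0, -1):
        x = ddim_move(x, eps_model(x, t), alphas[t], alphas[t - 1])
    return x

alphas = make_alphas_cumprod(50)
h_in = [0.5, -1.2, 0.3]                      # stand-in for VAE latents of X
x_T = ddim_invert(h_in, toy_eps, alphas)     # inverted image latents
h_rec = ddim_sample(x_T, toy_eps, alphas)    # reconstruction from x_T
print("max reconstruction error:", max(abs(a - b) for a, b in zip(h_in, h_rec)))
```

Inversion and sampling use the same update with the noise levels swapped. The round trip is exact only when the predicted noise is the same in both directions, so with a real, input-dependent predictor the reconstruction is approximate.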

2. Cases

Figure: Example cases of image editing.

This repository is based on diffusers.

