AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

[Feature Request]: Dynamic colors range fix at high cfg scale through latent thresholding at every step

Ehplodor opened this issue

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Prevent the burning / "pop art" look in images generated at a high CFG scale by applying "latent dynamic thresholding" at every sampling step.

Proposed workflow

  1. Go to txt2img or img2img
  2. Activate the latent fix (a checkbox in the img2img / txt2img tab? Or in settings?)
  3. Generate the image with a high CFG scale

Additional information

Inspired by https://github.com/Birch-san/stable-diffusion/blob/0e1aa752a96aa6cab07c81449acce605743bf9b6/scripts/txt2img.ipynb

Corresponding tweet: https://twitter.com/Birchlabs/status/1582165379832348672
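
For anyone unfamiliar with the idea, here is a minimal sketch of percentile-based dynamic thresholding as described for Imagen, written against a plain NumPy array rather than the web UI's sampler. It is not taken from the linked notebook, and the function name, the `percentile` value, and the `floor` parameter are all assumptions chosen for illustration. Latents do not live in [-1, 1] the way pixels do, so a real implementation would likely need a different target range.

```python
import numpy as np


def dynamic_threshold(x, percentile=99.5, floor=1.0):
    """Clamp each sample to its own percentile range, then rescale.

    x: array of shape (batch, channels, height, width).
    percentile and floor are illustrative defaults, not tuned values.
    """
    batch = x.shape[0]
    # Per-sample percentile of the absolute values.
    s = np.percentile(np.abs(x).reshape(batch, -1), percentile, axis=1)
    # Never shrink the range below the floor, so tame samples pass through.
    s = np.maximum(s, floor).reshape(batch, 1, 1, 1)
    # Clip outliers to [-s, s], then scale back into [-floor, floor].
    return np.clip(x, -s, s) * (floor / s)
```

In a sampler this would presumably be applied to the predicted denoised sample at each step, which is what "at every step" in the title refers to. A sample whose values already sit inside the floor is returned unchanged, so only the over-saturated results that a high CFG scale produces are compressed.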

If it works as well as it appears, then I'd make it the default and add a settings page option to restore the old behavior for legacy generations. Some users would be confused about why their old renders changed, but checking the settings would explain the situation.

I could see a case for keeping the current behavior if there's already a config.json file. That preserves old renders, but users won't get the benefits of the fix until they notice the option and try turning it on.
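
A rough sketch of that second option, assuming the setting is stored under a hypothetical key named `latent_dynamic_thresholding` (the real web UI may name and load its options differently):

```python
import json
from pathlib import Path


def latent_fix_default(config_path):
    """Decide whether the fix starts enabled (hypothetical helper).

    Fresh installs (no config file yet) get the fix by default.
    Existing installs keep the old behavior unless the user opted in.
    """
    path = Path(config_path)
    if not path.exists():
        return True
    with path.open(encoding="utf-8") as f:
        settings = json.load(f)
    # "latent_dynamic_thresholding" is an assumed key, not a real option name.
    return bool(settings.get("latent_dynamic_thresholding", False))
```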

I made an extension for this, based on the work in the PR from dtan3847. That PR was closed because Auto said it should be an extension, and the author didn't port it to one because life took hold and they could no longer focus on coding.

https://github.com/mcmonkeyprojects/sd-dynamic-thresholding

cc @ClashSAN: could you add Dynamic Thresholding (CFG Scale Fix) to the extension list?
The description can be something like: Adds customizable dynamic thresholding to allow high CFG Scale values without the burning / 'pop art' effect.

The readme has sample images and more info.