sd-webui-bayesian-merger

What is this?

An opinionated take on automatic optimisation of stable-diffusion model merging.

The main idea is to treat the model-merging procedure as a black-box model with 26 parameters: one for each block plus base_alpha (note that for the moment clip_skip is set to 0). We can then apply black-box optimisation techniques; in particular, we focus on Bayesian optimisation with a Gaussian Process emulator. Read more here, here and here.
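
To make the 26-parameter view concrete, here is a minimal sketch of a block-weighted merge. It rests on assumptions that the text above does not state: a plain weighted-sum merge, an SD 1.x style UNet with 12 input, 1 middle and 12 output blocks, and base_alpha applied to everything outside those blocks. All function names, parameter names and tensor keys are hypothetical, and the "models" are toy dictionaries of floats rather than real checkpoints.

```python
# Sketch only: hypothetical names, toy data, weighted-sum merge assumed.
NUM_IN, NUM_OUT = 12, 12  # assumption: 12 input + 1 middle + 12 output blocks


def parameter_names():
    """Return the 26 parameter names: base_alpha plus one per block."""
    names = ["base_alpha"]
    names += [f"IN{i:02d}" for i in range(NUM_IN)]
    names += ["M00"]
    names += [f"OUT{i:02d}" for i in range(NUM_OUT)]
    return names


def block_of(key):
    """Map a (toy) tensor key to the parameter that controls it."""
    parts = key.split(".")
    if parts[0] == "input_blocks":
        return f"IN{int(parts[1]):02d}"
    if parts[0] == "middle_block":
        return "M00"
    if parts[0] == "output_blocks":
        return f"OUT{int(parts[1]):02d}"
    return "base_alpha"  # assumption: all remaining keys share base_alpha


def merge(model_a, model_b, params):
    """Weighted sum per tensor: (1 - alpha) * A + alpha * B."""
    merged = {}
    for key, a in model_a.items():
        alpha = params[block_of(key)]
        b = model_b[key]
        merged[key] = [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# One point in the 26-dimensional space the optimiser searches.
params = {name: 0.5 for name in parameter_names()}
params["IN00"] = 0.0   # keep model A for this block
params["OUT11"] = 1.0  # take model B for this block

keys = ["input_blocks.0.weight", "output_blocks.11.weight", "text_encoder.weight"]
model_a = {key: [1.0, 1.0] for key in keys}
model_b = {key: [3.0, 3.0] for key in keys}

merged = merge(model_a, model_b, params)
print(len(parameter_names()))  # 26
print(merged)
```

Seen this way, the optimiser only ever deals with the params dictionary: it proposes 26 numbers, and everything downstream (merging, image generation, scoring) is the black box that turns them into a single score.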

The optimisation process is split in two phases:

exploration: here we sample (at random for now, with some heuristic in the future) the 26-parameter hyperspace, i.e. our block weights. The number of samples is set by the --init_points argument. We use each set of weights to merge the two models, then use the merged model to generate batch_size * number of payloads images, which are then scored.
exploitation: based on the exploratory phase, the optimiser forms an idea of where (i.e. at which set of weights) the optimal merge is. This information is used to sample more sets of weights, --n_iters times. This time we don't sample all of them in one go. Instead, we sample once, merge the models, generate and score the images, and update the optimiser's knowledge of the merging space. This way the optimiser can adapt its strategy step by step.

At the end of the exploitation phase, the set of weights with the highest score is deemed to be the optimal one.
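
The control flow of the two phases can be sketched in a few lines of plain Python. This is an illustration of the loop structure only, not the project's code: score_merge is a hypothetical stand-in for "merge, generate, score", and the exploitation step uses a simple perturbation of the best weights seen so far instead of a Gaussian Process or TPE proposal. The arguments init_points and n_iters mirror the --init_points and --n_iters options described above.

```python
import random


def score_merge(weights):
    """Hypothetical stand-in for merge + generate + score.

    A toy objective whose maximum (0.0) is reached when every weight is 0.3.
    """
    return -sum((w - 0.3) ** 2 for w in weights)


def optimise(init_points, n_iters, n_params=26, seed=0):
    rng = random.Random(seed)
    history = []  # (score, weights) pairs: what the optimiser knows so far

    # Exploration: sample the weight space at random, score every sample.
    for _ in range(init_points):
        weights = [rng.random() for _ in range(n_params)]
        history.append((score_merge(weights), weights))

    # Exploitation: propose one sample at a time, informed by the history,
    # and update the history before proposing the next one.
    for _ in range(n_iters):
        _, best = max(history, key=lambda pair: pair[0])
        # Stand-in proposal rule (NOT Bayesian optimisation): perturb the
        # best weights slightly and clip them back into [0, 1].
        weights = [min(1.0, max(0.0, w + rng.gauss(0, 0.05))) for w in best]
        history.append((score_merge(weights), weights))

    # The highest-scoring set of weights is reported as the optimum.
    return max(history, key=lambda pair: pair[0])


best_score, best_weights = optimise(init_points=10, n_iters=10)
print(best_score)
```

The point to take away is the difference in shape between the two loops: exploration never looks at the history, while every exploitation step reads it before choosing the next set of weights.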

Juicy features

wildcards support
TPE or Bayesian optimisers; cf. Bergstra et al., Algorithms for Hyper-Parameter Optimization (2011), for a comparison and explanation
UNET visualiser
convergence plot

OK, How Do I Use It In Practice?

Head to the wiki for all the instructions to get you started.

FAQ

Why not sdweb-auto-MBW extension? That amazing extension is based on brute-forcing the merge. Unfortunately, brute force == long time to wait, especially when generating lots of images. Hopefully, with this other method you can get away with a small number of runs!
Why opinionated? Because we use webui API and lots of config files to run the show. No GUI. Embrace your inner touch-typist and leave the browser for the CLI.
Why rely on webui? It's a very popular platform. Chances are that if you already have a working webui, you do not need to do much to run this library.
How many iterations and payloads? What about the batch size? I'd suggest --init_points 10 --n_iters 10 --batch_size 10 and at least 5 different payloads. Depending on your GPU, this may take 2-3 hours to run on a basic config.
Why not use hydra for config management? A single .ini file is easy to handle. Hydra's config management workflow seemed overkill for this project.
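
As a rough sanity check on the settings suggested in the FAQ above, the arithmetic below counts merges and images for --init_points 10 --n_iters 10 --batch_size 10 with 5 payloads. It assumes one merge per sampled set of weights, as described in the two phases; the variable names are illustrative, and the per-image figure simply divides the quoted 2-3 hours by the image count, so it also absorbs merging and scoring time.

```python
init_points = 10   # exploration samples
n_iters = 10       # exploitation steps
batch_size = 10    # images per payload
n_payloads = 5     # number of different payloads

merges = init_points + n_iters              # one merge per set of weights
images_per_merge = batch_size * n_payloads  # images scored for each merge
total_images = merges * images_per_merge

print(merges, images_per_merge, total_images)

# Time budget per image if the whole run takes 2 or 3 hours.
for hours in (2, 3):
    seconds_per_image = hours * 3600 / total_images
    print(f"{hours} h -> {seconds_per_image:.1f} s per image")
```

Doubling any one of the four settings doubles the merge or image count, which is a quick way to estimate how a larger run scales on the same hardware.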
