152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)

Implement samplers correctly

152334H opened this issue

  • rudimentary DPM++2M implementation
  • explore other DPM-Solver samplers
  • figure out if k-diffusion is still possible
  • UniPC

The UniPC project says it supports $\epsilon_\theta(x_t, t)$ models, so I'll give it a go.

Their code in https://github.com/wl-zhao/UniPC/blob/main/example/stable-diffusion/ldm/models/diffusion/uni_pc/uni_pc.py also looks very similar to the DPM-Solver repo, which I'll be integrating soon, so that's good.
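To make the wrapping concrete: both repos expect a plain $\epsilon_\theta(x_t, t)$ callable, exposed through the `NoiseScheduleVP` + `model_wrapper` interface that `dpm_solver_pytorch.py` and `uni_pc.py` share. Below is a minimal sketch under my assumptions about the TorToiSe side: `diffusion_net`, `betas`, `model_kwargs` and the helper names are illustrative, not code from either repo. The key detail is that TorToiSe's network predicts $\epsilon$ and a learned variance together (I'm assuming they're stacked on the channel dimension), so the $\epsilon$ half has to be split out first.

```python
import torch

# Both repos expose the same wrapping interface (dpm_solver_pytorch.py in
# DPM-Solver, uni_pc.py in UniPC): a NoiseScheduleVP plus a model_wrapper.
from dpm_solver_pytorch import NoiseScheduleVP, model_wrapper


def make_eps_fn(diffusion_net):
    """Adapter (hypothetical): TorToiSe's net predicts epsilon and a learned
    variance stacked on the channel dim; the solvers only want the epsilon half."""
    def eps_only(x, t, **kwargs):
        out = diffusion_net(x, t, **kwargs)
        eps, _learned_var = torch.chunk(out, 2, dim=1)  # discard the Sigma prediction
        return eps
    return eps_only


def wrap_for_solver(diffusion_net, betas, model_kwargs):
    # betas: the discrete beta schedule the TorToiSe diffusion model was trained with
    noise_schedule = NoiseScheduleVP(schedule="discrete", betas=betas)
    model_fn = model_wrapper(
        make_eps_fn(diffusion_net),
        noise_schedule,
        model_type="noise",          # epsilon prediction, not x0 or v
        model_kwargs=model_kwargs,   # e.g. precomputed conditioning embeddings
    )
    return model_fn, noise_schedule
```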

On a related note, I realised a few days ago (thanks to mrq) that my implementation of k-diffusion was actually completely wrong.

I'll be adding code that actually runs DPM++2M correctly in about an hour (the k-diffusion integration is most likely screwed), and then I can go for UniPC.

I'll write a larger blog about this later, but to clarify, this is what happened:

  • I added functions for k-diffusion, but never actually called them in my code
  • All "DPM++2M" results before today's commit were actually done with p_sample, which is basically random sampling of the gaussian distribution with the mean and variance values calculated by the model at each step. Yes, this means that the "really good 10 step results" were actually just plain DDIM.
  • After realising this, I tried to fix it. As it turns out, integrating the k-diffusion library into TorToiSe's diffusion model is a non-trivial task, because k-diffusion defines its own sigmas, which conflict with the beta schedule used to calculate $x_\theta(x_t, t)$ (which k-diffusion expects) in p_mean_variance. I tried to calculate the $x_0$ prediction from the raw model output plus Karras' sigmas, but I got a bunch of noise. It's entirely possible I just failed to write the right integration code for the k-diffusion samplers (one standard wrapping route is sketched after this list), but I'm not going to keep working on it for now.
  • I have instead opted to use the $\epsilon_\theta(x_t, t)$ output from the diffusion model directly, discarding the $\Sigma_\theta(x_t, t)$ prediction, and to feed it to the DPM-Solver repo to run DPM++2M (a sketch of this path also follows after this list). This "worked", but it doesn't produce good results for steps <= 20; I assume that's either because of the discarded variance or because I wrote a wrong constant somewhere.
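For the second bullet, the p_sample update is nothing more than a draw from the per-step Gaussian the model parameterises:

$$x_{t-1} = \mu_\theta(x_t, t) + \Sigma_\theta(x_t, t)^{1/2}\, z, \qquad z \sim \mathcal{N}(0, I)$$

i.e. there is no higher-order solver machinery involved at all.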
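For the third bullet, the standard way to bridge a discrete-beta $\epsilon_\theta$ model into k-diffusion is its `DiscreteEpsDDPMDenoiser` wrapper, which owns the beta-to-sigma conversion ($\sigma_t = \sqrt{(1-\bar\alpha_t)/\bar\alpha_t}$) and returns the denoised $x_0$-style prediction the samplers expect. The sketch below shows roughly what that route would look like, assuming an eps-only callable like the one returned by `make_eps_fn` above; it is not the code in this repo, and I haven't verified that it produces clean audio:

```python
import torch
import k_diffusion as K


def sample_with_k_diffusion(eps_only, betas, shape, steps=20, model_kwargs=None, device="cuda"):
    # eps_only(x, t, **kwargs) -> eps_theta only (hypothetical adapter from the earlier sketch)
    alphas_cumprod = torch.cumprod(1.0 - betas.to(device), dim=0)
    # DiscreteEpsDDPMDenoiser converts between the discrete beta schedule and
    # k-diffusion's sigmas, and returns the x0-style denoised prediction.
    denoiser = K.external.DiscreteEpsDDPMDenoiser(eps_only, alphas_cumprod, quantize=False)
    sigmas = denoiser.get_sigmas(steps)                # descending schedule, ends at 0
    x = torch.randn(shape, device=device) * sigmas[0]  # k-diffusion expects x_T scaled by sigma_max
    return K.sampling.sample_dpmpp_2m(denoiser, x, sigmas, extra_args=model_kwargs or {})
```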
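Finally, for the last bullet, the experimental-but-real path follows the DPM-Solver README recipe: `algorithm_type="dpmsolver++"` with `order=2` and `method="multistep"` is what that repo calls DPM-Solver++(2M). Again a sketch (constructor arguments vary a bit between DPM-Solver versions), reusing the hypothetical `wrap_for_solver` from the earlier snippet:

```python
import torch
from dpm_solver_pytorch import DPM_Solver


def run_dpmpp_2m(model_fn, noise_schedule, shape, steps=30, device="cuda"):
    # model_fn / noise_schedule as returned by the hypothetical wrap_for_solver(...) above
    x_T = torch.randn(shape, device=device)  # start from pure Gaussian noise
    solver = DPM_Solver(model_fn, noise_schedule, algorithm_type="dpmsolver++")
    return solver.sample(
        x_T,
        steps=steps,              # per the notes above, steps <= 20 currently sounds worse
        order=2,
        skip_type="time_uniform",
        method="multistep",       # order-2 multistep DPM-Solver++ == DPM++2M
    )
```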

tl;dr: past samplers were fake; DPM++2M is now experimental but real; DDIM + cond_free is preferable for steps < 20 until better samplers exist.

Consequently, I'm making DDIM the default sampler for ultra_fast for now, and have created a new preset (very_fast) that uses DPM++2M with more steps.

All claims stated here only apply to fp32 inference; I have no idea what the results are like on --half yet.