vvictoryuki / FreeDoM

[ICCV 2023] Official PyTorch implementation for the paper "FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model"

some question about coefficient "rho"

FromA2Z opened this issue

Hello, I am interested in the code you posted, thank you for sharing. What puzzles me is that the paper does not discuss the scale factor in much detail.
In SD Style, rho appears to be a learning rate that depends on both "grad" and the classifier-free guidance effect, as shown below:
[image: code screenshot]
However, in Face ID, rho is equal to at.sqrt(), as follows:
[image: code screenshot]
So how exactly should rho be set, and is there some mathematical theory to support the choice? Thank you!

@FromA2Z Thank you for your interest in our work! rho, which acts as a learning rate, is an important parameter that controls the step size of the guidance. A poorly chosen setting can produce undesirable generation results. The two strategies you identified are among the better ones we have found:

  • Dynamically adjusting rho based on the length (norm) of the guidance or the score.
  • In the face experiments, setting rho to at.sqrt() is a promising choice. However, it requires stopping the guidance after a certain number of steps, which is what the stop parameter does.

A detailed description of parameter settings is provided in the camera-ready version. We suggest you experiment with different parameter settings using the code we've provided. This should lead to numerous new discoveries. Thank you~
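
For readers following the thread, the two strategies described in the reply can be sketched roughly as below. This is only an illustrative sketch, not the repository's code: it uses plain Python lists in place of tensors, and the function names, the `ratio` and `rho_scale` factors, and the `steps_done` / `stop_after` convention are all assumptions made for the example. Check the actual FreeDoM scripts for the exact formulas and defaults.

```python
import math


def rms(values):
    """Root-mean-square of a flat sequence of floats (stand-in for a tensor norm)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def rho_from_magnitudes(correction, grad, guidance_scale, ratio=0.2, eps=1e-12):
    """Strategy 1 (assumed form): pick rho so that the energy-guidance step
    has a size comparable to the classifier-free guidance correction.

    correction     -- difference between conditional and unconditional noise
    grad           -- gradient of the guidance energy
    guidance_scale -- classifier-free guidance scale
    ratio          -- hypothetical extra factor, chosen here for illustration
    """
    return rms(correction) * guidance_scale / (rms(grad) + eps) * ratio


def rho_from_alpha(alpha_bar_t, steps_done, stop_after, rho_scale=1.0):
    """Strategy 2 (assumed form): rho = sqrt(alpha_bar_t) * rho_scale,
    switched off once `stop_after` sampling steps have been taken.
    """
    if steps_done >= stop_after:
        return 0.0  # guidance is stopped; only the plain sampler runs
    return math.sqrt(alpha_bar_t) * rho_scale
```

With the first strategy, rho shrinks automatically when the energy gradient is large; with the second, rho follows the noise schedule and needs the explicit cutoff mentioned in the reply.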