Step size in Langevin Dynamics
XavierXiao opened this issue · comments
Hi, in your code, when you do the langevin dynamics, you run
x_s_t.data += - f_prime + config['epsilon'] * t.randn_like(x_s_t)
However, does this mean that the step size for the gradient f_prime is effectively 1? Should we run
x_s_t.data += - 0.5*config['epsilon']**2*f_prime + config['epsilon'] * t.randn_like(x_s_t)
instead?
Thanks for the inquiry! Our Langevin implementation intentionally drops the standard Langevin coefficient for f_prime. This means that the actual learned energy is
U(x) = f(x) / (0.5*config['epsilon']**2)
which is a tempered relative of f(x). This is done for technical reasons related to weight initialization, which are discussed in Appendix Part A in the repository.
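To see the equivalence concretely, here is a minimal sketch (not the repo's code; a toy quadratic stands in for the learned energy f, and epsilon plays the role of config['epsilon']) showing that the unit-step update on f_prime is exactly a standard Langevin step on the tempered energy U(x) = f(x) / (0.5*epsilon**2):

```python
import torch as t

t.manual_seed(0)
epsilon = 0.01

def f(x):
    # toy quadratic energy standing in for the learned network
    return 0.5 * (x ** 2).sum()

x = t.randn(4, requires_grad=True)

# gradient of f at x
f_prime, = t.autograd.grad(f(x), x)

noise = t.randn_like(x)

# update as in the repo: unit step on grad f
step_repo = -f_prime + epsilon * noise

# standard Langevin step on the tempered energy U(x) = f(x) / (0.5*epsilon**2)
U_prime = f_prime / (0.5 * epsilon ** 2)
step_langevin = -0.5 * epsilon ** 2 * U_prime + epsilon * noise

assert t.allclose(step_repo, step_langevin)
```

Since the two step expressions cancel to the same quantity, sampling with the unit-step rule is equivalent to running standard Langevin dynamics on U rather than on f.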