point0bar1 / ebm-anatomy

PyTorch implementation of Algorithm 1 of "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models"

Step size in Langevin Dynamics

XavierXiao opened this issue · comments

Hi, in your code, when you do the Langevin dynamics, you run

`x_s_t.data += - f_prime + config['epsilon'] * t.randn_like(x_s_t)`

However, doesn't this mean that the step size for the gradient `f_prime` is 1? Should we instead run

`x_s_t.data += - 0.5*config['epsilon']**2*f_prime + config['epsilon'] * t.randn_like(x_s_t)`?

Thanks for the inquiry! Our Langevin implementation intentionally drops the standard Langevin coefficient `0.5*config['epsilon']**2` on `f_prime`. This means that the actual learned energy is

`U(x) = f(x) / (0.5*config['epsilon']**2)`

which is a tempered relative of `f(x)`. This is done for technical reasons related to weight initialization, which are discussed in Appendix Part A in the repository.
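To make the equivalence concrete: since `U'(x) = f'(x) / (0.5*epsilon**2)`, the repo's unit-step update on `f_prime` is numerically identical to a standard Langevin step on the tempered energy `U`. Here is a minimal numeric sketch of that identity, using a hypothetical toy energy `f(x) = 0.5*x**2` (not from the repository) so the gradient is analytic:

```python
import random

# Assumption (not from the repo): toy energy f(x) = 0.5 * x**2, so f'(x) = x.
epsilon = 0.01
x = 1.0
noise = random.gauss(0.0, 1.0)

f_prime = x                             # gradient of the toy f at x
U_prime = f_prime / (0.5 * epsilon**2)  # gradient of the tempered energy U

# Update as written in the repo: unit step size on f_prime.
x_repo = x - f_prime + epsilon * noise

# Standard Langevin update on U with step size 0.5 * epsilon**2.
x_langevin = x - 0.5 * epsilon**2 * U_prime + epsilon * noise

# The two updates coincide up to floating-point rounding.
assert abs(x_repo - x_langevin) < 1e-12
```

The coefficient dropped from the gradient term is absorbed into the definition of the energy being sampled, so sampling is still valid Langevin dynamics, just on `U` rather than `f`.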