tpapp / DynamicHMC.jl

Implementation of robust dynamic Hamiltonian Monte Carlo methods (NUTS) in Julia.

Gaussian kinetic energy centered at nonzero point

jzhan039 opened this issue

Hi Tamas,

If I wanted to construct a Gaussian kinetic energy centered at a nonzero point, how could I do it in this library? For example, suppose the distribution has two parameters, (x, y), and I know that my problem has an approximate kinetic energy (|p| - p0)^2 / 2M, where |p| = (px^2 + py^2)^(1/2) and p0 and M are constants. I think using a Gaussian kinetic energy centered at p0 would be more efficient, right? Note that it is the radial length of the momentum (px, py) that follows a Gaussian distribution centered at p0, not each component independently. I'm not sure whether there is anything else I need to take care of besides redefining the kinetic energy.

Thanks.

Jin
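
To make the proposal concrete, here is a minimal Python sketch of the kinetic energy described in the question and of its gradient with respect to the momentum (the velocity that would enter a position update). It is not DynamicHMC code: the function names `ring_kinetic_energy` and `ring_velocity` and the sample values of `p0` and `M` are illustrative assumptions.

```python
import math

def ring_kinetic_energy(px, py, p0, M):
    """K(p) = (|p| - p0)^2 / (2 M), with |p| = sqrt(px^2 + py^2)."""
    r = math.hypot(px, py)
    return (r - p0) ** 2 / (2.0 * M)

def ring_velocity(px, py, p0, M):
    """Gradient dK/dp = ((|p| - p0) / M) * p / |p|.

    Undefined at p = 0, where the radial direction p / |p| does not exist.
    """
    r = math.hypot(px, py)
    if r == 0.0:
        raise ValueError("gradient is undefined at p = 0")
    scale = (r - p0) / (M * r)
    return scale * px, scale * py

# Illustrative values (assumptions, not taken from the issue).
p0, M = 2.0, 1.5
print(ring_kinetic_energy(3.0, 4.0, p0, M))  # |p| = 5, so (5 - 2)^2 / 3 = 3.0
print(ring_velocity(3.0, 4.0, p0, M))
```

The energy is minimized on the circle |p| = p0, so the velocity vanishes there and points inward or outward depending on which side of the circle the momentum lies.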

All the NUTS/HMC implementations I am familiar with use a symmetric KE specification, mostly Gaussian (I am aware of some experiments with fat-tailed KEs, but they didn't improve much in practice). I am not aware of any theoretical reasons for using non-symmetric KEs, but maybe I missed something.

Currently the symmetry of the KE is very deeply ingrained in the implementation of DynamicHMC; see, e.g., the docs of KineticEnergy. If you really want to experiment with this, you would need to define another type and reimplement pretty much all of src/hamiltonian.jl, and then add a reversibility correction somewhere (this I would have to look into, but I am willing to extend the interface if you really want to do this).

Unless you are an expert in differential geometry and you are convinced that this would help you, I would really advise against this: it is quite a bit of work with dubious benefits.

It's similar to Hamiltonian dynamics in a magnetic field. Hamilton's equations describe the time evolution of the coordinates and the canonical momentum, but the kinetic momentum is shifted relative to the canonical momentum. I just found a reference, https://arxiv.org/abs/1607.02738v2, where the authors introduce magnetic HMC and give examples showing improvements from the method.

I agree that it is very interesting. Please take a look at the existing code and let me know if you want to experiment with this. I am very happy to help by making generalizations to the API to allow this; as I said above I would need to do momentum flipping for the Metropolis steps, but there may be something else.
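
The momentum flip mentioned above is what makes a leapfrog trajectory its own inverse in standard HMC: integrate forward, negate the momentum, integrate again, negate once more, and you are back at the start. The sketch below checks this in Python for a one-dimensional standard normal target with Gaussian kinetic energy p^2 / (2M). The target, step size, starting point, and function names are illustrative assumptions and are not part of DynamicHMC.

```python
def grad_potential(q):
    """Gradient of U(q) = q^2 / 2 (standard normal target, an assumption)."""
    return q

def leapfrog(q, p, step, n_steps, M=1.0):
    """Leapfrog integration with Gaussian kinetic energy K(p) = p^2 / (2 M)."""
    for _ in range(n_steps):
        p -= 0.5 * step * grad_potential(q)  # half step in momentum
        q += step * p / M                    # full step in position
        p -= 0.5 * step * grad_potential(q)  # half step in momentum
    return q, p

q_start, p_start = 0.3, -1.1
q_end, p_end = leapfrog(q_start, p_start, step=0.1, n_steps=20)

# Flip the momentum, integrate the same number of steps, then flip again.
q_back, p_back = leapfrog(q_end, -p_end, step=0.1, n_steps=20)
p_back = -p_back

# Both differences should be at rounding-error level.
print(abs(q_back - q_start), abs(p_back - p_start))
```

The round trip works here because the position update depends on p through an odd function (p / M). For a kinetic energy that is not symmetric in p, this need not hold, which is why a reversibility correction would have to be worked out.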

I realized that it could be mathematically equivalent to a special case of Riemann HMC, or at least comparably efficient. Maybe it's not worth the time. I'll think about it more deeply when I have time.

OK, let's leave this issue open in the meantime.

Incidentally, the tests already contain mixtures like the one mentioned in the paper, they just fail with NUTS if the "valley" is too low (as expected). This can happen a lot in real-life models, so I am interested in fixing it. I am holding off on RHMC since second derivatives are not really practical with AD at the moment.

I am closing this because of lack of activity; feel free to ping here if you are still interested and I will reopen.