Learn non-linear reparametrization during warm-up

nsiccha opened this issue 9 months ago

It's possible to easily and efficiently learn global non-linear reparametrizations during MCMC warm-up, at a cost comparable to "a few" gradient evaluations of the log prior/Jacobian adjustment.

The reparametrizations would be global in the sense that they are different from RHMC and much more similar to the class of reparametrizations used e.g. here: https://arxiv.org/abs/1906.03028.

The current DynamicHMC API makes it a bit complicated to implement, but the DynamicHMC+LogDensityProblems implementation/API could easily be extended to allow users to implement "custom" families of automatic reparametrizations by implementing just a few simple functions.

A working prototype implementation/extension (with the posterior specific functions missing) can be found here.

The functions which need to be implemented per posterior are (mainly) transformations and log Jacobian adjustments, and/or some function which uses gradients to minimize a loss function defined by those two. I'd imagine these functions can in principle be provided automatically by Bijectors.jl or similar packages, but providing them should be the responsibility of the user and/or a different package than DynamicHMC.
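For concreteness, the log Jacobian adjustment mentioned above is the usual change-of-variables term. In notation that is not from the original post: if $\theta = T_\phi(\xi)$ is a bijection with tunable parameters $\phi$, and $\pi$ is the posterior density in $\theta$, then the sampler would target

$$\log \tilde\pi_\phi(\xi) = \log \pi\bigl(T_\phi(\xi)\bigr) + \log\left|\det \frac{\partial T_\phi(\xi)}{\partial \xi}\right|.$$

Under this reading, the per-posterior functions are $T_\phi$ and the log-determinant term, and warm-up is what picks $\phi$.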

Tamas K. Papp · Answer 1 · Tue Aug 22 2023 15:11:34 GMT+0800 (China Standard Time)

Thanks for opening the issue. So, if I understand correctly, the proposal is to introduce an API for reparametrization, which users could implement, and a warmup stage that does this?

nsiccha · Answer 2 · Tue Aug 22 2023 15:28:17 GMT+0800 (China Standard Time)

Yes, more or less exactly.

> a warmup stage that does this

In the linked implementation, I'm kinda hijacking (out of laziness) the mass matrix adaptation stage to also do the nonlinear reparametrization, but of course in principle you could also just do the mass matrix adaptation, as is being done currently. Whenever you do the nonlinear reparametrization, however, you must also redo the mass matrix adaptation, but you'd have the transformed samples anyways.

> introduce an API for reparametrization

Yeah, that's what it boils down to. The most general thing to do in the context of windowed MCMC warm-up would be to allow the user to implement (dispatches for)

a find_reparametrization function (as called here) which takes the LogDensityProblem and a matrix of draws and returns a reparametrized LogDensityProblem and
a reparametrize function (as called e.g. here) which takes a source parametrization, a target parametrization and a matrix of draws from the source parametrization and returns the same draws but in the new target parametrization.
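To make the second function concrete, here is a minimal sketch in plain Julia, not the linked prototype. Everything except the name reparametrize is an assumption made for illustration: the one-parameter family PartialNoncentering (a funnel-like posterior over (log_tau, x), sampled as (log_tau, z) with x = z * exp(c * log_tau)), the wrapper ReparametrizedProblem, and the local logdensity function, which only stands in for a real LogDensityProblems implementation. Draws are assumed to be stored one per column.

```julia
# Hypothetical one-parameter family: the sampler works on (log_tau, z),
# where the original coordinate is x = z * exp(c * log_tau).
struct PartialNoncentering
    c::Float64  # 0 = centered (z == x), 1 = fully non-centered
end

# Hypothetical wrapper: a log density in the original coordinates paired with
# the parametrization the sampler sees.
struct ReparametrizedProblem{F}
    logdensity_centered::F
    parametrization::PartialNoncentering
end

# Log density in sampler coordinates: original log density plus the
# log Jacobian of (log_tau, z) -> (log_tau, x), which is c * log_tau.
function logdensity(p::ReparametrizedProblem, q::AbstractVector)
    log_tau, z = q
    c = p.parametrization.c
    x = z * exp(c * log_tau)
    return p.logdensity_centered([log_tau, x]) + c * log_tau
end

# Map draws (one draw per column) from the `source` parametrization to the
# `target` parametrization. Row 1 (log_tau) is shared by both.
function reparametrize(source::PartialNoncentering, target::PartialNoncentering,
                       draws::AbstractMatrix)
    out = copy(draws)
    out[2, :] .= draws[2, :] .* exp.((source.c - target.c) .* draws[1, :])
    return out
end
```

In this sketch, reparametrize(PartialNoncentering(0.0), PartialNoncentering(1.0), draws) turns centered draws into non-centered ones, and swapping the first two arguments maps them back. A find_reparametrization would return something like ReparametrizedProblem with a newly chosen c.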

The find_reparametrization function could in principle be anything, but in practice it seems to work quite well to combine one of Julia's AD libraries with a gradient-based optimizer that minimizes a loss function built from reparametrize and one additional cheap, easy-to-implement function.
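A rough sketch of what such a find_reparametrization could look like, under several assumptions that are not from the linked prototype. The family PartialNoncentering is hypothetical. The loss is one plausible choice (the average negative log density of a zero-mean Gaussian fitted to the transformed coordinate, including the log Jacobian term). A central finite difference stands in for an AD library so that the block has no package dependencies. For brevity the function takes the current parametrization rather than a full LogDensityProblem.

```julia
using Statistics  # stdlib, for `mean`

# Hypothetical family: sampler coordinates (log_tau, z), x = z * exp(c * log_tau).
struct PartialNoncentering
    c::Float64
end

# Map draws (one per column) from `source` to `target` coordinates.
function reparametrize(source::PartialNoncentering, target::PartialNoncentering,
                       draws::AbstractMatrix)
    out = copy(draws)
    out[2, :] .= draws[2, :] .* exp.((source.c - target.c) .* draws[1, :])
    return out
end

# Assumed loss: negative mean log density (up to a constant) of the model
# z ~ Normal(0, s) with s set to its maximum-likelihood value; the
# c * mean(log_tau) term is the mean log Jacobian.
function loss(c, source::PartialNoncentering, draws::AbstractMatrix)
    z = reparametrize(source, PartialNoncentering(c), draws)[2, :]
    return 0.5 * log(mean(abs2, z)) + c * mean(draws[1, :])
end

# Plain gradient descent on c; the finite difference is a stand-in for AD.
function find_reparametrization(source::PartialNoncentering, draws::AbstractMatrix;
                                stepsize = 0.5, iterations = 200, h = 1e-6)
    c = source.c
    for _ in 1:iterations
        grad = (loss(c + h, source, draws) - loss(c - h, source, draws)) / (2h)
        c -= stepsize * grad
    end
    return PartialNoncentering(c)
end
```

For draws from an exact funnel, where x is exp(log_tau) times independent noise, this loss is minimized by the fully non-centered choice c = 1. With an informative likelihood the minimizer would generally land somewhere in between. The fixed step size and iteration count are tuned to this toy case only.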