doobwa / MCMC.jl

MCMC tools for Julia

Proposed extensions

nfoti opened this issue

Hi,

First, I'm really glad someone has started a project like this for Julia as the language has a lot of potential for implementing MCMC algorithms. I'd like to propose some small extensions to the package.

First, the mh_sampler function is really a Metropolis sampler and could be renamed accordingly. A true Metropolis-Hastings sampling function could then be implemented that lets the user supply a proposal distribution (which may not be symmetric).
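For concreteness, here is a minimal sketch (in current Julia syntax; all names here are hypothetical, not the package's API) of what such an interface could look like:

```julia
# One Metropolis-Hastings step with a possibly asymmetric proposal.
# logtarget(x): log-density of the target (up to an additive constant)
# propose(x):   draws a candidate given the current state
# logq(y, x):   log proposal density of y given current state x
function mh_step(x, logtarget::Function, propose::Function, logq::Function)
    xp = propose(x)
    # The Hastings correction logq(x, xp) - logq(xp, x) vanishes for
    # symmetric proposals, recovering the plain Metropolis ratio.
    logratio = logtarget(xp) - logtarget(x) + logq(x, xp) - logq(xp, x)
    log(rand()) < logratio ? xp : x
end

# Example: an asymmetric log-normal random walk for a positive-valued target.
propose_ln(x) = x * exp(0.25 * randn())
logq_ln(y, x) = -log(y) - (log(y) - log(x))^2 / (2 * 0.25^2)  # up to a constant
```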

Second, a multivariate Metropolis sampling function that allows the user to specify a covariance matrix for the proposals could be useful.
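A sketch of the proposal mechanism (again with hypothetical names): given a user-supplied covariance Sigma, the usual trick is to draw iid normals and color them with a Cholesky factor.

```julia
using LinearAlgebra

# If z ~ N(0, I) and Sigma = L*L', then x + L*z ~ N(x, Sigma).
function propose_mvnormal(x::Vector{Float64}, Sigma::Matrix{Float64})
    L = cholesky(Symmetric(Sigma)).L
    x .+ L * randn(length(x))
end
```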

Third, a memoized differentiable density may be useful: when the density and gradient functions share a lot of computation, evaluating both simultaneously and caching the result can be much more efficient.
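One possible shape for this (a sketch with made-up names): evaluate the log-density and its gradient in a single pass and cache the result for the last input, so a sampler that asks for both at the same point pays for the shared work once.

```julia
# Wraps f: x -> (logp, grad), where f computes both in one pass.
mutable struct MemoizedDensity
    f::Function
    last_x::Union{Vector{Float64}, Nothing}   # input the cache is valid for
    last_val::Tuple{Float64, Vector{Float64}}
end
MemoizedDensity(f::Function) = MemoizedDensity(f, nothing, (NaN, Float64[]))

function evaluate!(d::MemoizedDensity, x::Vector{Float64})
    if d.last_x === nothing || d.last_x != x
        d.last_val = d.f(x)      # shared computation happens once...
        d.last_x = copy(x)
    end
    d.last_val                   # ...and is reused on repeated queries
end

logdensity(d::MemoizedDensity, x) = evaluate!(d, x)[1]
gradient(d::MemoizedDensity, x)   = evaluate!(d, x)[2]
```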

Last, incorporating some adaptive MCMC techniques would be interesting, though they could also be a package of their own.

Let me know if people are interested. I'm willing to implement some of these changes (although I'm not sure when).

Also, the current version of mcmc.jl did not work with the latest build of Julia (as of 2012-10-28), so I fixed up the files. I can open a pull request if the changes are useful.

Thanks.

I agree with all of these points! I would say the memoization is a lower priority, but that may just be me. I would be very willing to accept any pull requests you may have. Also, what adaptive methods in particular do you have in mind?

As long as we are discussing future directions, I'm also looking forward to making things faster, either by (1) waiting for closures to be handled more efficiently by the compiler, or (2) using macros to generate specialized sampler functions. See this discussion for more details.
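To illustrate the macro idea (purely a sketch with made-up names, not a committed design): the user passes the log-density as an expression in x, and the macro splices it directly into a generated sampler, so the inner loop contains no higher-order function call.

```julia
# Generate a specialized univariate random-walk Metropolis sampler for a
# given log-density expression. Escaping the whole definition lets `x` in
# the user's expression bind to the sampler's loop variable.
macro make_metropolis(name, logdens)
    esc(quote
        function $name(x0::Float64, niter::Int, sigma::Float64)
            x = x0
            logp = $logdens
            out = Vector{Float64}(undef, niter)
            for i in 1:niter
                xold, logpold = x, logp
                x += sigma * randn()            # random-walk proposal
                logp = $logdens
                if log(rand()) >= logp - logpold
                    x, logp = xold, logpold     # reject: restore state
                end
                out[i] = x
            end
            out
        end
    end)
end

# Usage: generate a sampler whose target is a standard normal.
@make_metropolis sample_stdnorm (-0.5 * x^2)
chain = sample_stdnorm(0.0, 10_000, 1.0)
```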

Thanks for the feedback.

I agree memoization is a low priority, I just thought I would record it for the future.

For adaptive MCMC I was thinking of very simple things, like using the empirical variance of the chain as the proposal variance (with diminishing influence, to ensure convergence).
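A sketch of that update (hypothetical helper, not existing code): maintain a running mean and variance of the chain with step size 1/t, so the adaptation's influence vanishes as t grows.

```julia
# Online mean/variance update with diminishing step size eta = 1/t.
function adapt_moments(m, v, x, t)
    eta = 1.0 / t
    m_new = m + eta * (x - m)
    v_new = v + eta * ((x - m_new) * (x - m) - v)
    m_new, v_new
end

# The univariate proposal scale at step t could then be, e.g.,
# sigma_t = 2.38 * sqrt(v_t).
```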

I was actually looking at the discussion you linked to recently. Personally, I think that as long as it's possible for closures to become fast in Julia eventually, it's probably worth keeping the design and usage as they are. In fact, even if closures remain slow, the current usage matches the Julia philosophy, which is nice for users: the mcmc.jl functions should be intuitive to use.

However, if the Julia developers say that making closures fast is impossible for some fundamental reason, then maybe the macro trick is the right thing to do. One question about the macro trick, though: won't it still incur a performance penalty, because you have to construct the function on the fly? Maybe I'm just missing something obvious. How were you thinking of using it?

In practice, people will use a closure and the slice sampling function as a convenience, and if they find it's a bottleneck for their sampler they'll probably just write the slice sampling code inline and write a specialized function to compute the density (I have done this for Gibbs samplers in Matlab and Python that have a slice sampling step). The same is true of the MH samplers.

This discussion does some performance evaluation for a case where closures could arise in an MCMC sampler; the results showed a roughly 10-100x performance hit. Perhaps it would be good to code up some "more realistic" samplers, maybe from Monte Carlo Statistical Methods (or the introductory R version) by Robert and Casella, and see how the current interface (that is, using a closure to create a univariate function to pass to the mcmc.jl functions) fares on them. If the closure system is too slow for moderate examples like those, then maybe the macro trick is the way to go for the time being.

Lastly, mcmc.jl may want to follow Matlab and let the user specify the number of samples desired, the number of burn-in iterations, and the thinning interval as optional parameters when sampling from a univariate distribution.
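A sketch of that signature (hypothetical; run_chain and step are made-up names): step advances the chain one iteration, burn-in draws are discarded, and only every thin-th draw is kept afterwards.

```julia
function run_chain(step::Function, x0; nsamples::Int = 1000,
                   burnin::Int = 0, thin::Int = 1)
    x = x0
    for _ in 1:burnin
        x = step(x)             # warm-up, discarded
    end
    out = Vector{typeof(x0)}(undef, nsamples)
    for i in 1:nsamples
        for _ in 1:thin
            x = step(x)         # keep only every thin-th draw
        end
        out[i] = x
    end
    out
end
```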

I'll make sure my code is clean and submit a pull request later today or tonight with the minor fixes that get the package working on the current Julia build.

Thanks.

I have yet to check out the performance hit myself, and I agree the best way forward is to test it out on some simple univariate examples. I likely won't be able to get to this for the next 2 weeks or so.

I also agree it wouldn't hurt to add a few features to the mcmc function. I threw that in mostly as an example of how one might use these other samplers in practice.

I also plan on cleaning up the examples and adding tests. I'd be very interested in any illustrative examples you may have.

On the wishlist: an easier interface for creating diagnostic plots. So far I have been waiting for Julia's graphics stack to settle down a bit.

I also probably won't be able to write any examples for a few weeks. But once I have some time I am willing to code up a few examples.

Finally have some time to start thinking about this again. I added a random-walk Metropolis sampler that should work for distributions of any dimension and can use an arbitrary covariance matrix for the proposal. I haven't tested it yet and there is no error checking, but it's a start. The code is in the "metrop" branch of my fork.

One potential problem is that I have to generate the random numbers as randn(1, d), where d is the dimension, which means the type of x could change; that part of the code may need to be fixed.

In the near future I'll review the posts in this issue and try to open separate issues for the major features we've talked about, to keep them organized. With the simple debugger that just came out, and plotting seeming to work on Linux (I bit the bullet and installed Julia in a VM for this), I think development will be much less frustrating.
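Regarding the randn(1, d) issue: one possible fix (just a sketch, in current Julia syntax; the propose name and the Cholesky-factor argument L are illustrative, not actual package code) is to draw a Vector with randn(d) rather than a 1×d Matrix with randn(1, d), so the proposal keeps the same shape and type as x.

```julia
# randn(1, d) yields a 1×d Matrix, so x + randn(1, d) can silently turn a
# Vector into a Matrix. Drawing a Vector avoids the type change.
propose(x::Vector{Float64}, L) = x .+ L * randn(length(x))  # L: proposal Cholesky factor
```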

Sounds good. I also have been taking another look at this lately. I agree that getting the metropolis sampler fixed up is important.

Instead of randn I think I might prefer to use the new interface in Distributions (unless it turns out to be drastically slower): rand(MultivariateNormal(mu, sigma)) (or something close to that). I might also prefer the name metropolis rather than metrop.

Sure, calling it metropolis is fine (I just based the name on the analogous R function), and we can try the Distributions module.

I updated the function name to metropolis_sampler and used the Distributions module to generate the proposals. However, looking at the code for the MultivariateNormal type (around line 790 in Distributions.jl), there will be extra overhead because a MultivariateNormal object needs to be created. Additionally, a covariance must be specified when creating the object, so the covariance has to be passed to the function. Alternatively, we could pass the Cholesky decomposition of the covariance to the metropolis function, but then we would have to reconstruct the covariance from it to create the MultivariateNormal. It still seems better to require the Cholesky decomposition to be passed, so that people can do whatever they want with the covariance: leave it constant, adapt it, etc. To do this efficiently, though, we should use randn directly, so that we never need to form the covariance inside the functions. If you have other ideas, I'm open to other solutions.
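To make the interface concrete, here is a sketch of what passing the Cholesky factor could look like (hypothetical code, not the actual mcmc.jl implementation): the caller factors the covariance once, and each proposal then costs one randn(d) plus a triangular matrix-vector product, with the covariance never formed inside the sampler.

```julia
using LinearAlgebra

function metropolis_sampler(logtarget::Function, x0::Vector{Float64},
                            L::LowerTriangular, niter::Int)
    x, logp = copy(x0), logtarget(x0)
    chain = Matrix{Float64}(undef, niter, length(x0))
    for i in 1:niter
        xp = x .+ L * randn(length(x))     # N(x, L*L') random-walk proposal
        logpp = logtarget(xp)
        if log(rand()) < logpp - logp      # symmetric proposal: Metropolis ratio
            x, logp = xp, logpp
        end
        chain[i, :] = x
    end
    chain
end

# The caller keeps full control of the covariance: factor it once, adapt it
# between runs, etc., e.g. L = cholesky(Symmetric(Sigma)).L
```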

A while ago I wrote a Julia implementation of the Goodman & Weare affine-invariant MCMC algorithm (http://msp.org/camcos/2010/5-1/p04.xhtml). It's still quite basic and a work in progress; I'm just mentioning it here in case you find it interesting. It can be found at https://github.com/dronir/MCJulia

@nfoti It seems like you can construct a MultivariateNormal by specifying the covariance and the Cholesky decomposition, and it wouldn't be too hard to add another constructor that takes only the Cholesky decomposition (since that's all that rand(mv::MultivariateNormal) requires). Maybe @johnmyleswhite would be interested in this? I'm not wedded to using MultivariateNormal over randn, though.

@dronir Your project looks very interesting! Thanks for pointing us to it! I hope to try it out soon...

@nfoti If you submit a patch for Distributions that takes only the Cholesky decomposition, I'll be happy to put it into Distributions.jl

@nfoti Would you like to submit a pull request for the commits in your metrop branch, or are you still making changes? (It looks fine to me!)

Sure, probably not until tomorrow as I just arrived at NIPS and am still on east coast time.
