Radford Neal and others had some interesting responses to my question about why Hamiltonian MCMC (HMC) might be better than Langevin MCMC (MALA). The gist seems to be that HMC is less random-walk-like, so it mixes faster and scales better with the number of dimensions.
Radford points to a survey paper of his (link), which discusses how the momentum distribution should be adjusted for changes in the scaling of the probability distribution (p. 22). I didn’t see this the last time I looked at HMC, and it’s necessary for an adaptive HMC algorithm. General-purpose sampling algorithms can benefit a lot from being adaptive.
It also discusses tuning the step-count and step-size. This sounds rather difficult and non-linear.
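To make the tuning knobs concrete, here is a minimal sketch of a single HMC update with a leapfrog integrator. It assumes a diagonal mass matrix and a user-supplied log-density and gradient. The names `leapfrog` and `hmc_step` are hypothetical and are not taken from multichain_mcmc or PyMC. The momentum is drawn with variance `mass`, so setting `mass` to roughly one over the variance of each coordinate is one way to adjust the momentum distribution for the scale of the target.

```python
import numpy as np


def leapfrog(q, p, grad_logp, step_size, n_steps, inv_mass):
    """Integrate Hamiltonian dynamics with the leapfrog scheme.

    q, p      : position and momentum arrays
    grad_logp : function returning the gradient of the log-density at q
    inv_mass  : elementwise inverse of the (diagonal) mass matrix
    """
    q = np.array(q, dtype=float)
    # Initial half step for the momentum.
    p = p + 0.5 * step_size * grad_logp(q)
    for i in range(n_steps):
        # Full step for the position.
        q = q + step_size * inv_mass * p
        # Full momentum step, except after the last position update.
        if i < n_steps - 1:
            p = p + step_size * grad_logp(q)
    # Final half step for the momentum.
    p = p + 0.5 * step_size * grad_logp(q)
    return q, p


def hmc_step(q, logp, grad_logp, step_size, n_steps, mass, rng):
    """One HMC update; returns (new position, accepted flag)."""
    # Momentum ~ N(0, diag(mass)).
    p0 = rng.normal(size=q.shape) * np.sqrt(mass)
    q_new, p_new = leapfrog(q, p0, grad_logp, step_size, n_steps, 1.0 / mass)
    # Hamiltonian = potential energy (-logp) + kinetic energy.
    h_old = -logp(q) + 0.5 * np.sum(p0 ** 2 / mass)
    h_new = -logp(q_new) + 0.5 * np.sum(p_new ** 2 / mass)
    # Metropolis correction for the discretization error.
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new, True
    return q, False
```

The step-size and step-count enter only through `step_size` and `n_steps`, and their product sets the trajectory length. With `n_steps=1` the update reduces to a MALA-style proposal, which is one way to see why longer trajectories are less random-walk-like.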
I am going to try to implement an adaptive HMC algorithm in my multichain_mcmc package, along the lines of what I’ve done for my MALA implementation, though in general adaptation needs to be done carefully (see Atchadé and Rosenthal 2005).
I’m interested in RM-HMC as it promises automatic scale tuning and better efficiency scaling in high dimensions, but it looks like understanding it requires differential geometry, which I haven’t yet worked through. I believe it also requires second derivatives (which provide scale information), and I haven’t yet figured out how to compute those in an efficient and generic manner for PyMC. I suspect that would require a fork and redesign of PyMC.
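As a stopgap for the second-derivative problem, one generic (though not efficient) option is to difference the gradient numerically. The sketch below assumes only that a gradient function is available; `fd_hessian` is a hypothetical helper, not part of PyMC, and this is not the RM-HMC method itself, just a way to get rough curvature (scale) information at a point.

```python
import numpy as np


def fd_hessian(grad, x, eps=1e-5):
    """Approximate the Hessian of a scalar function from its gradient.

    Uses central differences on each coordinate, which costs 2 * n
    gradient evaluations for an n-dimensional point.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.empty((n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = eps
        # Column i holds the derivative of the gradient along coordinate i.
        hess[:, i] = (grad(x + step) - grad(x - step)) / (2.0 * eps)
    # Symmetrize to remove small asymmetries from rounding error.
    return 0.5 * (hess + hess.T)
```

For a log-density, the negative of this matrix near the mode gives an estimate of the precision, which could serve as a fixed mass matrix. The cost grows linearly with the number of dimensions per evaluation, so it is only a sketch of the idea rather than something suited to high-dimensional models.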

5 comments
April 30, 2011 at 7:03 am
Michael Bauer
Martin Burda presented a paper about adaptive Hamiltonian MCMC at this year’s SBIES. It seems to have very favorable mixing properties in the examples and application he and his coauthor consider.
Check it out: http://apps.olin.wustl.edu/conf/SBIES/Files/pdf/2011/21.pdf
Best,
Michael
April 30, 2011 at 8:18 am
jsalvati
Thanks! That looks really interesting!
May 2, 2011 at 2:48 pm
Mark Girolami
It should be highlighted that the method described in
http://apps.olin.wustl.edu/conf/SBIES/Files/pdf/2011/21.pdf
has a fatal flaw in its development and it *does not* describe a correct MCMC sampling scheme. This is due to the authors’ incorrect use of a non-symplectic integration scheme. I’m not sure if what was presented at SBIES is what is described in the above pdf file though.
A correct adaptive HMC sampling scheme, amongst other methods, is fully described in the Journal of the Royal Statistical Society – Series B discussion paper
Riemann manifold Langevin and Hamiltonian Monte Carlo methods (pages 123–214)
Mark Girolami and Ben Calderhead
available at
http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9868.2010.00765.x/pdf
and Matlab scripts are available to replicate all examples given in the paper.
If you have any questions about the RMHMC methods, please get in touch.
May 2, 2011 at 2:58 pm
jsalvatier
Thanks for the correction, Mark! I see you must have a Google search alert.
May 2, 2011 at 4:08 pm
Martin Burda
The pdf posted above contains a previous (now redundant) version of our paper. Mark Girolami helped me clear up a mis-step made there a while ago – I would like to thank him for that. Research is a learning process.
At SBIES I presented an updated version of the sampler that describes a correct MCMC scheme. John and I will circulate a full paper draft in the very near future to make sure everyone agrees. Please e-mail me if you are interested in the update, and I will put you on my mailing list. Mark will be the first recipient and I hope he will confirm the correctness of our approach here.
Martin