Bayesian Simulation Sampling: A pseudo-marginal method by any other name?

In Venice today, after a great week in Sesto/Sexten attending the Varying Constants & Dynamical Dark Energy conference.  I feel like I delivered my talk well, and there were a decent number of questions, so … job done!

I noticed two Bayesian astrostatistics papers on astro ph yesterday that deserve comment.  The first was a Bayesian model selection paper by Verde et al. ( ) using the Savage-Dickey Density Ratio estimator; the implementation seems fine but I find it amusing in light of the measure-theoretic awkwardness of the SDDR (e.g. Marin & Robert 2010) when the authors write: “It may seem surprising that such an apparently complicated ratio of integrals has such a simple expression” … oh boy, you can say that again!

The other was the “Bayesian Simulation Sampling” (BSS) paper by Fardal et al. ( ), with the application being to the modelling of satellite disruption in M31.  The computational N-body simulation model and its comparison to the observed dataset are clearly impressive; but I do feel that the so-called “Bayesian Simulation Sampling” method could have been put more correctly into its statistical context: not least because this allows the consistency and convergence rate results of the past work to be directly applied.

The algorithm used here is basically that of the pseudo-marginal approach for latent variable models (e.g. Beaumont 2003; Andrieu & Roberts 2009); and is related in turn to the stochastic EM algorithm (e.g. Nielsen 2000; and many earlier).  That is, write the likelihood in latent variable form: L(y|theta) = int L(y|z) f(z|theta) dz … where we know L(y|z) exactly, but can only simulate from f(z|theta).   The BSS method is then to do MCMC under a single draw MC approximation of the above likelihood integral.  Two observations immediately spring to mind: (1) with the expectation of this single draw MC integration obviously equal to that of the target integral, the chain will eventually converge to the true posterior; but (2) if the variance of this MC integration is large then the convergence to stationarity can be extremely slow.  [A colleague of mine suggests a first-order improvement would be to take the average L(y|z) under several draws from f(z|theta).]

So … if the variance of the above MC integration step is in fact small for all theta (as it appears to be in this case) then the method is not interestingly different from regular (full likelihood-based) MCMC.  While if it is much smaller, then the method as stated is hopelessly inefficient.  In the case that simulations of z from f(z|theta) are computationally expensive I would imagine that a better means to approximate the posterior would be via a Bayes linear / Gaussian process approximation to the likelihood function (a la Vernon, Goldstein, & Bower 2010).

That’s my two cents, anyway!  Off to eat gelati and get offended by the ugly nonsense that non-European tourists are obliged to wear on European holidays 🙂

UPDATE: Since all good things come in twos or threes there was, of course, another paper using the same type of latent variable simulation method on astro ph now (a couple of days later).  This one by Hainline et al. ( ) does lensing fits using simulations at thetas spaced along an initial coarse grid, followed by further simulations at thetas more finely spaced around the high posterior density region.  The potential for adaptive importance sampling here should be obvious.

This entry was posted in ABC, Astrostatistics, Rants, Statistics. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s