Faster, Pussycat!

No recent posts, partly on account of having spent a lot of time (including outside of office hours!) finishing off some malaria maps we’ve had in production for a while now (P. vivax incidence & P. falciparum infection types).  To be fair, I’ve also spent a lot of time reading Markov chain theory (both discrete & general state space) and I contributed a long post of my thoughts on Bayesian model averaging in astronomy to Xian’s Og.

The only astrostatistics paper I’ve had the time to read lately was this arXiving by Bernstein et al. on “the Bayesian Fourier Domain method” for estimating (for now) shear in weak galaxy lensing.  The methodology described (of which this paper is the extension to the basic scheme outlined in an earlier publication) is an interesting example scale-able Bayesian inference in the sense I feel is most relevant to applied science—namely, constructing fast, analytic approximations to the likelihood to achieve super fast, but doubly-approximate Bayesian estimates (generally without Monte Carlo samplers, or with very basic ones)—but which is quite different to the focus of scale-able Bayesian inference in statistical research at present—namely, constructing faster approximate versions of  Monte Carlo samplers that take the full likelihood function (e.g. subset posterior Bayes; consensus MC; adaptive sub-sampling MCMC).

In Bernstein et al. the approximation proceeds with three simplifying steps: to assume that the noise process is additive & stationary, that the data can be usefully summarised in just the (pixel-deconvolved) zeroth and second image moments and its likelihood represented as such, and that the lensing is weak such that a second-order Taylor expansion suffices.  And, much like in the subject of my previous blog post (“On path integral evidence…” below), the result is an approximate Gaussian likelihood which can be marginalized quickly over the (empirical) model family of ‘true’ galaxy images to produce a posterior mean and variance estimate on the (two-dimensional) shear vector.  Despite the ‘simplistic’ assumptions in this method, which are openly highlighted by the authors in Section 5 & to which I’d suggest adding ‘precise knowledge of the PSF’ (since this in particular was a surprising assumption for me), the authors demonstrate their code runs extremely quickly (10 galaxies per second per core) and does a very accurate job at recovering shears for simulated galaxy images.

One issue in the presentation that I found confusing was that it seems from the introduction that the authors’ competitive advantage over maximum likelihood-based methods for shear estimation using parametric galaxy models came from keeping the full likelihood information (or at least an approximation to it) rather than returning a single point estimate of the shear plus a a point estimate of its uncertainty.  But in fact the method also reduces to the same, but averaged over the family of candidate galaxy images.   One would imagine the next step in using these shear estimates to do lens reconstruction in a Bayesian framework would be to constrain a model for the weak lensing mass distribution—perhaps some kind of Gaussian process—against these ‘posterior’ summaries for the shear of each galaxy.  That is, these ‘posterior’ summaries are really playing the role of likelihood summaries here with the actually shear prior not being the uniform prior suggested by the authors, but rather the a priori distribution of shears implied by the lensing model.  Upon using these shear likelihoods for inference effectively implies the construction of a shear posterior for each galaxy, which (though uninteresting compared to the lensing posterior itself) is ultimately updated according to the shear likelihoods of its neighbours (over a suitable correlation length).

New version of LibBi
In related news, a friend & colleague of mine here at Oxford, Lawrence Murray, has recently released a new version of his high performance code (that is, optimised for high performance hardware; GPUs, clusters, etc.) for Bayesian inference powered by sequential Monte Carlo (SMC) sampling.  The new version of LibBi is apparently Turing complete and comes with a variety of example applications, including one to an Ornstein-Uhlenbeck process which could well be useful for astronomical applications (since this model has been used in, e.g., the study of blazers and transient astronomical signals: Sobolewska et al., Kumar et al.).

This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s