(2) the SPDE approach to large-scale random fields.
Cosmologists have long been familiar with the statistics of random fields, owing in part to the prominent role of the Gaussian and log-Gaussian in models of the initial density perturbations (Bardeen et al. 1986) and large-scale structure (Coles & Jones 1991), respectively—and indeed cosmologists have in the past led the way to development of key techniques for excursion set analysis and efficient simulation from Gaussian fields (e.g. Arnoud Doucet credits Hoffman & Ribak 1991 for introducing one of the most widely used tricks for sampling the posterior at indirectly constrained locations). However, in the past decade the thrust of new ‘applied statistics’ ideas in random field theory have been driven by geostatistical research, where the embedding of Gaussian fields within the Bayesian hierarchical modelling framework has proven an exceptionally powerful technique for mapping continuous spatial signals (cf. Diggle et al. 1998, Banerjee et al. 2004 and examples in epidemiology: Diggle et al. 2002, Kelsall & Wakefield 2002, Gething et al. 2010). And it has been within this context of geostatistical modelling that the SPDE (stochastic partial differential equation) approach to the modelling of large-scale random fields has lately emerged as a revolutionary new trend.
By far the greatest challenge for computational solution of large-scale inference of Gaussian random fields is that of matrix factorisation for their dense covariance structures. The SPDE strategy described by Lindgren et al. (2011) identifies an explicit link between general Gaussian random fields and Gaussian Markov random fields (GMRFs), and makes use of this link to develop mesh-based approximations in the style of the finite element method; the result of which are sparse precision matrix proxies for non-sparse Gaussian random field parent models. The computational advantages of this widely-applicable approach (see also Simpson et al. 2011 for key examples) are immense—and even more so when coupled to the Integrated Nested Laplace Approximation (INLA) strategy for GMRF posterior approximation (Rue et al. 2009) which solves the other key challenge of Gaussian field modelling: efficient exploration of the joint hyper-parameter–latent field space.
With each observation giving rise to its own latent random variable, in large observational studies (or observational studies gridded over large areas/volumes) the dimension of the posterior in Gaussian field models can easily be in the hundreds, thousands, or hundreds of thousands! As conventional MCMC schemes quickly begin to struggle at the lower end of this spectrum the now ‘conventional’ approach is to use Hamiltonian MCMC (Neal 1997, Girolami & Calderhead 2011, Neal 2012; and for an astrophysical example: Jasche & Kitaura 2009). However, for the GMRF-based proxies provided by the SPDE method INLA proposes a super-fast alternative: since the Gaussian prior typically leads to a near-Gaussian posterior for the latent field why not take the Laplace approximation (Tierney & Kadane 1986) here and run numerical quadrature over the remaining hyper-parameters? It turns out that this MCMC-free algorithm is more than sufficiently accurate for a lot of practical inference tasks—and, importantly, it runs a heck of a lot faster (we’re talking hours on a desktop compared to weeks on a cluster in some cases)! Recent successful applications of the SPDE / INLA methodology outside of astronomy include the mapping of global ozone levels (Bolin & Lindgren 2011) and air pollution (Cameletti et al. 2013, Clifford et al. 2013), bovine diarrhoea(!) (Schroedle et al. 2011), forest fires (Natario et al. 2013), and disease genealogies (Palacios & Minin 2012), amongst others. Important to note is that SPDE / INLA can also be used for time series modelling (cf. Martino 2008).
An example SPDE approximation (Lindgren & Rue 2010).
References/Notes. The most ‘obvious’ potential application of the SPDE / INLA method in astronomy would be for mapping large-scale structure (and inferring its correlation function). Where the currently-used methods are either slow and computationally expensive (e.g. Jasche & Wandelt 2012, ) or rely on Normal approximations to the data generating process and a regular-gridded cubic domain for speed ups from FFT-based matrix manipulations (e.g. Jasche & Lavaux 2014), SPDE / INLA could do the same job quickly and cheaply, on possibly irregular grids, and without sacrificing the real (or realistic) latent field-to-data link function (here: Poisson point process). However, there are likely many other potential applications of SPDE / INLA within the astronomical domain: for modelling simultaneous the systematics in time-series of many thousands of targets? for fast mapping of the CMB / pixel in-painting? for faster model emulation or experimental design? Only time can tell (as Richard Hell might say) …