For a while now I’ve been an advocate for greater recognition of the potential uses for geospatial statistical tools in astro-statistics (e.g. at the IAUS306), so I was mostly happy to see this paper by Yu et al. on astro-ph this week presenting an application of kriging to interpolate cosmic velocity fields. Mostly happy, but not entirely happy, because although it’s a step in the right direction, the particular technique of kriging is more-or-less the weakest tool in the geostatistical toolbox … and it’s almost identical to the Wiener filter method already widely used in cosmology.
As a Bayesian my preferred explanation of kriging is that from Diggle & Ribeiro in which they examine the simplest possible geostatistical model—the stationary GP with iid Gaussian observational noise—and identify, via standard operations for conditioning the multivariate normal, the minimum mean square error predictor for the latent field, which corresponds to simple kriging. They then go on to explain the difference between simple and ordinary kriging, note the limitations of the kriging estimator for predicting non-linear targets (functionals of the latent field), and give a few illustrative examples. In a subsequent chapter they explain how to do Bayesian inference—and this is where I think we need to begin for geostatistics to add any value to astro-statistics: with Bayes!
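As a toy sketch of the construction Diggle & Ribeiro describe (not their code; the exponential covariance, length-scale 0.5, and noise variance 0.05 below are illustrative choices of my own), simple kriging is just the posterior mean obtained by conditioning the multivariate normal on the noisy observations, with the kriging variance falling out of the same conditioning:

```python
import numpy as np

def kernel(a, b, sigma2=1.0, ell=0.5):
    # stationary exponential covariance (illustrative hyperparameter values)
    d = np.abs(a[:, None] - b[None, :])
    return sigma2 * np.exp(-d / ell)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 20)                      # observation sites
tau2 = 0.05                                    # iid Gaussian noise variance
f_true = rng.multivariate_normal(
    np.zeros(len(x)), kernel(x, x) + 1e-10 * np.eye(len(x)))
y = f_true + np.sqrt(tau2) * rng.standard_normal(len(x))

K = kernel(x, x) + tau2 * np.eye(len(x))       # marginal covariance of the data
xs = np.linspace(0, 1, 100)                    # prediction sites
ks = kernel(xs, x)                             # cross-covariance

# conditioning the multivariate normal: posterior mean = simple-kriging
# predictor (known zero mean), posterior variance = kriging variance
mu = ks @ np.linalg.solve(K, y)
var = 1.0 - np.einsum('ij,ji->i', ks, np.linalg.solve(K, ks.T))
```

The point of spelling it out this way is that the kriging estimator and its variance are nothing more than the conditional mean and variance of a Gaussian, which is exactly why the method coincides with the Wiener filter.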
Notably, the way Yu et al. structure their prediction space is via a Voronoi tessellation about the observed galaxies in their sample: this happens to correspond neatly with the finite element SPDE solution offered by the powerful INLA (Integrated Nested Laplace Approximation) package which has revolutionised geostatistical modelling in fields like ecology, epidemiology, and geography. With this approach one can perform fast Bayesian posterior inference on highly complex hierarchical models with latent Gaussian fields (e.g. non-Gaussian observational noise, non-stationarity, etc.), thereby faithfully representing both one’s uncertainty in the fit and the regularisation (i.e., the priors on the hyperparameters of the correlation function), hopefully improving out-of-sample performance/generalisation! Interestingly, as near as I can decipher the text, the lack of uncertainty in (traditional reporting of) the kriging prediction is the root problem addressed by the ravings on this unusual website.
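To illustrate in miniature what the fully Bayesian treatment buys you (this is emphatically not INLA, which uses nested Laplace approximations over an SPDE finite element mesh; it’s a brute-force one-dimensional analogue, with a flat grid prior on the correlation length and invented toy data), one can average the kriging predictive over the hyperparameter posterior rather than plugging in a single fitted value:

```python
import numpy as np

def kernel(a, b, ell):
    # unit-variance exponential covariance with correlation length ell
    return np.exp(-np.abs(a[:, None] - b[None, :]) / ell)

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 15)
tau2 = 0.05
y = rng.multivariate_normal(
    np.zeros(15), kernel(x, x, 0.3) + 1e-10 * np.eye(15)) \
    + np.sqrt(tau2) * rng.standard_normal(15)

# grid posterior over the correlation length under a flat prior
ells = np.linspace(0.05, 1.0, 40)
logZ = np.empty_like(ells)
for i, ell in enumerate(ells):
    K = kernel(x, x, ell) + tau2 * np.eye(15)
    _, logdet = np.linalg.slogdet(K)
    logZ[i] = -0.5 * (y @ np.linalg.solve(K, y)
                      + logdet + 15 * np.log(2 * np.pi))
w = np.exp(logZ - logZ.max())
w /= w.sum()

# posterior predictive at one site: a mixture over hyperparameter values
xs = np.array([0.5])
mu_mix, m2_mix = 0.0, 0.0
for ell, wi in zip(ells, w):
    K = kernel(x, x, ell) + tau2 * np.eye(15)
    ks = kernel(xs, x, ell)
    m = float(ks @ np.linalg.solve(K, y))
    v = float(1.0 - ks @ np.linalg.solve(K, ks.T))
    mu_mix += wi * m
    m2_mix += wi * (v + m ** 2)
var_mix = m2_mix - mu_mix ** 2   # includes hyperparameter uncertainty
```

The mixture variance is strictly larger than any single plug-in kriging variance would suggest whenever the data leave the correlation length uncertain, which is precisely the uncertainty that traditional kriging reports throw away.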
A related development is that Torsten Ensslin has recently written a summary of some issues that came out of our disagreement about the nature and mathematics of Information Field Theory. Overall, this clears up many of the ambiguities I felt were left in earlier explanations of the IFT paradigm; as such, I suppose the point on which our opinions diverge remains the question of the best and/or correct direction to proceed in when constructing a model to represent a random field in a statistical inference problem—whether to begin with equations describing a pixelised field (and hopefully in the limit a continuum field) as advocated under IFT, or to begin from an infinite stochastic process prior over a continuum field with decisions about discretisation (if necessary) to be taken only in concession to the data or computational limitations, as in Bayesian geostatistics.
Next on my to-do list: to read up on boosting to see why both IFT-like methods and Bayesian geostatistics are so consistently outperformed by boosting in Kaggle competitions … ?