Two recent arXivals to catch up on here, aided in part by a more-than-usually-tedious bus journey from Witney to Oxford this morning (the culprits being ongoing road works in Witney and excess traffic attracted by filming of the Countryfile TV show at Blenheim). The model emulation paper is by Schmit & Pritchard and concerns estimation of cosmological model parameters given observations of the 21cm signal associated with the Epoch of Reionisation. The challenge is that to compute the likelihood one needs to be able to compute the power spectrum for any given set of model parameters, requiring either a very expensive ‘numerical simulation’ or a somewhat less expensive ‘semi-numerical simulation’. In either case direct MCMC is rather costly, because a new simulation must be run at each proposal. (Statisticians, note that both the numerical and semi-numerical simulations give deterministic outputs, so we’re not in the realm of pseudo-marginal MCMC or ABC.) Naturally then, there is interest in applying model emulation techniques to make efficient inference on this problem.
The specific approach to emulation taken by Schmit & Pritchard is to emulate at the level of the latent (deterministic) variable: namely, building an artificial neural network (ANN) model for the power spectrum given a training set of power spectra evaluated on a Latin hypercube of parameter values. This is the same direction as explored by Kern et al. (2017) [see my discussion here], but stands in contrast to the version of model emulation most common in statistics (see, e.g., the review by Levy & Steinberg 2011), in which the emulation is of the objective surface (e.g. the log likelihood function) itself. Not surprisingly (since ANNs and GPs can theoretically represent a similar class of functions to a similar level of accuracy), Schmit & Pritchard find that, like Kern et al., they too are able to produce an accurate emulator for the ‘semi-numerical’ model and can use it to form an approximate posterior that matches the full MCMC posterior reasonably well.
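As a rough illustration of this emulate-the-latent-variable workflow, here’s a minimal sketch in which a toy analytic function stands in for the ‘semi-numerical’ simulator and a small scikit-learn MLP stands in for their ANN (the parameter ranges, spectrum binning, and network size are all illustrative assumptions, not details from the paper):

```python
import numpy as np
from scipy.stats import qmc
from sklearn.neural_network import MLPRegressor

# Toy stand-in for the 'semi-numerical' simulator: maps three 'cosmological'
# parameters to a 10-bin 'power spectrum' (purely illustrative).
def toy_power_spectrum(theta):
    k = np.linspace(0.1, 1.0, 10)
    return theta[0] * np.exp(-k / theta[1]) + theta[2] * k

# Latin hypercube design over the (hypothetical) parameter cube.
sampler = qmc.LatinHypercube(d=3, seed=0)
lo, hi = np.array([0.5, 0.2, 0.0]), np.array([2.0, 1.0, 0.5])
thetas = qmc.scale(sampler.random(n=200), lo, hi)
spectra = np.array([toy_power_spectrum(t) for t in thetas])

# Emulate the latent power spectrum with a small ANN (multi-output regression).
emulator = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                        random_state=0)
emulator.fit(thetas, spectra)

# Query the emulator at a held-out parameter point, bypassing the simulator.
test_theta = np.array([[1.0, 0.5, 0.2]])
pred = emulator.predict(test_theta)[0]
truth = toy_power_spectrum(test_theta[0])
print(np.max(np.abs(pred - truth)))  # emulation error at the held-out point
```

Once trained, the cheap `emulator.predict` call replaces the simulator inside the MCMC likelihood evaluation, which is where the speed-up comes from.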
I think this is an interesting direction of research but would be surprised if it can outperform the more conventional type of model emulation (which goes under the term “Bayesian optimisation” in machine learning studies) for posterior sampling problems (use case ii, and effectively iii, of Schmit & Pritchard), owing to the latter’s advantage of maintaining a continually updated estimate of the posterior along with its uncertainties, thereby allowing for efficient design-based addition of new training points one-by-one against a utility function capturing chosen aspects of posterior fidelity. Moreover, the ability of multi-task (or multi-fidelity) Bayesian optimisation to learn across similar objectives would fit naturally into a scheme in which both ‘semi-numerical’ and ‘numerical’ simulations are used judiciously. Where I do agree that these emulations in the latent variable space may have a competitive edge is in their use case i: experimental design for 21cm studies (i.e. determining optimal survey layouts and depths to facilitate inference).
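To make the sequential-design advantage concrete, here’s a minimal Bayesian-optimisation-style sketch on a toy one-dimensional log-likelihood: a GP emulates the objective surface directly, and new ‘simulation’ points are added one-by-one where a simple UCB-style utility is maximised. The toy likelihood, kernel choices, and acquisition rule are all my own illustrative assumptions, not anything from either paper:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Toy 1-D 'expensive' log-likelihood with a mode at theta = 0.3 (illustrative).
def log_like(theta):
    return -0.5 * ((theta - 0.3) / 0.1) ** 2

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, 5).reshape(-1, 1)  # small initial design
y = log_like(X).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(0.2),
                              normalize_y=True, alpha=1e-6)
grid = np.linspace(0.0, 1.0, 201).reshape(-1, 1)

# Sequential design: each round, refit the GP and add the point maximising
# a crude upper-confidence-bound utility (predicted mean plus uncertainty).
for _ in range(15):
    gp.fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    x_next = grid[np.argmax(mu + 2.0 * sd)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, log_like(x_next).ravel())

best = X[np.argmax(y), 0]
print(best)  # design points concentrate near the mode at 0.3
```

The key contrast with the latent-variable approach is that the GP’s posterior uncertainty is available at every step to decide where the next expensive simulation is most worth running.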
The quantile regression paper is by Kacprzak et al. (2017), who present a strategy for improving the efficiency of ABC posterior approximation relative to simple rejection sampling. The idea is to use successive waves of quantile regression to identify nested regions of parameter space outside of which mock datasets with acceptable epsilons are extremely unlikely to be produced; an approach which, they note, draws some inspiration from history matching (coincidentally, part of a model emulation strategy previously used for calibrating expensive semi-analytical models of galaxy formation). Since it relies on quantile regression and the estimation of confidence bounds for the regression, this technique is appropriate in the intermediate regime of ABC in which rejection ABC is taking longer than one would like yet we can still generate a decent number (tens of thousands) of mock datasets during the course of our posterior approximation. In this context I don’t see the appeal of this strategy over sequential Monte Carlo samplers for ABC, which can take on board simple parameter distributions within their refreshment kernels with flexibility comparable to a quantile regression with first- and second-order polynomial terms (for a single-component Normal) or higher (for a Normal mixture). (Note that the particular quantile regression scheme used by Kacprzak et al. is an SVM-based method for which the required sample size for accurate quantile regression confidence intervals is not readily apparent; the sizes used here in each round would only just be appropriate for low-order polynomial fits at the targeted quantiles; see, e.g., Kocherginsky et al.) The advantage of the SMC algorithm, though, is that it targets the full ABC posterior with known convergence behaviour. Certainly it would be easy for the authors to compare against an off-the-shelf SMC algorithm thanks to (e.g.) EasyABC, Ishida et al. (2015), or Jennings & Madigan (2016).