While checking up on astro-ph today (as a distraction from my reading on efficient simulation from huge GP models) I noticed this embarrassingly awful contribution to astronomical model selection by some Italian faculty (at first I thought masters/PhD students): http://arxiv.org/pdf/1311.2736.pdf . The authors aim to compute the marginal likelihoods for two competing models, one with two parameters and the other with three, so it would be easy enough to do these integrals directly via quadrature. Instead they try out and compare the harmonic mean estimator (HME), the Laplace approximation, thermodynamic integration, and nested sampling, ostensibly for pedagogical purposes. However, there's very limited discussion of these methods, and crucial details of their implementation are missing. *Most worryingly*, the four methods return very different Bayes factors (quoted with no uncertainties) … and the authors don't even investigate why this is!!!! E.g. from their table 5 the four estimates of the same Bayes factor are 1418, 145, 586, and 171 (in the same order as above). It's bizarre. No surprises that the HME is a fail, but what has happened to thermodynamic integration? My suspicion is that they've chosen a linear temperature sequence (perhaps also too coarsely spaced) where a logarithmic one (cf. Friel & Pettitt 2008) would have been appropriate. But we'll never know because they don't discuss this. They also advocate uniform and Jeffreys priors for model selection …
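To illustrate why the temperature sequence matters, here is a minimal sketch on a toy problem of my own choosing (not the models or data from the paper): normal data with known variance and a diffuse normal prior on the mean. For this conjugate model the power posterior is Gaussian, so the expected log-likelihood at each temperature t is available in closed form, and the thermodynamic integration identity log Z = ∫₀¹ E_t[log L] dt can be checked against the exact log evidence. The function names, the values of `sigma` and `tau`, the simulated data, and the choice of 21 temperatures are all assumptions made for illustration; the power-law schedule t_i = (i/N)^5 is the one suggested by Friel & Pettitt (2008).

```python
import numpy as np

def expected_log_like(temps, y, sigma, tau):
    """E_t[log L] under the power posterior (likelihood^t x prior).

    Toy model (an assumption): y_i ~ N(theta, sigma^2), theta ~ N(0, tau^2).
    The power posterior for theta is Gaussian, so this is exact.
    """
    n = y.size
    prec = temps * n / sigma**2 + 1.0 / tau**2      # power-posterior precision
    mean = (temps * y.sum() / sigma**2) / prec      # power-posterior mean
    # E_t[sum_i (y_i - theta)^2] = sum_i (y_i - mean)^2 + n * variance
    sq = np.sum((y[None, :] - mean[:, None])**2, axis=1) + n / prec
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - sq / (2 * sigma**2)

def exact_log_evidence(y, sigma, tau):
    """Exact log marginal likelihood: y ~ N(0, sigma^2 I + tau^2 11^T)."""
    n, s = y.size, y.sum()
    log_det = (n - 1) * np.log(sigma**2) + np.log(sigma**2 + n * tau**2)
    quad = (np.sum(y**2) - tau**2 * s**2 / (sigma**2 + n * tau**2)) / sigma**2
    return -0.5 * (n * np.log(2 * np.pi) + log_det + quad)

def ti_estimate(temps, y, sigma, tau):
    """Trapezoidal rule applied to E_t[log L] over the temperature ladder."""
    f = expected_log_like(temps, y, sigma, tau)
    return np.sum(np.diff(temps) * (f[1:] + f[:-1]) / 2.0)

rng = np.random.default_rng(0)
y = rng.normal(1.5, 1.0, size=20)   # simulated data (illustrative only)
sigma, tau = 1.0, 10.0              # known noise sd, diffuse prior sd

u = np.linspace(0.0, 1.0, 21)
linear_temps = u                    # evenly spaced temperatures
power_temps = u**5                  # Friel & Pettitt (2008) style schedule

truth = exact_log_evidence(y, sigma, tau)
err_linear = abs(ti_estimate(linear_temps, y, sigma, tau) - truth)
err_power = abs(ti_estimate(power_temps, y, sigma, tau) - truth)
print(f"exact log Z          = {truth:.3f}")
print(f"|error|, linear grid = {err_linear:.3f}")
print(f"|error|, power grid  = {err_power:.3f}")
```

With a diffuse prior the integrand changes extremely rapidly near t = 0, so an evenly spaced ladder spends almost all of its points where nothing is happening and the discretisation error in log Z can be large; the power-law ladder concentrates points near the prior. In a real application the expectations would come from MCMC at each temperature, adding Monte Carlo error on top of this discretisation error.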

Amongst various other things that annoyed me about the above paper is a common grievance of mine: that astronomical citations are too often given for standard statistical concepts. With this in mind I thought I might compile (and keep updated, with all suggestions welcome) a list of transformations for converting between the neophyte's astronomical reference system and the old hand's statistical one.

the Laplace approximation for marginal likelihood estimation:

Gregory (2005); Ntzoufras (2009) -> Tierney & Kadane (1986)

the Harmonic Mean Estimator:

Ntzoufras (2009) -> Newton & Raftery (1994)

parallel tempering:

Gregory (2005); Handberg & Campante (2011) -> Geyer (1994)

the thermodynamic integration identity:

Gregory (2005) -> Gelman & Meng (1998); Lefebvre (2010)

thermodynamic integration in practice:

Gregory (2005) -> Lartillot & Philippe (2006); Friel & Pettitt (2008)

the Bayesian approach to hypothesis testing:

Gregory (2005) -> Jeffreys (1935,1961); Kass & Raftery (1995)

On the plus side today there was a nice paper by Gomez et al. (including Wolpert as guest statistician) on a sensitivity analysis via emulators for galaxy formation models, which I intend to read further when I get a chance: http://arxiv.org/pdf/1311.2587.pdf