I noticed this recent paper on the arXiv by Patel et al., entitled “Orbits of massive satellite galaxies – II. Bayesian Estimates of the Milky Way and Andromeda masses using high precision astrometry and cosmological simulations”, in which the authors consider an inference problem on the periphery of current ABC & pseudo-marginal methods. In this case the authors are interested in estimating the virial mass of the Milky Way given observational estimates of some of the orbital elements (proper motion, radial velocity, etc.) with respect to Andromeda. These data enter their model as (independent) Normal likelihoods for the observations given the latent (‘true’) values of those observables, the latter imagined as draws from a prior represented by Milky Way-like galaxy groups taken from a cosmological N-body simulation. The catch here is that the simulation is expensive and cannot be run directly by the authors, so all inferences must be made from a pre-compiled catalog containing a finite number of mock galaxy groups drawn from this prior.
This looks at face value like an ABC or pseudo-marginal problem in the sense that there is a reliance on mock datasets, but in fact there are (ignoring cosmological parameters, which are held fixed) no free parameters in the simulations: all the simulations do is represent the prior on the latent variables of Milky Way-like galaxy groups, namely their virial masses and orbital elements. For the present paper the authors approximate the posterior for the Milky Way virial mass using an importance sample re-weighting scheme, formed by weighting each mock galaxy by the likelihood of observing a subset of its orbital elements. In one set of experiments these are the satellite’s maximum circular velocity, its separation from the host, and its total velocity today relative to the host galaxy; in another they are the satellite’s maximum circular velocity and the magnitude of its orbital angular momentum.
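To make the mechanics concrete, here’s a minimal sketch of such an importance re-weighting scheme on a toy stand-in catalog with a single observable (the mass–velocity scaling, the error scale, and all numbers below are invented for illustration and are not the paper’s):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy stand-in for the pre-compiled catalog: mock galaxy groups drawn
# from the simulation 'prior', each carrying a virial mass (in units of
# 10^12 Msun, say) and one orbital element (here a maximum circular
# velocity with an invented power-law scaling plus scatter).
N = 100_000
mock_mvir = rng.lognormal(mean=np.log(1.5), sigma=0.4, size=N)
mock_vmax = 60.0 * mock_mvir**0.3 + rng.normal(0.0, 5.0, size=N)

# A single observation and its Normal measurement uncertainty.
obs_vmax, sigma_vmax = 92.0, 5.0

# Weight each mock group by the likelihood of the observation given its
# latent ('true') observable; independent observables would multiply here.
w = norm.pdf(obs_vmax, loc=mock_vmax, scale=sigma_vmax)
w /= w.sum()

# Posterior summaries for the virial mass from the re-weighted sample.
post_mean = np.sum(w * mock_mvir)
post_sd = np.sqrt(np.sum(w * (mock_mvir - post_mean) ** 2))
```

Adding further (independent) observables would simply multiply more Normal densities into the weights, which is exactly where the tension over effective sample size arises.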
In this scheme the posterior behaves much like a kernel-density estimate in which the kernel acts on the distance between the observed and latent orbital elements, using these to predict the virial mass. As one adds (informative) observations the number of mock galaxy groups contributing non-trivially to the estimate decreases, such that although the posterior estimate will typically contract it will also become noisier. Indeed the authors recognise this effect and consider some of their decisions with respect to the effective sample size, although their investigation here is primarily in terms of sensitivity to the definition of a ‘Milky Way-like’ group (a hard threshold on the mock galaxy group catalog that I’ve glossed over in the above).
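The effective sample size in question is presumably the usual Kish estimate for importance weights, which is simple to compute:

```python
import numpy as np

def effective_sample_size(weights):
    """Kish effective sample size, (sum w)^2 / sum(w^2): equals n for
    uniform weights and collapses towards 1 as the likelihood
    concentrates the weight on a handful of mock groups."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)
```

Each additional informative observable sharpens the weights and so drives this quantity down, which is one way to quantify the contraction-versus-noise trade-off above.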
Imagining the extension of this inference procedure to further applications, it would be helpful to have a general framework for deciding which orbital elements and other observational data to use and which to leave out. Such a framework would need to balance the information gain of each candidate orbital element against the reduction in effective sample size over the available simulations; and would presumably be based on cross-validation estimates of bias and coverage for mock galaxy groups in the simulation sample. This is of course a very similar problem to summary statistic selection in ABC analyses, and I’m particularly reminded of the paper by Nunes & Balding (2010) in which those authors run a preliminary ABC analysis with unrefined summary statistics to identify the important region of parameter space over which the summary statistics should be optimised.
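One hedged sketch of what such a cross-validation check might look like, again on an invented toy catalog with a single observable: hold out mock groups in turn, treat a noisy version of each one’s observable as the ‘data’, re-weight the remaining mocks, and record how often the held-out group’s true virial mass lands inside the resulting credible interval:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Invented toy catalog (stand-in for the pre-compiled simulation sample).
N = 5000
mvir = rng.lognormal(np.log(1.5), 0.4, size=N)
vmax = 60.0 * mvir**0.3 + rng.normal(0.0, 5.0, size=N)
sigma = 5.0  # assumed Normal measurement error on the observable

def loo_coverage(level=0.68, n_test=200):
    """Leave-one-out coverage: fraction of held-out mock groups whose
    true virial mass falls inside the central `level` credible interval
    of the re-weighted posterior built from the remaining mocks."""
    hits = 0
    for i in rng.choice(N, size=n_test, replace=False):
        pseudo_obs = vmax[i] + rng.normal(0.0, sigma)  # noisy pseudo-observation
        rest = np.delete(np.arange(N), i)
        w = norm.pdf(pseudo_obs, loc=vmax[rest], scale=sigma)
        w /= w.sum()
        # Weighted quantiles of virial mass from the re-weighted sample.
        order = np.argsort(mvir[rest])
        cdf = np.cumsum(w[order])
        lo = mvir[rest][order][np.searchsorted(cdf, (1.0 - level) / 2.0)]
        hi = mvir[rest][order][np.searchsorted(cdf, (1.0 + level) / 2.0)]
        hits += lo <= mvir[i] <= hi
    return hits / n_test
```

Comparing this coverage (alongside the average interval width and effective sample size) across candidate sets of observables would give exactly the trade-off described above; none of this is from the paper itself.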
Another way to attack this problem would be via a flexible parametric model for the prior as in the RNADE-based scheme of Papamakarios & Murray (2016). Indeed this looks very similar to the example problem described (no slides available as far as I can see) by Iain Murray at the MaxEnt 2013 conference in Canberra.