I noticed this paper on astro-ph today by Li, de Grijs, & Deng ( http://fr.arxiv.org/pdf/1309.0929 ) in which the authors describe a technique for learning the binary fraction of an observed star cluster (with unresolved binaries) by comparing mock data simulations against the real data. Sounds like … ABC!

In fact, their application ends up being closer to the pseudo-marginal method for latent variable models, since they suppose that the likelihood function is known conditional upon the input of the mock binary fractions in a small set of bins. However, this assumption is clearly a bit ropey, since it relies on the normal approximation for the error in each bin and neglects known sources of complicated uncertainty, such as that owing to the “correction” for background field stars and the choice of benchmark CMD main-sequence fit. A full ABC analysis would involve simulation from both field and cluster, with the main-sequence fit and the mass dependence of the binary fraction as input parameters (and ditto for higher-order multiplicities) … and of course the comparison metric would no longer use the chi-squared statistic. (And the priors would need to be explicitly defined.)
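To fix ideas, here is a toy rejection-ABC sketch of the basic simulate-and-compare loop for a binary fraction. To be clear, the simulator below is entirely my own invention for illustration (the magnitude cut, scatter, and binary brightening are made-up numbers), not the Li, de Grijs & Deng model:

```python
# Toy rejection-ABC sketch: infer a cluster binary fraction f_b by
# simulating mock clusters and comparing a single summary statistic
# against the "observed" value.  All model details here are invented.
import numpy as np

rng = np.random.default_rng(0)
N_STARS = 500

def simulate(f_b, rng):
    """Mock cluster: each star is an unresolved binary with probability
    f_b; binaries appear brighter (up to -0.75 mag for an equal-mass
    pair).  The summary statistic is the fraction of stars brighter
    than a fixed magnitude cut above the nominal main sequence."""
    is_binary = rng.random(N_STARS) < f_b
    mags = rng.normal(0.0, 0.1, N_STARS)            # single-star scatter
    mags[is_binary] -= rng.uniform(0.0, 0.75, is_binary.sum())
    return np.mean(mags < -0.2)

# "Observed" summary, generated at a known truth to sanity-check the sketch.
s_obs = simulate(0.4, np.random.default_rng(1))

# Rejection ABC under a proper Uniform(0, 1) prior on the binary fraction.
eps, accepted = 0.02, []
for _ in range(20000):
    f_b = rng.random()                              # draw from the prior
    if abs(simulate(f_b, rng) - s_obs) < eps:       # simulate & compare
        accepted.append(f_b)

posterior = np.array(accepted)
print("ABC posterior mean:", posterior.mean(), "n =", posterior.size)
```

A real analysis would of course need a richer simulator (field-star contamination, the main-sequence fit, mass-dependent binarity) and a more carefully chosen distance than this single binned statistic.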

This type of simulation method has been used in other guises in the past (e.g. the Science paper by Sana et al. [cf. the methodology section]), but this time I feel the authors are getting even closer to ABC. For instance, they note that they have refined the binning used in the chi-squared calculation after noticing that some of the bins were uninformative about the binary fraction and hence only added noise to the simulated-observed data comparison. ABC people will recognise this step as summary-statistic refinement.

Despite only three citations so far to my ABC paper [and none yet for the Weyant et al. ABC paper … come on SNe cosmologists, what’s wrong with you?! 🙂 ] I nevertheless remain confident that the wider community will indeed pick up on this idea one day. (And I will be hailed as some kind of maverick visionary genius. Like the first schoolboy from Rugby to pick up the soccer ball and run straight at the opposing goal with it! Perhaps.)


I will be sure to cite your ABC paper if I use it one day! I have one project that’s been on the backburner for 5 years that would use ABC…

How do you suggest, then, that ABC will overcome its poor performance under vague priors, or the fact that an objective-Bayesian analysis is not even possible because one cannot sample from improper priors such as Jeffreys or reference priors?

Fair question. I think the SMC-ABC algorithm is perhaps the best we can do for improving this part of the ABC approximation, but even SMC-ABC can become very slow to converge under truly vague priors. In this sense I think the only solution is to accept that ABC is a properly Bayesian algorithm requiring proper priors, and to choose these priors “cleverly”.
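The vague-prior problem is easy to demonstrate in a toy setting (a sketch with a made-up Gaussian model, not SMC-ABC itself): widening a proper uniform prior directly dilutes the rejection-ABC acceptance rate, which is the same pathology SMC-ABC has to fight through its sequence of intermediate distributions.

```python
# Toy illustration: rejection-ABC acceptance rate for the mean of a
# unit-variance Gaussian, under a proper Uniform(-w, w) prior on that
# mean.  As the prior width w grows, ever more simulations are wasted.
import numpy as np

rng = np.random.default_rng(0)
x_obs = rng.normal(2.0, 1.0, 50).mean()          # "observed" summary

def acceptance_rate(prior_width, n=20000, eps=0.05):
    mu = rng.uniform(-prior_width, prior_width, n)  # proper but vague prior
    s = rng.normal(mu, 1.0 / np.sqrt(50))           # simulated sample mean
    return np.mean(np.abs(s - x_obs) < eps)

rate_mild, rate_vague = acceptance_rate(5.0), acceptance_rate(50.0)
print(rate_mild, rate_vague)   # the vaguer prior accepts far fewer draws
```

The acceptance rate falls roughly in proportion to the prior width, which is exactly why an informed, proper prior can be the difference between a feasible and an infeasible ABC run.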

A relevant example occurred in an ABC model of chlamydia incidence I was working on recently. We wanted a model for the evolution of the incidence rate in a variety of age-sex cohorts over the past decade but didn’t want to impose a standard parametric form (e.g. a linear trend); instead we extended the model parameter space to something like 240 dimensions, treating the rate in each age-sex cohort in each year as a model input. Our default option of “completely vague” priors (a uniform prior on the rate for each year in each age-sex cohort) turned out to be impractical for ABC fitting purposes. But by adopting instead a Gaussian prior with a Matérn correlation function, enforcing just a mild (but non-negligible) degree of correlation in these rates from year to year, the effective dimensionality of the search space was reduced sufficiently to achieve ABC estimates on my two-year-old Mac laptop! Looking back, it was “obvious” that we really did expect such correlations in the rate year-to-year “a priori”: by being vague we were really just being lazy!
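To give a flavour of that prior construction (a minimal sketch only: the Matérn-3/2 form, length-scale, variance, and the 99%-of-variance criterion for “effective dimension” below are my own illustrative choices, not the settings from the actual chlamydia model):

```python
# Sketch: a Gaussian prior over ten yearly rates for one cohort, with a
# Matern-3/2 correlation in time.  Mild year-to-year correlation
# concentrates prior mass on smooth trajectories, shrinking the
# effective dimension of the search space relative to an iid prior.
import numpy as np

years = np.arange(10)                    # ten years of rates
ell, sd = 3.0, 1.0                       # assumed length-scale & prior sd

# Matern-3/2 correlation: (1 + sqrt(3) d / ell) * exp(-sqrt(3) d / ell)
d = np.abs(years[:, None] - years[None, :])
r = np.sqrt(3.0) * d / ell
K = sd**2 * (1.0 + r) * np.exp(-r)

# "Effective dimension": eigenvalues needed to capture 99% of the prior
# variance (versus all 10 for an independent, uncorrelated prior).
eigvals = np.sort(np.linalg.eigvalsh(K))[::-1]
frac = np.cumsum(eigvals) / eigvals.sum()
eff_dims = int(np.searchsorted(frac, 0.99) + 1)
print("effective dimensions:", eff_dims, "of", len(years))

# Drawing one correlated prior trajectory via a Cholesky factor:
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(years)))
sample = 0.5 + L @ np.random.default_rng(0).standard_normal(len(years))
```

Multiplying that saving across all the age-sex cohorts is what brings a nominally 240-dimensional ABC problem back within reach of a laptop.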