## The usual binomial proportion cock-up …

Having published a how-to guide for computing credible intervals on binomial population proportions in astronomy, my troll instincts are particularly keen for detecting mistakes with the binomial in new astro-ph submissions. Today’s fail comes from the WFC3/UDS paper of Bassett et al., in which the authors aim to assess the significance of evidence for a difference between two rates: one with observed population proportion 6/16 and the other with 8/43. Their claimed significance level is ~2.5 sigma based on “binomial statistics”. Reverse engineering their thinking, it seems possible to arrive at this 2.5 number only by supposing 6/43 is observed (rather than 8/43) and by treating this rate, 6/43, as the null hypothesis against which the 6/16 sample is tested. The corresponding R code is: `significance_sigma = qnorm(1-(1-pbinom(5,16,6/43))/2)`. Neglecting what I’m assuming was a typographical mistake (6->8), the problem with this approach is that the null hypothesis is assigned using only a fraction of the available data, i.e., by choosing one of the two samples to fix the null rate. The conventional approach is instead to estimate the null rate from the pooled samples as (6+6)/(43+16) = 0.203…, giving `significance_sigma = qnorm(1-(1-pbinom(5,16,0.203))/2)` = 1.7 sigma, which is much less impressive.
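For readers without R to hand, here is a minimal Python sketch of the same two calculations, using only the standard library. The helper names `binom_cdf` and `significance_sigma` are my own stand-ins for `pbinom` and the R one-liner, not anything from the paper, and the counts assume the 6/43 reading discussed above.

```python
from math import comb
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); plays the role of R's pbinom(k, n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def significance_sigma(k_obs, n, p_null):
    """Mirror of qnorm(1 - (1 - pbinom(k_obs - 1, n, p_null)) / 2)."""
    # Upper-tail probability of seeing k_obs or more successes under the null rate.
    tail = 1.0 - binom_cdf(k_obs - 1, n, p_null)
    # Same conversion to "sigma" as in the R expression (tail halved, then normal quantile).
    return NormalDist().inv_cdf(1.0 - tail / 2.0)


# Null rate fixed from one sample only (assumed 6/43), tested against 6 of 16.
sigma_single = significance_sigma(6, 16, 6 / 43)

# Null rate estimated from the pooled samples, (6+6)/(43+16).
sigma_pooled = significance_sigma(6, 16, (6 + 6) / (43 + 16))

print(round(sigma_single, 1), round(sigma_pooled, 1))
```

Run as written, this should give roughly 2.4 for the single-sample null (in the neighbourhood of the quoted ~2.5) and roughly 1.7 for the pooled null.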

[For reference, the equivalent Bayes factor in favour of a different rate for each sample (under a uniform prior on each rate) is just `1/(choose(16,6)*choose(43,6)/choose(16+43,6+6)*(16+1)*(43+1)/(16+43+1))` = 1.83…, i.e., non-“significant”.]
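The Bayes factor expression is easier to read when split into its two marginal likelihoods. The sketch below is a Python transcription under the same assumptions (uniform prior on each rate, counts 6/16 and 6/43). The function name `bayes_factor_different_rates` is hypothetical and chosen only for illustration.

```python
from math import comb


def bayes_factor_different_rates(k1, n1, k2, n2):
    """Bayes factor for 'two separate rates' versus 'one common rate',
    with a uniform prior on every rate."""
    # Separate rates: under a uniform prior each count is equally likely
    # a priori, so each sample contributes a marginal likelihood 1 / (n + 1).
    separate = 1.0 / ((n1 + 1) * (n2 + 1))
    # Common rate: integrate the product of the two binomial likelihoods
    # over a single uniform-prior rate.
    common = comb(n1, k1) * comb(n2, k2) / (
        comb(n1 + n2, k1 + k2) * (n1 + n2 + 1)
    )
    return separate / common


bf = bayes_factor_different_rates(6, 16, 6, 43)
print(round(bf, 2))
```

Algebraically this is the same quantity as the R expression: `separate / common` rearranges to the reciprocal of `choose(n1,k1)*choose(n2,k2)/choose(n1+n2,k1+k2)*(n1+1)*(n2+1)/(n1+n2+1)`.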

More generally though, I would suggest that this type of study raises some important concerns regarding experimental design. With the effects of environment on galaxy colour and morphology known a priori to be extremely subtle, it seems surprising that the authors apparently did not seek to estimate the sample sizes necessary to answer their key science questions *before* proceeding to the data reduction, mass/SFR estimation, GALFIT fitting, etc. stage. And now to wheel out Fisher’s old chestnut: “To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”