One of my past bosses in galaxy evolution research loved telling us post-docs and PhD students, as we tried to make sense of our precious ‘reduced’ datasets (in this case, enormous galaxy catalogues compiling 100-200 different variables for each of 100,000 objects), to go away and plot “everything against everything”. The idea was to search for hitherto unknown trends of one variable against another (usually across some huge collection of subsets of the galaxy population), in the hope of finding something to write a paper about. Reading up today on the guidelines for meta-analyses of clinical research (e.g. QUOROM, PRISMA), I couldn’t help but be struck by the contrast. In particular, the guidelines couldn’t be clearer about the importance of pre-specifying the hypotheses under investigation in order to control the Type I error rate.
I wonder whether any astronomical study has ever attempted to correct its significance claims for this common design flaw of our usual exploratory analysis phase? Seems unlikely.
[For a relevant review of significance level computation in the multiple testing setting see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1713204/].
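
To get a rough feel for the scale of the problem, here is a minimal sketch of what “everything against everything” does to the Type I error rate. The catalogue size of 150 variables is just a mid-point of the 100-200 range above, the function names are my own, and the calculation assumes the pairwise tests are independent, which they certainly are not for real galaxy properties. Bonferroni and Šidák are used only as the simplest textbook corrections, not as a recommendation.

```python
from math import comb


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests independent
    tests, each run at significance level alpha, when every null is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def sidak_threshold(n_tests, alpha=0.05):
    """Per-test level giving a family-wise rate of exactly alpha,
    under the (assumed) independence of the tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


# Hypothetical catalogue: 150 variables, every distinct pair plotted once.
n_variables = 150
n_pairs = comb(n_variables, 2)

# Expected number of spurious "trends" at the usual 5% level.
expected_false_positives = 0.05 * n_pairs

print(f"pairs tested:               {n_pairs}")
print(f"expected false positives:   {expected_false_positives:.0f}")
print(f"family-wise error rate:     {familywise_error_rate(n_pairs):.6f}")
print(f"Bonferroni per-test level:  {bonferroni_threshold(n_pairs):.2e}")
print(f"Sidak per-test level:       {sidak_threshold(n_pairs):.2e}")
```

And that is before counting the “huge collection of subsets”, each of which multiplies the number of comparisons again.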