I recently became aware, more or less simultaneously, of two new endeavours intended to teach Bayesian statistics to the next generation of astronomers: the 2014 Canary Islands Winter School and the new textbook by Ivezić et al., “Statistics, Data Mining, and Machine Learning in Astronomy”. Indeed, the Canary Islands school aims “to provide the next generation of researchers with the tools of modern Bayesian data analysis that will become a standard in future research in astrophysics.” And likewise the book (apparently) “provides a comprehensive and accessible introduction to the cutting-edge statistical methods needed to efficiently analyze complex data sets from astronomical surveys such as the Panoramic Survey Telescope and Rapid Response System, the Dark Energy Survey, and the upcoming Large Synoptic Survey Telescope.”
However, my impression from reading through the book and from reading the topics of the school programme is that both resources will at best bring the neophyte up to the level of an undergraduate student who’s taken a couple of courses in statistics as part of their science degree but in no way majored in the subject. The vast majority of the book reads like a dumbed-down version of the first couple of chapters of Gelman et al.’s “Bayesian Data Analysis” and gives only the most trivial descriptions of, e.g., model selection and time series analysis. The Winter School likewise includes a large component on introductory Bayesian statistics plus, I would guess, simplified introductions to MCMC, nested sampling, and thermodynamic integration. Either way, there’s a long way to go from taking a first look at these few topics to being able to comprehend, let alone master, any cutting-edge statistical techniques.
How would one explain a cutting-edge idea that could indeed be highly useful for astronomy to a recent graduate of the book or course? It’s just not possible without an intermediate and an advanced follow-up. For instance, I was recently reading the paper by Rao & Teh (2013) on MCMC sampling for Markov jump processes, which I could imagine being applicable to unusual time-variable astronomical systems (like soft gamma-ray repeaters). But here we have to explain to the student what is a discrete-time Markov chain, what is a continuous-time Markov chain, what are the Chapman-Kolmogorov equations, what is a stochastic process, what is a latent variable, what is a measure, what is a product measure, what is a sigma-finite measure, what is a density understood as a Radon-Nikodym derivative, what is a particle filtering algorithm, and on it goes.
And this is perhaps the root of the problem: that we need to teach our students from this basic level (and indeed 99% of professional astronomers would benefit from the book and course as well); that the foundations of statistical inference and standard Monte Carlo methods are not already taught to them at undergraduate/honours level! (And god forbid we expect a physical scientist to have taken Analysis I/II and Algebra I …)
Conversely, I’ve seen a lot of highly mathematically and statistically literate astronomers forced out of the field (and in some cases out of academia altogether) by lack of recognition for their skills and, hence, lack of job opportunities. But who knows, perhaps it is just a generational thing: if through these books and schools we can get this next generation up to Chapter 4 of Gelman et al., then the one after that up to Chapter 8, and so on, then maybe the system will be self-sustaining from then on?