Training the next generation of astrostatisticians …

I recently became aware, more-or-less simultaneously, of two new endeavours intended to teach Bayesian statistics to the next generation of astronomers: the 2014 Canary Islands Winter School and the new textbook by Ivezic et al., “Statistics, Data Mining, and Machine Learning in Astronomy”. Indeed the Canary Islands school aims: “to provide the next generation of researchers with the tools of modern Bayesian data analysis that will become a standard in future research in astrophysics.” And likewise the book (apparently) “provides a comprehensive and accessible introduction to the cutting-edge statistical methods needed to efficiently analyze complex data sets from astronomical surveys such as the Panoramic Survey Telescope and Rapid Response System, the Dark Energy Survey, and the upcoming Large Synoptic Survey Telescope.”

However, my impression from reading through the book and the topics of the school programme is that both resources will at best bring the neophyte up to the level of an undergraduate student who’s taken a couple of courses in statistics as part of their science degree but in no way majored in the subject. The vast majority of the book reads like a dumbed-down version of the first couple of chapters of Gelman et al.’s “Bayesian Data Analysis” and gives only the most trivial descriptions of, e.g., model selection and time series analysis. The Winter School likewise includes a large component on introductory Bayesian statistics plus, I would guess, simplified introductions to MCMC, nested sampling, and thermodynamic integration. Either way there’s a long way to go from taking a first look at these few topics to being able to comprehend, let alone master, any cutting-edge statistical techniques.

How would one explain a cutting-edge idea that could indeed be highly useful for astronomy to a recent graduate of the book or course? It’s just not possible without an intermediate and advanced follow-up. For instance, I was recently reading the paper by Rao & Teh (2013) on MCMC sampling for Markov jump processes, which I could imagine being applicable to unusual time-variable astronomical systems (like soft gamma-ray repeaters). But here we have to explain to the student what is a discrete-time Markov chain, what is a continuous-time Markov chain, what are the Chapman-Kolmogorov equations, what is a stochastic process, what is a latent variable, what is a measure, what is a product measure, what is a sigma-finite measure, what is a density understood as a Radon-Nikodym derivative, what is a particle filtering algorithm, and on it goes.

And this is perhaps the root of the problem: that we need to teach our students from this basic level (and indeed 99% of professional astronomers would benefit from the book and course as well); that the foundations of statistical inference and standard Monte Carlo methods are not already taught to them at undergraduate/honours level! (And god forbid we expect a physical scientist to have taken Analysis I/II and Algebra I …)

Conversely, I’ve seen a lot of highly mathematically and statistically literate astronomers forced out of the field (and in some cases out of academia altogether) by lack of recognition for their skills and, hence, lack of job opportunities. But who knows, perhaps it is just a generational thing: if through these books and schools we can get this next generation up to Chapter 4 of Gelman et al., then the one after that up to Chapter 8, and so on, then maybe the system will be self-sustaining from then on?

6 Responses to Training the next generation of astrostatisticians …

  1. So basically your only objection is to the term “cutting edge”?

  2. Hi Ewan, I’m one of the organizers of the Winter School and, fundamentally, I agree with you. One of the reasons we are organizing it is that students who come to research (at least at our institute) have the idea that statistics is just computing the mean and the standard deviation with IDL. We want this to change and our “master plan” is to start with PhD students, with the idea of hopefully propagating this interest back to the university.

    • Fair enough! I guess a different point I could/should have made is that there’s a lot of amazing work we could be doing with advanced statistical methods in astronomy if only we had the right people; but that would require astronomy funding to recognise the value of properly analysing the expensive data we gather.

  3. “but that would require that the funding of astronomy would recognise the value of properly analysing the expensive data we gather.”

    I do think this is becoming increasingly recognised. It just should be faster! At the risk of making myself obsolete 😉 You’re making me regret not enrolling in one of the stochastic process workshops at ISBA.
