I had a quick read of the new whitepaper from which this post borrows its title. The main recommendation is for greater funding towards astrostatistics/informatics education, which I would generally support, except that I feel one of our main barriers to progress in the field is the under-valuing of deep scholarship in applied statistics (and applied mathematical/numerical methods generally): if I imagine what more astrostatistics education looks like, it's presumably going to be giving a wide audience a very basic introduction to Bayesian inference, showing them how to run an MCMC code and a nested sampling code, then sending them on their way. What I'd like to see is more focussed funding on advanced methods: e.g. pay tens of PhD students per UK-sized country per year an extra stipend to allow them to focus for one calendar year solely on taking an advanced course in (one of) time series modelling, foundations of probability theory, stochastic processes, etc.

More challenging is to effect a cultural shift such that hiring decisions become more meritocratic and less nepotistic. Some of the astrostatisticians I've seen leave the field for want of viable employment opportunities did everything one would expect to lead to a tenured position: develop new astrostatistical methods, publish highly used astrostatistical open-source software, make applications to challenging applied astrostatistical problems, publish a bunch of highly cited papers. Meanwhile, some of the actual morons you see get tenure, even at top-ranked universities … urgh!

I also read a paper on a Bayesian approach to distinguishing signals from noise in gravitational wave detector data. After spending the time to read it I realised that I'd accidentally allowed through noise as if it were signal: the method presented sidesteps all the interesting challenges of this problem, showing only that Bayesian model selection in a simple, well-specified scenario works pretty well. The obvious challenge for BMS in the GW setting is to introduce an effective noise model flexible enough to adapt to the glitch distribution while still allowing effective population-level inference of faint GW signals at an efficient learning rate. One could think of a mixture of a semi-parametric glitch model and a parametric quasi-coherent glitch model, with the former acting something like a Bayesian bootstrap, thereby avoiding throwing the baby out with the bathwater.
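To make the mixture idea concrete, here's a minimal sketch (all distributions, parameters, and the glitch archive are invented purely for illustration, not taken from any paper): the semi-parametric glitch component is a Dirichlet-weighted kernel mixture over an archive of past glitches, i.e. one draw of a Bayesian-bootstrap-style nonparametric density, blended with a simple Gaussian standing in for the parametric component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical archive of past glitch amplitudes (invented: exponential draws).
glitch_archive = rng.standard_exponential(200)

def bayesian_bootstrap_density(x, archive, weights, bandwidth=0.3):
    """Semi-parametric glitch density: a Dirichlet-weighted Gaussian-kernel
    mixture over archived glitches (one Bayesian-bootstrap draw)."""
    kernels = np.exp(-0.5 * ((x[:, None] - archive[None, :]) / bandwidth) ** 2)
    kernels /= bandwidth * np.sqrt(2.0 * np.pi)
    return kernels @ weights

def mixture_loglike(x, mix, mu, sigma, archive, weights):
    """log p(x) under mix * parametric + (1 - mix) * bootstrap glitch model."""
    parametric = (np.exp(-0.5 * ((x - mu) / sigma) ** 2)
                  / (sigma * np.sqrt(2.0 * np.pi)))
    glitch = bayesian_bootstrap_density(x, archive, weights)
    return np.sum(np.log(mix * parametric + (1.0 - mix) * glitch))

# One Bayesian-bootstrap draw: flat Dirichlet weights over the archive.
w = rng.dirichlet(np.ones(len(glitch_archive)))

# Toy "observed" data: half from the parametric component, half glitch-like.
data = np.concatenate([rng.normal(0.0, 1.0, 50), rng.standard_exponential(50)])
ll = mixture_loglike(data, mix=0.5, mu=0.0, sigma=1.0,
                     archive=glitch_archive, weights=w)
```

Averaging over many Dirichlet draws is what gives the bootstrap-like flexibility: the glitch component can track whatever the archive looks like without a committed parametric form.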

Edit: I should also add that I quickly skimmed another paper by the same group on the efficiency of parallel nested sampling, which mentioned earlier work of theirs on reweighting chains under efficient reference likelihoods. The latter is an approach I've been advocating for some time; the original idea goes back to even before Hastings (1970). But at this stage it doesn't look like many people have realised one can also use a reference distribution for nested sampling other than the prior: e.g. one can run a reference distribution approach and, when its effective sample size is small, improve on it by nested sampling, starting from a "prior" that is a mixture of the real prior and a parametric approximation to the reference likelihood posterior.
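A minimal sketch of that last step, on a toy 1D problem with invented numbers throughout: reweight draws from a cheap reference distribution to the target, check the effective sample size, and if it's poor, build the mixture "prior" (real prior blended with a parametric fit to the reweighted draws) from which a nested sampling run could draw its live points. Note that evidence estimates from a run seeded this way would need correcting by the true-prior-to-pseudo-prior ratio.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: narrow Gaussian likelihood, flat prior on [-5, 5].
def loglike(theta):
    return -0.5 * ((theta - 2.0) / 0.1) ** 2

# Step 1: cheap reference run -- draws from a broad Gaussian "reference
# posterior" (invented), importance-reweighted to the target.
ref_mu, ref_sd = 0.0, 2.0
theta = rng.normal(ref_mu, ref_sd, 5000)
log_w = loglike(theta) - (-0.5 * ((theta - ref_mu) / ref_sd) ** 2)
w = np.exp(log_w - log_w.max())
ess = w.sum() ** 2 / (w ** 2).sum()  # Kish effective sample size

# Step 2: ESS is poor here, so fit a Gaussian to the reweighted draws and
# blend it with the real prior to seed a nested sampling run.
if ess < 0.1 * len(theta):
    fit_mu = np.average(theta, weights=w)
    fit_sd = np.sqrt(np.average((theta - fit_mu) ** 2, weights=w))

    def sample_mixture_prior(n, frac=0.5):
        """Draw from frac * Uniform(-5, 5) + (1 - frac) * N(fit_mu, fit_sd)."""
        pick = rng.random(n) < frac
        return np.where(pick,
                        rng.uniform(-5.0, 5.0, n),
                        rng.normal(fit_mu, fit_sd, n))

    # Live points for a nested-sampling run from the pseudo-prior.
    live = sample_mixture_prior(500)
```

Keeping a share of the real prior in the mixture is the safety net: the parametric fit concentrates the live points, while the flat component guarantees the pseudo-prior still covers any mass the reference run missed.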

Yes, foundations of probability theory (Jaynes) and foundations of Bayesian machine learning (Bishop). Not sure it'll ever happen. Too hard for most…