“If you give a man a fish he is hungry again in an hour. If you teach him to catch a fish you do *him* a good turn. But, if instead you convince him that catching fish is far too complicated a business for a neophyte like himself and he’d be better off forgetting all about fish catching and just damn well pay you to do it for him, then you have done *yourself* a good turn.” — Anne “Astrostatistics” Thackeray

This famous quotation sprang to mind upon (re-)reading Ned Taylor’s paper on Bayesian modelling of the red/blue galaxy bimodality on astro-ph today. Although I’d seen a near-identical early draft of this manuscript a couple of years ago, I was struck (again) by the sheer degree of obfuscation in the presentation of a truly simple hierarchical modelling problem: a generative model that could be written in a paragraph of hierarchical notation, directly transcribable into JAGS or Stan, is instead built up piece-by-unnecessarily-confusing-and-slightly-“patrician”-piece in a six-page appendix plus lengthy discussion in the main body (these are, by the way, two-column, small-font pages). My impression is that a lot of astro-statisticians are working this angle to carve out a niche for themselves and to create demand where it barely existed before; unflattering comparisons to nutritionists and life coaches spring to mind.
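To make the “paragraph of hierarchical notation” claim concrete, here’s a minimal sketch (this is not the paper’s actual model; the populations, means, and scatters below are all made up) of fitting a two-component Gaussian mixture to galaxy colours via expectation-maximisation. The fully Bayesian version in JAGS or Stan is only a few lines more.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic galaxy "colours": a blue cloud and a red sequence (all values hypothetical)
x = np.concatenate([rng.normal(0.9, 0.10, 600),   # blue cloud
                    rng.normal(1.6, 0.08, 400)])  # red sequence

# The whole generative model in two lines:
#   z_i ~ Bernoulli(w),   x_i | z_i ~ Normal(mu[z_i], sigma[z_i])
w = 0.5
mu = np.array([0.5, 2.0])       # crude starting guesses
sigma = np.array([0.3, 0.3])

def pdf(m, s):
    """Normal density evaluated at every data point."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(200):
    # E-step: responsibility of the red component for each galaxy
    p_blue = (1 - w) * pdf(mu[0], sigma[0])
    p_red = w * pdf(mu[1], sigma[1])
    r = p_red / (p_blue + p_red)
    # M-step: responsibility-weighted means, scatters, and mixing weight
    w = r.mean()
    mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
    sigma = np.sqrt(np.array([np.average((x - mu[0]) ** 2, weights=1 - r),
                              np.average((x - mu[1]) ** 2, weights=r)]))

print(w, mu, sigma)  # recovers the two sequences and the red fraction
```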

Having said that, I haven’t had time to examine the proposed model and fitting procedure thoroughly, but I’d be surprised if there was anything wrong with them; and the idea of using a Bayesian mixture model to quantify the red and blue sequences is an entirely sensible one.

(The only thing that caught my eye on first inspection was the very woolly statement that asymptotically AIC ~ BIC ~ DIC ~ -2 log-Bayes factor. Sure, if you consider only an order-one approximation.)
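For what it’s worth, the leading-order caveat is easy to see numerically: for a model with k free parameters, AIC and BIC differ by exactly k(log n - 2), which grows with sample size rather than vanishing. A toy check on a simple Gaussian fit (purely illustrative numbers):

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (100, 10_000):
    x = rng.normal(0.0, 1.0, n)
    # Maximum-likelihood Gaussian fit: MLEs are the sample mean and (biased) std dev
    mu_hat, sig_hat = x.mean(), x.std()
    loglik = np.sum(-0.5 * np.log(2 * np.pi * sig_hat ** 2)
                    - 0.5 * ((x - mu_hat) / sig_hat) ** 2)
    k = 2  # free parameters: mean and standard deviation
    aic = -2 * loglik + 2 * k
    bic = -2 * loglik + k * np.log(n)
    print(n, bic - aic)  # equals k * (log(n) - 2): grows with n
```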


Is there a reference that could have replaced the appendix? “Our hierarchical modelling follows the methods and notation of ?”.

I’ve often wondered whether the goal of conference talks by certain theorists is simply to confuse, and thereby convince the observers not to attempt the modelling themselves.

The best introductory references for hierarchical modelling are perhaps simply Gelman et al.’s Bayesian Data Analysis ( http://www.stat.columbia.edu/~gelman/book/ ) and Carlin & Louis’ Bayesian Methods for Data Analysis ( http://www.amazon.co.uk/Bayesian-Methods-Analysis-Edition-Statistical/dp/1584886978 ); for measurement error problems one can point to Richardson (1996) ( http://www.stat.fi/isi99/proceedings/arkisto/varasto/rich0235.pdf ) and Carroll et al. (1995) ( http://www.stat.tamu.edu/~carroll/eiv.SecondEdition/ ); and in the specific case of a normal response variable the general problem falls under the heading of “mixtures of regressions”; see Viele & Tong (2002) ( http://link.springer.com/article/10.1023%2FA%3A1020779827503#page-1 ) or Gruen & Leisch ( http://cran.r-project.org/web/packages/flexmix/vignettes/regression-examples.pdf ).
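For the normal-response case just mentioned, the mixture-of-regressions EM update is short enough to sketch directly (synthetic data and hypothetical parameter values throughout; this is a generic illustration, not any particular paper’s model):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data from two linear "sequences" with Gaussian scatter (all values hypothetical)
n = 500
z = rng.random(n) < 0.4                      # latent component labels
xv = rng.uniform(0.0, 1.0, n)
y = np.where(z, 1.0 + 0.5 * xv, 0.2 - 0.3 * xv) + rng.normal(0.0, 0.05, n)

# EM for a two-component mixture of linear regressions
w = 0.5
beta = np.array([[0.0, 0.0], [1.0, 1.0]])    # [intercept, slope] per component
sig = np.array([0.2, 0.2])
X = np.column_stack([np.ones(n), xv])        # design matrix
for _ in range(100):
    # E-step: responsibilities from the component-wise normal densities
    resid = y[:, None] - X @ beta.T
    dens = np.exp(-0.5 * (resid / sig) ** 2) / (sig * np.sqrt(2 * np.pi))
    dens *= np.array([1 - w, w])
    r = dens[:, 1] / dens.sum(axis=1)
    # M-step: weighted least squares and weighted scatter per component
    for j, rj in enumerate([1 - r, r]):
        Xw = X * rj[:, None]
        beta[j] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        sig[j] = np.sqrt(np.average((y - X @ beta[j]) ** 2, weights=rj))
    w = r.mean()

print(w, beta, sig)  # recovers the two regression lines and mixing weight
```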

Hi Ewan! The problem I have faced is trying to convince coauthors that what I am doing is sensible. Also, I have to admit that it took me about three years to teach myself all of this, mostly via Wikipedia. Definitely the worst possible way to do things, but that’s the way we do things anyway. I think the comment I’ve received that has made me happiest so far is: ‘it all seems so simple to me now.’

I have no doubt at all that I am still a rubbish statistician. But I am trying to raise the bar. And like I said in the paper, my hope is that the appendix is useful to people who are starting out from the average (very, very low) level of statistical literacy.

I mean, you have to recognise that one of the coauthors, who is very well respected in the field, admitted that he had only performed his first maximum-likelihood fit earlier this year! Sadly, that is just the state of the field.