Xtreme deconvolution …

A sociological observation segueing into a note on Drton & Plummer.

I noticed on today’s astro-ph a paper on ‘extreme deconvolution’, which is an astro-statistics term for fitting a Normal mixture model to noisy data; I’m not sure whether the technique is extreme per se or whether it needs to be applied to a large dataset (as in a pioneering example) to properly garner the complete appellation.  In this instance the application envisaged (and illustrated with a small example) is the modelling of the multivariate distribution of supernovae in the SALT2 dataset, which I believe to be on the order of 800 objects.
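For reference, the core idea of extreme deconvolution is that each datum carries its own (known) noise covariance, which simply adds to each mixture component’s covariance in the likelihood; a minimal sketch under that assumption (the function and variable names are mine):

```python
import numpy as np

def xd_loglike(X, S, weights, means, covs):
    """Log-likelihood of noisy data under a Normal mixture.

    X : (n, d) observed points
    S : (n, d, d) per-datum noise covariances
    weights, means, covs : mixture parameters, shapes (K,), (K, d), (K, d, d)

    Each datum i contributes log sum_k w_k N(x_i | m_k, V_k + S_i);
    the observational noise simply broadens each component.
    """
    n, d = X.shape
    total = 0.0
    for i in range(n):
        comp = np.empty(len(weights))
        for k, (w, m, V) in enumerate(zip(weights, means, covs)):
            T = V + S[i]                      # noise-convolved covariance
            r = X[i] - m
            _, logdet = np.linalg.slogdet(T)
            quad = r @ np.linalg.solve(T, r)
            comp[k] = np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + quad)
        cmax = comp.max()                     # log-sum-exp over components
        total += cmax + np.log(np.exp(comp - cmax).sum())
    return total
```

With all the noise covariances set to zero this reduces to the ordinary Normal mixture log-likelihood.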

So, the sociological observation I have is that the new generation of astronomers—let’s call them Gen Y, while defining my 1981 birth year as the (left-closed, right-open) boundary of Gen X—seem to view their wrapper scripts for running functions as a worthwhile contribution to the published literature.  That is, the paper at hand doesn’t present a new algorithm or methodology for performing extreme deconvolution, nor add any novel contribution to thinking about extreme deconvolution methodologies in astronomy; rather, it simply describes the authors’ wrapper script for running either of two existing extreme deconvolution codes and computing conditional densities from their output.  The latter is itself simply an application of the well-known rules for manipulating multivariate Normals, which are not in any sense difficult to implement—and certainly not to the extent that I could imagine anyone seeking a third-party application rather than just looking them up on Wikipedia.  But I guess if editors are happy to publish it then c’est la vie.
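Indeed those rules amount to a few lines: partitioning the mean as (μ₁, μ₂) and the covariance into blocks Σ₁₁, Σ₁₂, Σ₂₂, the distribution of x₁ given x₂ = a is again Normal with mean μ₁ + Σ₁₂Σ₂₂⁻¹(a − μ₂) and covariance Σ₁₁ − Σ₁₂Σ₂₂⁻¹Σ₂₁.  A minimal sketch (function name mine):

```python
import numpy as np

def conditional_normal(mu, Sigma, idx_keep, idx_cond, a):
    """Condition a multivariate Normal on x[idx_cond] = a.

    Returns the mean and covariance of x[idx_keep] | x[idx_cond] = a:
      mu_1 + S12 S22^{-1} (a - mu_2),   S11 - S12 S22^{-1} S21
    """
    mu, Sigma = np.asarray(mu), np.asarray(Sigma)
    S11 = Sigma[np.ix_(idx_keep, idx_keep)]
    S12 = Sigma[np.ix_(idx_keep, idx_cond)]
    S22 = Sigma[np.ix_(idx_cond, idx_cond)]
    gain = S12 @ np.linalg.inv(S22)        # S12 S22^{-1}
    cond_mean = mu[idx_keep] + gain @ (a - mu[idx_cond])
    cond_cov = S11 - gain @ S12.T
    return cond_mean, cond_cov
```

For a mixture, one applies this component-wise and reweights by each component’s marginal density at a.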

Another thing that surprised me about this paper was that two methods for selecting the number of mixture components are presented—the BIC and a cross-validated mean log-likelihood—but the motivations of each are barely discussed and nothing is mentioned of the (order 1) nature of these approximations (especially so in the extreme deconvolution case, where the observational uncertainties mean that n is not quite the n envisaged by Schwarz). Moreover, it’s observed that on the toy example the BIC points towards 5 components and the cross-validation method towards something greater than 10, but that you should probably use the BIC because it’s too expensive to add lots of components.  So, uh, yeah.
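For the ordinary (noise-free) mixture case the BIC recipe, BIC = k ln n − 2 ln L with k the number of free parameters, is a one-liner with scikit-learn; a toy illustration under my own made-up data (well-separated components, so the selection is unambiguous):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Draw from a well-separated two-component 1-D mixture and score
# candidate component counts by BIC = k ln(n) - 2 ln(L).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-5.0, 1.0, 250),
                    rng.normal(+5.0, 1.0, 250)]).reshape(-1, 1)

bics = {K: GaussianMixture(n_components=K, random_state=0).fit(X).bic(X)
        for K in (1, 2, 3, 4)}
best_K = min(bics, key=bics.get)   # the BIC-preferred number of components
```

Note that this treats each observation as noiseless; in the extreme deconvolution setting the effective information per datum is smaller, which is exactly the point about n above.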

Having got that off my chest, I thought it worth pointing out the existence of the recently ‘read’ (at the RSS) Drton & Plummer paper looking at the problem of model selection for the case of nested singular models, of which Gaussian mixtures (sans observational noise) are one of the given examples.  In particular, the authors offer a strategy for applying Watanabe’s method to this problem that side-steps the self-defeating requirement of knowing in advance which is the true model, but does not side-step the requirement to be able to identify the ‘learning rate’ of each model.
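For context, the ‘learning rate’ here is the learning coefficient λ (the real log canonical threshold) appearing in Watanabe’s asymptotic expansion of the marginal likelihood, which for singular models replaces the dim(θ)/2 of Schwarz’s derivation; roughly, and in my notation,

```latex
\log \int \prod_{i=1}^{n} p(x_i \mid \theta)\, \pi(\theta)\, d\theta
  = \sum_{i=1}^{n} \log p(x_i \mid \hat\theta)
    \;-\; \lambda \log n \;+\; (m-1)\log\log n \;+\; O_p(1),
```

where m is the multiplicity of λ.  For a regular model λ = dim(θ)/2 and m = 1, recovering the BIC; for mixtures λ is generally smaller and depends on which smaller model is true—hence the circularity that Drton & Plummer work to escape.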
