A convexified mixture model …

Just a brief note to point out this recent arXival by Leistedt & Hogg in which the colour-magnitude diagram is used to help refine (parallax) distance estimates from Gaia via a Bayesian hierarchical model.  The idea is that although a star’s apparent magnitude and colour tell us very little about its distance in isolation, they can provide a substantial amount of Bayesian shrinkage (via the absolute magnitude-colour diagram) given a large enough collection of stars with additional noisy distance measurements (that one may hope to refine).  Bayesian shrinkage (see David van Dyk’s talk from IAUS306) is a powerful idea in hierarchical modelling and, although now used fairly widely in photometric redshift studies and galaxy classification, its potential has certainly not been exhausted.
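For readers new to shrinkage, here is a minimal toy sketch in Python (not the Leistedt & Hogg model): a normal-normal hierarchy in which the population mean `mu` and scatter `tau` are assumed known, whereas in a real hierarchical analysis they would be inferred jointly with the latent values.  All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: true values scatter about mu with std tau.
# (Assumed known here; a full hierarchical model would infer them.)
mu, tau = 0.0, 1.0
n = 2000
theta = rng.normal(mu, tau, size=n)   # latent true values
s = 2.0                               # measurement noise (std), same for all objects
y = rng.normal(theta, s)              # noisy measurements

# Posterior mean of each theta_i given y_i: a precision-weighted average
# of the individual measurement and the population mean.
shrink = (1.0 / tau**2) / (1.0 / tau**2 + 1.0 / s**2)  # weight on the population mean
theta_hat = shrink * mu + (1.0 - shrink) * y

# Compare mean squared errors of the raw and shrunken estimates.
mse_raw = np.mean((y - theta) ** 2)
mse_shrunk = np.mean((theta_hat - theta) ** 2)
print(f"raw MSE = {mse_raw:.2f}, shrunk MSE = {mse_shrunk:.2f}")
```

The noisier the individual measurement relative to the population scatter, the harder each estimate is pulled towards the population mean, which is the sense in which a large collection of stars can refine each individual distance.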

What I found particularly interesting here was the form chosen to represent the density of (true/latent) stellar positions on the (absolute) colour-magnitude diagram: a Normal/Gaussian mixture model in which the means of the components are placed on a grid over the 2D parameter space, the standard deviations of the components are all held fixed, and only the vector of component weights is to be inferred.  The motivation given for this modelling decision is that it renders the inference process easier, in particular because it makes it easier to marginalise out the latent variables in the Gibbs sampling step.  In fact the benefits of such a model run even deeper: with the component means and standard deviations fixed, estimating the weights becomes a convex problem, as is known from studies of this model from the perspective of non-parametric maximum likelihood estimation (e.g. Feng & Dicker 2016).
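To make the convexity point concrete, here is a rough sketch in Python, assuming noise-free 2D points and using plain EM updates for the weights rather than the Gibbs sampler of the paper.  The grid, `sigma`, the fake data and all function names are illustrative assumptions.  Because the component densities never change, the log-likelihood is a sum of logs of linear functions of the weights, hence concave over the simplex.

```python
import numpy as np

def grid_component_densities(x, means, sigma):
    """Density of each fixed isotropic 2D Gaussian at each point: shape (n, k)."""
    d2 = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / sigma**2) / (2.0 * np.pi * sigma**2)

def log_likelihood(w, dens):
    """Mixture log-likelihood; linear in w inside the log, so concave in w."""
    return np.log(dens @ w).sum()

def fit_weights_em(dens, n_iter=300):
    """EM updates for the weights only (means and sigma stay fixed)."""
    k = dens.shape[1]
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = dens * w                           # unnormalised responsibilities
        resp /= resp.sum(axis=1, keepdims=True)   # normalise over components
        w = resp.mean(axis=0)                     # new weights: average responsibility
    return w

# Fake data: two clumps on a hypothetical 2D diagram.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal([-1.0, -1.0], 0.3, size=(300, 2)),
               rng.normal([1.0, 1.0], 0.3, size=(300, 2))])

# Component means on a 5x5 grid; a single shared, fixed standard deviation.
g = np.linspace(-2.0, 2.0, 5)
means = np.array([(a, b) for a in g for b in g])
sigma = 0.5

dens = grid_component_densities(x, means, sigma)
w_uniform = np.full(len(means), 1.0 / len(means))
w_fit = fit_weights_em(dens)
print(log_likelihood(w_uniform, dens), log_likelihood(w_fit, dens))
```

Note that `dens` is computed once and reused at every iteration, which is the practical payoff of freezing the means and standard deviations; any convex solver over the simplex could replace the EM loop.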

Edit: While cleaning up my hundreds of open browser tabs I realised I should also have pointed to this recent arXival by Si et al. as another example of hierarchical Bayesian shrinkage for Gaia data.


3 Responses to A convexified mixture model …

  1. Tom Loredo says:

    Re: Building a density via a mixture on a grid or lattice, this is reminiscent of a class of models dubbed *process convolutions* by David Higdon. See, e.g.,

    A process-convolution approach to modelling temperatures in the North Atlantic Ocean (1998)
    https://link.springer.com/article/10.1023/A:1009666805688

    Space and Space-Time Modeling using Process Convolutions (2002)
    https://link.springer.com/chapter/10.1007/978-1-4471-0657-9_2

    Higdon allows the shape parameters to vary along the grid (at least potentially) to build spatial and spatio-temporal models with nonstationarity.

    Readers familiar with moving average models or Lévy process mixtures will recognize the basic idea, and directions for extensions that have been pursued since these 15+ year-old papers.

    • I’m currently travelling (Seattle & San Fran) so I can’t get beyond the paywall (or stretch my mind properly to complex ideas). But it makes sense to me to construct non-stationary GPs this way; certainly I’m familiar with integrals over white noise giving Gaussian distributed random variables. One question we’re dealing with (in malaria mapping) at the moment is how to decide whether to put more effort into improved modelling of the mean function or more effort into using non-stationary GPs, since they both tend to help predictive performance in a similar way.

  2. Tom Loredo says:

    Also, on the topic of shrinkage, readers who like to learn by doing may want to have a look at the Jupyter notebook I ran through at AAS 227, as part of a “Lectures in Astrostatistics” session organized by Aneta Siemiginowska and Vinay Kashyap (http://hea-www.harvard.edu/AstroStat/aas227_2016/):

    tloredo/AAS227-MLM_Example
    Example of PyStan use for AAS227 astrostatistics splinter session
    https://github.com/tloredo/AAS227-MLM_Example

    After leading you through using PyStan to implement a Bayesian treatment of log-N/log-S or “number counts” fitting (to a gamma distribution, aka a normalized version of the Schechter function), the final plot illustrates shrinkage of the resulting point estimates (in lingo more familiar to astronomers, an adaptive version of correction for “Eddington bias”).

    From the README:
    This repo contains a Jupyter notebook and some support files, demonstrating use of the Stan probabilistic programming language to implement a basic Bayesian multilevel model, the gamma-Poisson model, in a mock astronomical setting. Clone it and run the notebook, which is self-documenting. The HTML file provides a rendered version of the executed notebook.
