Errors-in-variables, or not …

One can see from this new arXival that errors-in-variables models have not yet become widely known within astronomy, though astronomers are trying to find ways to deal with this sort of modelling scenario.  The errors-in-variables regression problem occurs when you want to regress Y against X, which would usually be described as Y_i = f(X_i) + \epsilon_i with X a precisely measured covariate and f(\cdot) some kind of model taking input X (e.g. linear regression, f(X) = X^\prime \beta, or Gaussian process regression, f \sim \mathrm{GP}_\theta).  Unfortunately, X is now observed with error, so our model features an extra layer describing the relationship between the true but latent (hidden) X and the available, noisily-measured \tilde{X}, e.g. \tilde{X}_i = X_i + \xi_i.  If the error term \xi_i is substantial then ignoring it (and fitting the base model) leaves us exposed to model misspecification errors.  A simple Bayesian solution to this problem is to introduce a further layer describing the population distribution of the latent X’s (one featuring hyper-parameters that allow shrinkage is a good choice) and then integrate out all the latent variables via posterior simulation (e.g. MCMC).
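
To make that concrete, here’s a minimal sketch of the hierarchical set-up in PyMC; the priors, error scales, and variable names are my own illustrative assumptions rather than anything from the paper, but the three layers (population, measurement, regression) are exactly those described above, with MCMC integrating out the latent X’s:

```python
import numpy as np
import pymc as pm

# Simulate data: latent X's drawn from a population, observed with noise
rng = np.random.default_rng(1)
n = 50
x_true = rng.normal(0.0, 2.0, n)             # latent covariates X_i
x_obs = x_true + rng.normal(0.0, 0.5, n)     # noisy \tilde{X}_i = X_i + xi_i
y_obs = 1.5 * x_true - 0.3 + rng.normal(0.0, 0.3, n)

with pm.Model() as eiv_model:
    # Population layer for the latent X's; its hyper-parameters let the
    # noisy covariates shrink towards the population mean
    mu_x = pm.Normal("mu_x", 0.0, 10.0)
    tau_x = pm.HalfNormal("tau_x", 5.0)
    x_lat = pm.Normal("x_lat", mu_x, tau_x, shape=n)

    # Measurement layer: \tilde{X}_i | X_i ~ N(X_i, 0.5^2), error scale known
    pm.Normal("x_meas", mu=x_lat, sigma=0.5, observed=x_obs)

    # Regression layer: Y_i = f(X_i) + eps_i with f(X) = alpha + beta * X
    alpha = pm.Normal("alpha", 0.0, 10.0)
    beta = pm.Normal("beta", 0.0, 10.0)
    sigma_y = pm.HalfNormal("sigma_y", 5.0)
    pm.Normal("y", mu=alpha + beta * x_lat, sigma=sigma_y, observed=y_obs)

    # Posterior simulation integrates out all the latent variables
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)
```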

The astronomers’ approach here is in fact not to bother adding a model: they simply spread out the uncertainty in the \tilde{X}’s by drawing mock X_i for each data point independently and then applying a finely-binned non-parametric estimator.  There are some nice advantages to modelling in this context, even if one decides to use a semi-parametric functional prior (like a Gaussian process).  One of these advantages is that you get a ‘structural shrinkage’ of the noisy \tilde{X}’s towards values that ‘make sense’ given their corresponding Y’s and the assumed functional form.  There are some challenges to fitting such a model in the case of a Gaussian process EIV regression: without a nugget term there are multiple ‘crossing points’ that a sampler moving the X’s must negotiate, at which the covariance matrix becomes non-invertible (i.e., when X’s tie).  A nice solution to this is to use a random Fourier feature representation of the GP, as sketched below.
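
Here’s a minimal NumPy sketch of that trick (the feature count, lengthscale, and variable names are my own assumptions): the GP is replaced by a finite random basis, so the latent X’s enter only through the feature map and ties among them cause no singularity:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, lengthscale = 200, 1.0

# Spectral draws for an RBF kernel with the given lengthscale, plus phases
omega = rng.normal(0.0, 1.0 / lengthscale, n_features)
b = rng.uniform(0.0, 2.0 * np.pi, n_features)

def phi(x):
    """Random Fourier feature map: E[phi(x) @ phi(x')] ~= k_RBF(x, x')."""
    return np.sqrt(2.0 / n_features) * np.cos(np.outer(x, omega) + b)

# Weights ~ N(0, I) make f(x) = phi(x) @ w an approximate draw from the GP
w = rng.normal(0.0, 1.0, n_features)

# Evaluating f at tied inputs is perfectly well defined: no covariance
# matrix is ever built or inverted, so a sampler moving the latent X's
# through a 'crossing point' hits no non-invertibility
x = np.array([0.3, 0.3, 1.7])  # note the tie at x = 0.3
f = phi(x) @ w
print(f)  # the first two values coincide, as they should
```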

P.S. One of the canonical examples of EIV regression in astronomy is Kelly et al.; that model can be made fancier in a few fun ways: one is to replace the finite mixture of Normals for the population distribution with an infinite mixture model (e.g. a Dirichlet process mixture).
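
For instance, here’s a hypothetical PyMC sketch of that swap (the truncation level and priors are illustrative assumptions, not Kelly et al.’s specification): a truncated Dirichlet-process mixture via stick-breaking standing in for the population distribution of the X’s:

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

K = 20  # truncation level for the stick-breaking representation
x = np.random.default_rng(2).normal(0.0, 1.0, 100)  # stand-in covariate draws

with pm.Model() as dp_population:
    # Stick-breaking weights: w_k = v_k * prod_{j<k} (1 - v_j)
    alpha = pm.Gamma("alpha", 1.0, 1.0)
    v = pm.Beta("v", 1.0, alpha, shape=K)
    w_raw = v * pt.concatenate([pt.ones(1), pt.cumprod(1.0 - v)[:-1]])
    w = pm.Deterministic("w", w_raw / w_raw.sum())  # renormalise the truncation

    # Component locations and scales of the (effectively infinite) mixture
    mu = pm.Normal("mu", 0.0, 5.0, shape=K)
    sigma = pm.HalfNormal("sigma", 2.0, shape=K)

    # Population layer; in the full EIV model the X's would be latent
    # rather than observed as they are in this toy demo
    pm.NormalMixture("x_pop", w=w, mu=mu, sigma=sigma, observed=x)
```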
