Today I read the new arXival by Leistedt et al. on the topic of photometric redshifting, in which a hierarchical model is proposed to include learning something of the inherent distribution of galaxies in redshift-magnitude-template space from the data itself: constraining the hyper-parameters of a distribution model and tightening the posteriors on each individual galaxy’s inferred redshift through Bayesian shrinkage.
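For intuition on the shrinkage effect, here is a toy Normal–Normal sketch in numpy (nothing like the paper's actual model; the hyper-parameters are assumed known rather than learned): conditioning each noisy observation on the population distribution pulls its estimate toward the population mean and reduces the error on average.

```python
# Toy analogue of Bayesian shrinkage: each "galaxy" has a latent value
# z_i ~ N(mu, tau^2), observed with noise sigma.  Given (mu, tau), the
# conjugate posterior mean pulls the raw observation toward mu.
import numpy as np

rng = np.random.default_rng(0)
mu, tau, sigma = 1.0, 0.2, 0.5            # hyper-parameters, assumed known here
z_true = rng.normal(mu, tau, size=1000)   # latent "redshifts"
z_obs = z_true + rng.normal(0, sigma, size=1000)

w = tau**2 / (tau**2 + sigma**2)          # shrinkage weight on the data
z_shrunk = w * z_obs + (1 - w) * mu       # posterior means given (mu, tau)

rmse = lambda est: np.sqrt(np.mean((est - z_true) ** 2))
print(rmse(z_obs), rmse(z_shrunk))        # shrinkage RMSE is smaller
```

In the hierarchical version (mu, tau) are themselves inferred from the ensemble, so the data collectively sharpen every individual posterior.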

I link to David van Dyk’s talk on this topic at the IAUS306 since afterwards I wrote to Narciso Benitez & David: “*I’m not sure if you’ve talked to each other at the meeting yet, but when I was watching Narciso’s presentation on the importance of the prior for photometric redshift estimation I thought it seemed a very natural place to apply the Bayesian shrinkage ideas that David presented. I.e., allow the data itself to improve the prior for both the redshift distribution and the proportion of each template type in the survey under study using a simple hierarchical model with some hyper parameters controlling the shape of p(z) and the probability of each template!*“. Though I do not believe they did end up collaborating …

All that to say, I think this is a good idea. With regards to the specifics, I think the proposed model in this particular paper is a little simplistic. For instance, the intrinsic distribution is formed by (arbitrarily) dividing the redshift-magnitude-template space into rectangular prisms, across each of which a constant contribution to the prior mass is assigned by the Dirichlet. Alternatively, one could avoid arbitrary binning by supposing a distribution on the partitioning (a la methods for treed Gaussian processes; Gramacy & Lee or Kim & Holmes) and then marginalising over this unknown partition (which, in the treed GP case, does tend to produce smooth marginals despite the inherently discontinuous nature of the sub-models). Moreover, a prior encoding stronger correlation between bins could well be motivated by the adjacency of the prisms in physical space. Another minor problem with the proposed implementation seems to be that, since the stated likelihood does not depend on the magnitude, each galaxy is presumably assigned to a bin solely according to its (noisily) observed flux in the reference band.
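To see why marginalising over the partition smooths things out, a quick numpy sketch (purely illustrative; the real treed-GP machinery is more involved): draw many random binnings of [0, 1] with Dirichlet bin weights, then average the resulting piecewise-constant densities. Each individual draw is blocky, but the marginal mean is far smoother.

```python
# Average piecewise-constant densities over random partitions:
# individual draws are discontinuous, the marginal mean is smooth.
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 200)
n_draws, n_bins = 500, 8
density_draws = np.empty((n_draws, grid.size))

for d in range(n_draws):
    # random partition of [0, 1] into n_bins cells
    edges = np.sort(np.concatenate(([0.0, 1.0], rng.uniform(size=n_bins - 1))))
    weights = rng.dirichlet(np.ones(n_bins))      # bin probability masses
    heights = weights / np.diff(edges)            # piecewise-constant density
    idx = np.clip(np.searchsorted(edges, grid, side="right") - 1, 0, n_bins - 1)
    density_draws[d] = heights[idx]

marginal = density_draws.mean(axis=0)

# Total variation: each draw is blocky, the average is much smoother.
tv_single = np.abs(np.diff(density_draws, axis=1)).sum(axis=1).mean()
tv_marginal = np.abs(np.diff(marginal)).sum()
print(tv_marginal, "<<", tv_single)
```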

On a more general level I wonder about the value of the inferred intrinsic distribution as compared to the posterior draws of the true redshift lists themselves. In this type of ‘deconvolution’ problem the posterior for the intrinsic distribution is typically highly prior- and model-sensitive (as in well-known mixture model examples). Reporting the posterior samples of the true redshifts along with their prior densities makes it easy for others to come along with their own priors/models for these latent variables and apply a simple reweighting to approximate their own posterior (as in my mixture modelling paper). Ultimately, for comparison to cosmological models, one would like a prior tuned (e.g.) for consistency with simulations of mock galaxies from that family of models, I would imagine. Likewise, one can compute whatever summary statistics concerning the redshift distribution one might be interested in as functionals over the true redshift samples.
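The reweighting step itself is essentially one line; a hypothetical numpy sketch (the sample values and both priors here are invented for illustration): given posterior samples drawn under a published prior, weights proportional to new-prior/old-prior evaluated at each sample approximate the posterior under the new prior, with the likelihood untouched.

```python
# Importance re-weighting of posterior samples under a new prior.
import numpy as np

rng = np.random.default_rng(2)

# Stand-in posterior draws for one galaxy's redshift, obtained under a
# broad uniform prior on [0, 3] (invented numbers for illustration).
z = rng.normal(1.2, 0.3, size=20000)
z = z[(z > 0) & (z < 3)]                       # restrict to old prior's support

p_old = np.full(z.size, 1.0 / 3.0)             # uniform(0, 3) density
p_new = np.exp(-0.5 * ((z - 1.0) / 0.5) ** 2)  # new Gaussian prior (unnormalised)

w = p_new / p_old                              # importance weights
w /= w.sum()

mean_new = np.sum(w * z)                       # posterior mean under new prior
print(mean_new)   # pulled from 1.2 toward the new prior's centre at 1.0
```

This only works, of course, if the original prior is published alongside the samples (and has support wherever the new prior does).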

Edit: I forgot to mention that I think the toy simulated dataset example should be augmented with a test of performance (in RMSE) on one of the photometric redshifting challenge datasets.

I agree strongly on the publish-your-priors point; we are huge fans of re-weighting posterior samples, as you know. Indeed, some of the probabilistic redshifts that have been released to date have been released *without* priors specified and are therefore (for most purposes) not useful.

Hi Ewan, thanks for your interesting comments! I’d like to clarify a few things:

The goal of this work is to demonstrate how existing infrastructure – here a template fitting method – can be simply expanded into a hierarchical model that allows one to infer the underlying distributions of types and redshifts. These are often estimated using incorrect methods that yield no estimates of the uncertainties whatsoever. Our method is a straightforward way to correctly do this and propagate redshift uncertainties into cosmological likelihoods/analyses.

In practice our binned/histogram model is rather specific *but* it is flexible enough to represent real redshift distributions of interest. Adopting a more generic representation would yield smoother distributions, which is nice but by no means required. With a few dozen bins a histogram representation can represent real redshift distributions and correctly capture all inter-bin correlations, which is all we need to propagate redshift uncertainties into a cosmological analysis.

Of course, if one wanted to encode stronger priors on the distributions and their smoothness, a different approach could be adopted. But again, this can be achieved with our model (using a Generalised Dirichlet and fine binning). And in general it is difficult to come up with good physical priors for the smoothness and shape of the redshift distributions without modelling the galaxy survey of interest in great detail.
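(To illustrate the Generalised Dirichlet option with a small numpy sketch, not code from the paper: it can be sampled by stick-breaking from independent Beta draws, so each bin gets its own pair of shape parameters, which permits richer correlation structure between bins; the standard Dirichlet is recovered as a special case.)

```python
# Stick-breaking sampler for the Generalised Dirichlet distribution.
import numpy as np

def generalised_dirichlet(a, b, size, rng):
    """Draw bin-probability vectors from GD(a, b) via stick-breaking."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    betas = rng.beta(a, b, size=(size, a.size))   # independent Beta draws
    remaining = np.cumprod(1.0 - betas, axis=1)   # unbroken stick lengths
    probs = np.empty((size, a.size + 1))
    probs[:, 0] = betas[:, 0]
    probs[:, 1:-1] = betas[:, 1:] * remaining[:, :-1]
    probs[:, -1] = remaining[:, -1]               # last bin takes the remainder
    return probs

rng = np.random.default_rng(3)
draws = generalised_dirichlet(a=[2, 5, 1, 3], b=[8, 5, 9, 7], size=1000, rng=rng)
print(draws.sum(axis=1))   # each draw sums to 1
```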

Your criticism of the likelihood (which indeed does not depend on the magnitude) is justified but again we simply adopted an existing template-fitting method/likelihood function. Our point is that our model can be interfaced with any flux-redshift likelihood. This one (from BPZ) is by no means ideal but is very widespread.

Finally, I totally agree that reporting prior and posterior samples is crucial and should be the baseline for all methods. But eventually cosmological analyses require redshift distributions. Our method is probably the simplest way to extend existing template-fitting methods and interface them with existing cosmological likelihoods to correctly sample redshift distributions.

Hiranya, Daniel and I would be happy to discuss this in further detail!

Boris