A recent arXival presents a family of conditional density estimation (CDE) methods (the authors' favourites) as a system for (A) propagating uncertainties downstream in astronomical analyses and (B) conducting likelihood-free inference in astronomy. While some of the individual methods and tools have merits, I feel that the whole package is grossly over-sold in this context and that the examples presented don't stand up to scrutiny. There are some applications where this might be the right approach, but there are plenty of applications (including some of those suggested in the paper) where it will not be. For that reason it would have been better if the authors had provided a balanced comparison against alternatives and related methods. So, to specifics …

My main gripe is that the authors put a lot of emphasis on the advantages of the (estimator of!) their preferred CDE loss function, but nowhere examine: (i) how close their estimator of the CDE loss is to the actual loss; (ii) how close their estimated-loss-minimising method is to the true posterior; or (iii) what implications this loss function might have for model building for the purpose of error propagation. [The first two of these are similar to concerns raised on Xi'an's 'og with regard to an earlier paper by some of the authors presenting these ideas.] Instead, in Example 1 the authors claim that a goodness-of-fit test can't distinguish a clearly nonsense model from a good model, but that their CDE loss can—when, in fact, the supposed goodness-of-fit test that fails here is the PIT test, which is not a goodness-of-fit test at all; it is a means to investigate an aspect of Bayesian uncertainty calibration, not really accuracy in location.

In Example 2 the authors examine their RFCDE approach as a post-adjustment method for ABC posteriors and establish its supposed superiority according to their estimator of the CDE loss function. But if you visually compare (by inspecting their Figure 3) the RFCDE posterior approximation against the raw ABC posterior, it is clear that their method's contribution is to reshape the ABC posterior so as to place vanishingly small amounts of posterior mass on the ground-truth line of degeneracy (along which the generated data are indistinguishable) in some places (e.g. for all of ). Is that really an improvement?
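To illustrate the distinction being blurred here, consider a toy sketch (all names and numbers below are my own illustrative assumptions, not taken from the paper): a "nonsense" model that ignores the covariate entirely can still produce a perfectly uniform PIT histogram—the PIT checks calibration, not location accuracy—while the standard estimator of the CDE loss (the quantity the authors minimise, up to an additive constant) does separate it from the true conditional density.

```python
# Toy demonstration: the PIT can be uniform for a model that ignores the
# data entirely, while a CDE-style loss estimator does distinguish it.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(0.0, 1.0, n)   # covariate
z = rng.normal(x, 1.0, n)     # truth: z | x ~ N(x, 1)

# Model A: the true conditional density, N(x, 1).
# Model B: the marginal density N(0, sqrt(2)), which ignores x completely.
pit_a = norm.cdf(z, loc=x, scale=1.0)
pit_b = norm.cdf(z, loc=0.0, scale=np.sqrt(2.0))

# Both PIT samples are (close to) uniform on [0, 1]: a PIT histogram
# cannot tell the useless marginal model from the correct conditional one.
print(np.histogram(pit_a, bins=10, range=(0, 1))[0] / n)
print(np.histogram(pit_b, bins=10, range=(0, 1))[0] / n)

# The usual CDE loss estimator, up to an additive constant:
#   L(f) ~ mean_i [ integral f(z|x_i)^2 dz ] - 2 mean_i f(z_i|x_i)
def cde_loss(mu, sigma):
    # For a Gaussian, integral N(z; mu, sigma)^2 dz = 1 / (2 sigma sqrt(pi))
    term1 = 1.0 / (2.0 * sigma * np.sqrt(np.pi))
    term2 = norm.pdf(z, loc=mu, scale=sigma)
    return np.mean(term1 - 2.0 * term2)

print(cde_loss(x, 1.0))             # model A: lower (better) loss
print(cde_loss(0.0, np.sqrt(2.0)))  # model B: higher loss
```

So the CDE loss does have discriminating power that the PIT lacks—but that is an argument against calling the PIT a goodness-of-fit test, not evidence that the loss estimator tracks distance to the true posterior.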

With regard to the topic of error propagation, I'm not sure that approaches which add kernel density estimators actually improve on point-cloud-based posterior summaries or other representations (such as clever parametric summaries). A previous application of random forests to reconstruct the Bayesian posterior as a point cloud (the first step of RFCDE, before the final KDE is added) in an astronomical context can be seen here. (Although note that the Figures and Results shown in that paper were not actually created with the exact method referred to in the paper, which is another discussion entirely!) The topic that really needs investigating in this area is how to construct efficient posterior representations that allow for prior reweighting: for instance, individual photo-z posteriors fit under a simple prior might later be reweighted to reflect a model with shrinkage based on spatial clustering. Draws from nested sampling help here because they naturally include samples from a broader region of parameter space than a purely posterior-focussed sampler would provide. But there can indeed be challenges in storing the draws for fits to a large catalog of objects, so maybe we're back to clever parametric summaries that allow some separation of the prior and likelihood contributions?
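To make the prior-reweighting idea concrete, here is a minimal sketch (the distributions and numbers are my own illustrative assumptions, not from any of the papers discussed): stored posterior draws fit under a flat prior are importance-reweighted to an informative prior, and the effective sample size shows why a purely posterior-focussed point cloud can run out of usable draws.

```python
# Sketch: reweighting a stored point-cloud posterior to a new prior via
# importance weights w_i proportional to new_prior(z_i) / old_prior(z_i).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Suppose a photo-z fit was run under a flat prior on z in [0, 2] and we
# stored the posterior as a point cloud; the likelihood peaks near z = 1.2.
z_draws = rng.normal(1.2, 0.2, 5000)  # stand-in posterior draws
z_draws = z_draws[(z_draws > 0) & (z_draws < 2)]

# Later we want the posterior under an informative prior N(0.8, 0.3),
# reflecting, say, shrinkage from spatial clustering.
old_prior = 0.5                       # flat density on [0, 2]
new_prior = norm.pdf(z_draws, 0.8, 0.3)
w = new_prior / old_prior
w /= w.sum()

# The reweighted posterior mean shifts toward the new prior...
print(np.average(z_draws, weights=w))

# ...but the effective sample size drops whenever the stored draws put
# little mass where the new prior lives -- the storage/coverage problem.
ess = 1.0 / np.sum(w**2)
print(ess, len(z_draws))
```

The effective-sample-size collapse is exactly where nested-sampling draws (which cover a broader swathe of the prior) would help, at the cost of storing more draws per object.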
