I spent last week in Paris visiting the Institut Pasteur for a Gates Foundation workshop organised to focus thinking around the problem of designing a Target Product Profile for a ‘commercial’ malaria serology test. On my journey back I read a recent arXival by Olamaie et al. on the topic of cluster modelling with electron pressure profiles. This one caught my eye because the authors describe the use of Bayesian model selection to determine the number of knots (in a simple interpolating function for the radial profile shape) warranted by the complexity of the available data. In this case it is imagined that the positions of the knots and the amplitude of the cluster profile at each knot location are free parameters and that the cluster profile is linearly interpolated between knots. While this produces a model with discontinuous gradient, I actually don’t mind that aspect, since these types of models nevertheless tend to produce smooth posterior aggregates (e.g. posterior median curves; see, for instance, the Voronoi tessellation GP model of Kim et al.).
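To make the construction concrete, here is a minimal sketch of a knot-based profile as I read it from the description above: free knot radii, free knot amplitudes, and linear interpolation in between. The function name `knot_profile` and the example numbers are my own illustrative assumptions, not taken from the paper, and I interpolate in linear radius here purely for simplicity.

```python
import numpy as np

def knot_profile(r, knot_radii, knot_amplitudes):
    """Evaluate a piecewise-linear radial profile defined by knots.

    Hypothetical helper for illustration: knots are sorted by radius
    (the ordering constraint), then linearly interpolated.
    """
    knot_radii = np.asarray(knot_radii, dtype=float)
    knot_amplitudes = np.asarray(knot_amplitudes, dtype=float)
    order = np.argsort(knot_radii)  # enforce increasing radial order
    return np.interp(r, knot_radii[order], knot_amplitudes[order])

# Example with three made-up knots (arbitrary units)
r_grid = np.linspace(0.1, 2.0, 5)
profile = knot_profile(r_grid, [0.1, 0.5, 2.0], [10.0, 4.0, 0.5])
```

The gradient of `profile` jumps at each knot, but averaging many such curves over a posterior for the knot positions and amplitudes will generally smooth those kinks out.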

Where I see some issues with the method is in the choice of priors for the knot amplitudes, which are uniform and independent of knot position or ordering. An ordering *is* discussed for the radial location of each knot, but as far as I can see this doesn’t affect the prior on each knot’s amplitude, and indeed is there simply to avoid an identifiability ‘problem’ (similar to the labelling in mixture models; i.e., this is only a ‘problem’ for some computational methods of marginal likelihood estimation rather than being an inherent flaw in the Bayesian construction). My concern is that the use of these vague uniform priors is inconsistent with our expectations of a broadly decreasing profile; applying these priors to real data will ultimately skew the posterior in model space towards small numbers of knots where signal-to-noise is low. In particular, I would suggest that this construction will produce an overly simplified representation of profiles in both the cluster core and cluster outskirts. To play “Devil’s Advocate” (& based on my experience with Sérsic profile fitting) I would suggest that the ordinary parametric forms may well perform better in this type of application; at least, there are no examples in the manuscript that even attempt to convince me otherwise.
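A quick prior simulation illustrates the mismatch. If the amplitudes at K radially ordered knots are drawn independently from the same continuous prior, the chance that they come out strictly decreasing is 1/K!, whatever the prior bounds. The sketch below checks this by Monte Carlo; the Uniform(0, 1) range and the function name `fraction_decreasing` are my own assumptions for illustration, not the priors used in the paper.

```python
import numpy as np
from math import factorial

def fraction_decreasing(n_knots, n_draws=200_000, seed=0):
    """Fraction of prior draws whose knot amplitudes decrease with radius.

    Assumes independent Uniform(0, 1) amplitudes at radially ordered
    knots (an illustrative stand-in for vague uniform priors).
    """
    rng = np.random.default_rng(seed)
    amps = rng.uniform(0.0, 1.0, size=(n_draws, n_knots))
    decreasing = np.all(np.diff(amps, axis=1) < 0.0, axis=1)
    return decreasing.mean()

for k in (2, 3, 4, 5):
    print(k, fraction_decreasing(k), 1.0 / factorial(k))
```

So as knots are added, an ever smaller share of the prior volume corresponds to the profiles we actually expect, which is one way to see why the evidence would penalise extra knots unless the data are very informative.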

If I were conducting the analysis I would be interested in models taking some prior information from the available theoretical parametric forms; for example, building a model from GP deviations about a canonical profile. If such a model were not considered fast enough to fit generally, then one could perhaps at least use it to learn some more refined priors on the knot construction.
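As a rough sketch of what I have in mind, the code below draws profiles from a prior in which the log-profile is a canonical parametric curve plus a zero-mean Gaussian process in log-radius. Everything specific here is an assumption on my part: the GNFW-like shape and its parameter values are only a stand-in for whichever canonical form one prefers, and the squared-exponential kernel with `sigma=0.2` and `length=1.0` is an arbitrary choice.

```python
import numpy as np

def canonical_profile(r, p0=1.0, r_s=1.0, a=1.0, b=5.0, c=0.3):
    """GNFW-like shape, used only as a stand-in canonical form."""
    x = r / r_s
    return p0 / (x**c * (1.0 + x**a) ** ((b - c) / a))

def sq_exp_kernel(log_r, sigma=0.2, length=1.0):
    """Squared-exponential covariance in log-radius (assumed kernel)."""
    d = log_r[:, None] - log_r[None, :]
    return sigma**2 * np.exp(-0.5 * (d / length) ** 2)

def draw_profiles(r, n_draws=5, seed=0):
    """Draw prior profiles: canonical curve times exp(GP deviation)."""
    rng = np.random.default_rng(seed)
    cov = sq_exp_kernel(np.log(r)) + 1e-8 * np.eye(r.size)  # jitter for stability
    chol = np.linalg.cholesky(cov)
    deviations = (chol @ rng.standard_normal((r.size, n_draws))).T
    return canonical_profile(r) * np.exp(deviations)

r = np.geomspace(0.05, 3.0, 30)
draws = draw_profiles(r)  # shape (n_draws, len(r))
```

Evaluating such draws at candidate knot radii would give an empirical joint distribution over knot amplitudes, which is one possible route to the more refined knot priors mentioned above.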

Incidentally: [removed]