A bold proclamation by Pflamm-Altenburg et al. (PA et al. hereafter) today (link): that the theory of pure stochastic star formation for the initial (star) cluster mass function (ICMF) is ruled out by the Sharma et al. (2011) mass-radius dataset.
The ICMF is thought to be something like a power-law distribution with slope beta=2 (i.e., p(m) proportional to m^-beta). For beta>1 (which includes beta=2) we should take it as implicit that there exists some lower mass threshold to this distribution such that the density is normalizable; in practice, though, we can only observe down to a certain lower mass limit anyway, due to selection effects and so on. While an upper limit is unnecessary from such normalization arguments, there seems to be an assumption in the literature that one exists (physics, I guess), and the major debate (at least in this paper) therefore centers on whether or not this upper limit is the same in all environments. PA et al. aim to demonstrate that the upper limit depends on galactocentric distance.
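To make the normalization point concrete, here is a minimal Python sketch of drawing cluster masses from such a power law by inverse-CDF sampling. The function name `sample_power_law` and its defaults are my own illustrative choices (the 600 Msun default just echoes the completeness limit discussed further down); nothing here is taken from the paper.

```python
import numpy as np

def sample_power_law(n, beta=2.0, m_min=600.0, m_max=np.inf, rng=None):
    """Draw n masses from p(m) proportional to m^-beta on [m_min, m_max].

    Inverse-CDF sampling; valid for beta != 1. With beta > 1 the upper
    limit may be infinite, but a finite lower limit is always required.
    """
    if beta == 1.0:
        raise ValueError("beta = 1 needs the logarithmic form of the CDF")
    rng = np.random.default_rng(rng)
    a = 1.0 - beta
    lo = m_min ** a
    hi = m_max ** a            # evaluates to 0.0 when m_max is infinite and beta > 1
    u = rng.random(n)          # uniform draws on [0, 1)
    return (lo + u * (hi - lo)) ** (1.0 / a)

# Hypothetical usage: one bin of 17 clusters, with and without an upper cut-off.
no_cutoff = sample_power_law(17, rng=42)
with_cutoff = sample_power_law(17, m_max=1e5, rng=42)
```

Setting `m_max` to a finite value only changes the constant `hi`, which is one way to see why the cut-off mostly matters for the very top of the ranked masses.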
Their first (correct) step is to observe that if one looks at fixed bins of galactocentric distance then the mass of the i-th most massive cluster will follow the distribution of the corresponding order statistic, and therefore will depend on the number of clusters in the bin. Therefore, if we’re trying to figure out what the mass cut-off is doing by comparing, say, the distributions of i-th ranked clusters, we’d better choose bins containing equal numbers of clusters. So far so good.
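A quick simulation shows how strongly the ranked masses depend on bin occupancy. This is a sketch under my own assumptions: a pure beta=2 power law with no upper cut-off, a 600 Msun lower limit, and a hypothetical helper `ranked_mass_median`.

```python
import numpy as np

def ranked_mass_median(n_clusters, rank=1, beta=2.0, m_min=600.0,
                       n_trials=20000, seed=0):
    """Median mass of the rank-th most massive cluster in bins of n_clusters.

    Masses are drawn from a pure power law (beta > 1, no upper limit).
    """
    rng = np.random.default_rng(seed)
    u = rng.random((n_trials, n_clusters))
    masses = m_min * (1.0 - u) ** (-1.0 / (beta - 1.0))  # inverse CDF
    ranked = -np.sort(-masses, axis=1)                   # descending within each bin
    return float(np.median(ranked[:, rank - 1]))

# Same underlying mass function, different numbers of clusters per bin.
for n in (17, 34, 68):
    print(n, ranked_mass_median(n, rank=1), ranked_mass_median(n, rank=5))
```

With identical physics in every bin, the typical top-ranked mass still climbs as the bin gets more populous, which is exactly why unequal bins would fake a trend.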
To this end PA et al. examine the masses of each of the 1st, 2nd, …, 5th ranked clusters in a series of galactocentric radius bins of equal size (i.e., equal number of clusters per bin); this is shown in their Fig 3. Looking at this data it does seem like at least the 2nd, …, 5th ranked masses are declining with galactocentric distance, although the 1st ranked (most massive) appears relatively independent of distance. To my mind this would suggest that it’s not necessarily the upper mass limit we’d better be worrying about (contrary to the title of their paper), but rather the beta of the power law that seems to be changing.
To quantify these observations PA et al. perform a series of least squares linear fits to the trends of 1st, …, 5th ranked masses against galactocentric distance. This is the second point of departure between my thinking and theirs. If you’re going to look at trends in quantiles you’d better use quantile regression (cf. anything by Roger Koenker [who’s a thoroughly nice guy, btw]). This avoids binning and gives the appropriate weighting to each datapoint: whereas least squares penalises the squared residual (y-yhat)^2, quantile regression penalises the absolute residual |y-yhat|, weighted by tau for points above the fit and by 1-tau for points below it (so for the median, tau=0.5, it reduces to a plain absolute difference).
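For anyone who hasn't met it, the loss behind quantile regression is easy to write down. The sketch below uses toy lognormal "masses" of my own invention, not the Sharma et al. data, and shows that minimising this loss over a single constant recovers the sample quantile, just as minimising squared error recovers the mean.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Check (pinball) loss: weight tau above the fit, 1 - tau below it."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(np.where(r >= 0, tau * r, (tau - 1.0) * r)))

def best_constant(y, tau):
    """Constant minimising the pinball loss; a minimiser always sits on a data value."""
    losses = [pinball_loss(y - c, tau) for c in y]
    return y[int(np.argmin(losses))]

rng = np.random.default_rng(1)
y = rng.lognormal(mean=7.0, sigma=1.0, size=201)   # skewed toy sample

median_fit = best_constant(y, 0.5)   # the sample median
upper_fit = best_constant(y, 0.9)    # roughly the 0.9 sample quantile
mean_fit = y.mean()                  # what a least squares constant would give
```

In a skewed sample like this the least squares answer is dragged around by the few most massive objects, while the quantile fits are not.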
In the figure below I show the results of a quantile regression applied to the data from PA et al. At each quantile value we fit an intercept and slope and plot the p-value of the slope significance (recovered by bootstrapping in the quantreg package for R). A significant slope has p < 0.05; while at p > 0.1 we’re probably just fitting noise. Reassuringly, given the number of Monte Carlo simulations PA et al. did to verify their fitting significances, we get relatively similar results: there does seem to be a highly significant change in the distribution of cluster masses around the intermediate quantiles (not the upper mass limit) in this dataset.
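The figure itself came from quantreg in R, but the idea can be sketched in Python. What follows is an illustration on invented data, not my actual analysis code: `quantile_line` is a brute-force fit that relies on the standard result that an optimal quantile regression line can be found passing through two of the data points, and `slope_p_value` uses a pairs bootstrap with a normal approximation, which is only loosely analogous to what quantreg reports.

```python
import math
import numpy as np

def quantile_line(x, y, tau):
    """Linear quantile fit by trying the line through every pair of points."""
    i, j = np.triu_indices(len(x), k=1)
    ok = x[i] != x[j]                       # skip duplicated x (bootstrap repeats)
    i, j = i[ok], j[ok]
    slope = (y[j] - y[i]) / (x[j] - x[i])
    intercept = y[i] - slope * x[i]
    resid = y[None, :] - intercept[:, None] - slope[:, None] * x[None, :]
    loss = np.where(resid >= 0, tau * resid, (tau - 1.0) * resid).sum(axis=1)
    k = int(np.argmin(loss))
    return intercept[k], slope[k]

def slope_p_value(x, y, tau, n_boot=200, seed=0):
    """Slope and two-sided p-value from a pairs bootstrap standard error."""
    rng = np.random.default_rng(seed)
    slope = quantile_line(x, y, tau)[1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))   # resample (x, y) pairs
        boot[b] = quantile_line(x[idx], y[idx], tau)[1]
    z = slope / boot.std(ddof=1)
    return slope, math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation

# Invented data: log-mass declining with galactocentric radius, plus scatter.
rng = np.random.default_rng(3)
r_gal = rng.uniform(1.0, 10.0, size=40)
log_m = 4.0 - 0.15 * r_gal + rng.normal(0.0, 0.3, size=40)
for tau in (0.25, 0.5, 0.75):
    print(tau, slope_p_value(r_gal, log_m, tau))
```

The brute-force search costs O(n^3), so it is only sensible for samples of this size; for real work a linear programming solver such as the one inside quantreg is the better choice.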
However, the PA et al. analysis is focussed entirely on what’s going on above the 0.7 quantile (5th most massive cluster out of 17). The point they neglect is that these results depend most sensitively on the reliability of our mass estimates and completeness of the sample at the low mass end! That is, there’s a lot riding on their assumption that 600 Msun is an appropriate completeness limit for this sample at all galactocentric distances. If we truncate our sample at slightly higher mass limits of 1000 Msun or 1500 Msun the significance of these results is not nearly as impressive (the lower panels of the above figure). True, part of the loss of significance goes with the reduction in sample size, but not all of it. The point is that with these power law distribution functions completeness and Malmquist-type biases are something to be very concerned with, even when we’re focussing on the distribution of the upper i-th ranked masses.
For this reason I really do wonder if PA et al. might one day find themselves rueing the hubris of their couplet: “Pure stochastic star formation is thereby ruled out. We use this example to elucidate how naive analysis of data can lead to unphysical conclusions.” Hmmm...