# Why random walk MCMC?

The motivation for first attempting a proof of convergence for NS with random walk MCMC is that this is perhaps the simplest non-trivial option, and it is the one used in a number of classic examples (such as the “lighthouse” problem).

I say non-trivial because of course there is one surefire, but trivial, way to get an unbiased draw from $\pi(\cdot)I(L(\cdot)>L^\ast)$: draw from the prior relentlessly until a point is found above the current likelihood limit. Each prior draw is accepted only with probability equal to the prior mass above that limit, so the efficiency of this algorithm goes down the drain whenever the posterior mass is markedly more concentrated than the prior; i.e., in any interesting Bayesian analysis.
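To make that cost concrete, here is a minimal sketch of the trivial scheme on a toy problem of my own choosing (a Uniform(-1, 1) prior with a narrow Gaussian likelihood; the names `draw_above_limit`, `prior_draw`, `log_like` and `SIGMA` are all illustrative assumptions, not taken from any NS package). It simply counts how many prior draws are needed per accepted point as the likelihood limit rises.

```python
import numpy as np


def draw_above_limit(rng, prior_draw, log_like, log_l_star, max_tries=1_000_000):
    """Draw from the prior until a point exceeds the likelihood limit.

    Returns the accepted point and the number of prior draws it took.
    """
    for n_tries in range(1, max_tries + 1):
        theta = prior_draw(rng)
        if log_like(theta) > log_l_star:
            return theta, n_tries
    raise RuntimeError("no point found above the likelihood limit")


# Toy problem (an assumption for illustration): Uniform(-1, 1) prior and a
# narrow Gaussian likelihood of width SIGMA centred on zero.
SIGMA = 0.01


def prior_draw(rng):
    return rng.uniform(-1.0, 1.0)


def log_like(theta):
    return -0.5 * (theta / SIGMA) ** 2


rng = np.random.default_rng(0)
results = {}
for log_l_star in (-5000.0, -50.0, -0.5):
    # The prior mass above these limits is roughly 1, 0.1 and 0.01.
    tries = [draw_above_limit(rng, prior_draw, log_like, log_l_star)[1]
             for _ in range(200)]
    results[log_l_star] = float(np.mean(tries))
    print(f"log L* = {log_l_star:8.1f}: mean prior draws per point = "
          f"{results[log_l_star]:.1f}")
```

The number of draws per accepted point is geometric with success probability equal to the enclosed prior mass, so the mean count should grow roughly as the reciprocal of that mass.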

Interestingly, in cosmology a popular technique for (hopefully) drawing from $\pi(\cdot)I(L(\cdot)>L^\ast)$ has been ellipse-based nested sampling (see e.g. Mukherjee et al. 2005; Feroz et al. 2008): i.e., fitting an ellipse (or in >2 dimensions an ellipsoid) around the current set of live particles, expanding it a little in the hope of covering the target support, and then rejection sampling within it until a point above the current likelihood limit is found. Unlike the trivial solution noted above, this one should not drastically decrease in efficiency as NS proceeds. However, there is no guarantee that the ellipse used for rejection sampling will cover the target support; hence sampling of the replacement point may be biased, with the effect that NS proceeds inwards (to high likelihood regions) too quickly.
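The ellipse-based step can be sketched roughly as follows. This is my own simplified single-ellipsoid version, not the code of either paper cited above: it assumes a uniform prior on the unit hypercube, fits the ellipsoid from the sample mean and covariance of the live particles, and uses a hypothetical linear expansion factor `enlarge`. The toy likelihood and the stand-in live particles are also assumptions for illustration.

```python
import numpy as np


def fit_ellipsoid(live, enlarge=1.2):
    """Fit mean and covariance to the live points, scaled to enclose them.

    `enlarge` is a hypothetical linear expansion factor (> 1).
    """
    mean = live.mean(axis=0)
    cov = np.cov(live, rowvar=False)
    diff = live - mean
    # Squared Mahalanobis distance of every live point from the mean.
    d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    radius = enlarge * np.sqrt(d2.max())
    return mean, np.linalg.cholesky(cov), radius


def draw_in_ellipsoid(rng, mean, chol, radius):
    """Uniform draw from {x : (x - mean)^T cov^{-1} (x - mean) <= radius^2}."""
    dim = mean.size
    z = rng.standard_normal(dim)
    # Random direction times radius**(1/dim) gives a uniform point in the unit ball.
    y = z / np.linalg.norm(z) * rng.uniform() ** (1.0 / dim)
    return mean + radius * (chol @ y)


def ellipsoid_replacement(rng, live, log_like, log_l_star, enlarge=1.2,
                          max_tries=100_000):
    """Rejection sample inside the ellipsoid until L > L* (unit-cube prior)."""
    mean, chol, radius = fit_ellipsoid(live, enlarge)
    for n_tries in range(1, max_tries + 1):
        theta = draw_in_ellipsoid(rng, mean, chol, radius)
        inside_prior = np.all((theta >= 0.0) & (theta <= 1.0))
        if inside_prior and log_like(theta) > log_l_star:
            return theta, n_tries
    raise RuntimeError("no point found above the likelihood limit")


# Toy setup (assumed): Gaussian likelihood centred in the unit square.
SIGMA = 0.05


def log_like(theta):
    return -0.5 * np.sum((theta - 0.5) ** 2, axis=-1) / SIGMA ** 2


rng = np.random.default_rng(1)
log_l_star = -2.0  # the region L > L* is a disc of radius 0.1
candidates = 0.5 + rng.uniform(-0.1, 0.1, size=(2000, 2))
live = candidates[log_like(candidates) > log_l_star][:100]  # stand-in live set

theta, n_tries = ellipsoid_replacement(rng, live, log_like, log_l_star)
print(theta, n_tries)
```

Note that nothing in this sketch checks whether the fitted ellipsoid actually contains all of the region above the likelihood limit; any part of that region left outside is simply never proposed, which is exactly the source of the bias described above.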

A natural fix for this problem, and one which avoids the need to specify priors amenable to a transformation of the parameter space to the unit hypercube, is either independence sampling MCMC or rejection sampling MCMC (covered in Sections 2.3.3 and 2.3.4 of Tierney 1994). In the first case one might choose a proposal density from a parametric family (e.g. [multivariate] Normal or Student’s t) fitted/optimised to the current set of live particles (à la adaptive importance sampling). In the second case one might take the prior and multiply it by a logistic windowing function taking values close to 1 inside an ellipse around the live particles and decreasing away from this ellipse, but bounded below by some $c > 0$ so that the target support is always covered at some non-trivial level. Not knowing the normalization of this proposal distribution is not a problem for rejection sampling MCMC.
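As a rough illustration of the first option only, here is a minimal sketch of an independence Metropolis-Hastings chain targeting $\pi(\cdot)I(L(\cdot)>L^\ast)$ with a multivariate Normal proposal fitted to the live particles. The function name `independence_sampler`, the covariance inflation factor `inflate`, the number of steps and the toy prior and likelihood are all assumptions of mine; the rejection sampling MCMC variant is not shown.

```python
import numpy as np


def independence_sampler(rng, start, live, log_prior, log_like, log_l_star,
                         n_steps=50, inflate=2.0):
    """Independence MH chain with a Normal proposal fitted to the live points.

    `inflate` (hypothetical) widens the fitted covariance so the proposal is
    somewhat broader than the live-particle cloud.
    """
    mean = live.mean(axis=0)
    chol = np.linalg.cholesky(inflate * np.cov(live, rowvar=False))

    def log_q(theta):
        # Normal log-density up to a constant, which cancels in the MH ratio.
        w = np.linalg.solve(chol, theta - mean)
        return -0.5 * (w @ w)

    theta = np.array(start, dtype=float)
    n_accept = 0
    for _ in range(n_steps):
        proposal = mean + chol @ rng.standard_normal(mean.size)
        if log_like(proposal) <= log_l_star:
            continue  # the target density is zero here, so reject outright
        log_ratio = (log_prior(proposal) - log_prior(theta)
                     + log_q(theta) - log_q(proposal))
        if np.log(rng.uniform()) < log_ratio:
            theta = proposal
            n_accept += 1
    return theta, n_accept


# Toy setup (assumed): standard Normal prior in 2-d, used directly with no
# unit-hypercube transformation, and a Gaussian likelihood centred at (1, 1).
CENTRE = np.array([1.0, 1.0])
SIGMA = 0.2


def log_prior(theta):
    return -0.5 * np.sum(theta ** 2, axis=-1)


def log_like(theta):
    return -0.5 * np.sum((theta - CENTRE) ** 2, axis=-1) / SIGMA ** 2


rng = np.random.default_rng(2)
log_l_star = -2.0
# Exact draws from the constrained prior, obtained here by brute force.
prior_draws = rng.standard_normal((20_000, 2))
live = prior_draws[log_like(prior_draws) > log_l_star][:100]

theta, n_accept = independence_sampler(rng, live[0], live, log_prior,
                                       log_like, log_l_star)
print(theta, n_accept)
```

Because the chain starts at a live particle and only ever accepts proposals above the limit, every state it visits satisfies the likelihood constraint. Since the Normal proposal has full support, it covers the target region in principle, although parts that are poorly represented by the live particles will be proposed only rarely.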

In any case, since these alternative proposals are quite feasible and not obviously worse than random walk MCMC (perhaps better), I would hope eventually to include them in a convergence proof once the random walk MCMC case is complete.