Ensemble sampler algorithm
The ensemble sampler algorithm uses a group of Markov chains to explore the parameter space. Each chain $j$ starts at a point $\phi_{j,t}$ in the parameter space and is evolved using the complementary chains of the ensemble.
Christen (2006) found that it is possible to evolve the chain $\phi_{j,t}$ to the state $t+1$ via a *walk move* that uses a complementary chain of the ensemble. Goodman & Weare (2010) used the idea of the walk move to construct an affine-invariant move called the *stretch move*. The stretch move for the chain $\phi_{j,t}$ is defined as

$$ \phi_{j,t+1} = \phi_{k,t} + z \left( \phi_{j,t} - \phi_{k,t} \right), $$

where $\phi_{k,t}$ is a complementary chain of the ensemble, such that $j \neq k$, and $z$ is a scaling variable that regulates the step. This scaling variable has to be drawn from a density distribution $g$ that satisfies the symmetry condition (Christen 2006)

$$ g(z^{-1}) = z \, g(z). $$
A distribution that satisfies this condition is

$$ g(z) \propto \frac{1}{\sqrt{z}}, \qquad z \in \left[ \frac{1}{a},\, a \right], $$

where $a > 1$. There is no optimal value for $a$, but we set $a = 2$ to be consistent with ensemble sampler algorithms in the literature (e.g., Goodman & Weare 2010; Hou et al. 2012). To ensure that the chains sample the invariant (posterior) distribution, we have to compute the ratio

$$ q = z^{\,N-1} \, \frac{p\!\left(\phi_{j,t+1}\right)}{p\!\left(\phi_{j,t}\right)}, $$
where $p$ denotes the posterior density and $N$ is the number of fitted parameters (the dimension of the parameter space). The term $z^{N-1}$ ensures detailed balance (for more details see Goodman & Weare 2010). To decide whether or not we accept the proposed state, we accept it if $U \leq q$, where $U$ is a random number drawn uniformly from the interval $[0,1]$; otherwise the chain remains at $\phi_{j,t}$. After a given number of iterations $I$ with $L$ chains, we will have $I \times L$ samples for each parameter, from which we can create posterior distributions. Below you can see a general overview of the evolution of the ensemble.
The following Figure shows an example of the evolution of six chains of the ensemble sampler algorithm. The chains start at different points in the parameter space. After a finite number of iterations (in this case a few hundred), the chains converge to a stable region of the parameter space. Details on how we create marginal posterior distributions from the chains' samples are provided below. Another advantage of this approach is that the chains can be evolved in parallel, since one half of the ensemble can be updated using the other half as the complementary set (see Foreman-Mackey et al. 2013).
Example of the evolution of an ensemble sampler algorithm using six chains. Upper panel: parameter value for each chain from iteration 0 to 5000, on a logarithmic scale. Lower left panel: behavior of the chains during the last 4000 iterations. Lower right panel: histogram created using the samples of all the chains in the last 4000 iterations.
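To make the stretch move concrete, below is a minimal Python sketch of one iteration of the ensemble sampler under the equations above. It is only an illustration, not the pyaneti implementation; the names `stretch_move_step`, `sample_z`, and `log_posterior` are placeholders.

```python
import numpy as np

def sample_z(a=2.0, rng=np.random):
    """Draw z from g(z) ~ 1/sqrt(z) on [1/a, a] using inverse-CDF sampling."""
    u = rng.uniform()
    return (1.0 + (a - 1.0) * u) ** 2 / a

def stretch_move_step(ensemble, log_posterior, a=2.0, rng=np.random):
    """Evolve every chain of the ensemble by one stretch-move iteration.

    ensemble      : array of shape (L, N), L chains with N parameters each.
    log_posterior : callable returning log p(phi) for a parameter vector phi.
    """
    L, N = ensemble.shape
    new_ensemble = ensemble.copy()
    for j in range(L):
        # pick a complementary chain k != j
        k = rng.randint(L - 1)
        if k >= j:
            k += 1
        z = sample_z(a, rng)
        proposal = ensemble[k] + z * (ensemble[j] - ensemble[k])
        # acceptance ratio q = z^(N-1) p(proposal)/p(current), evaluated in log space
        log_q = (N - 1) * np.log(z) + log_posterior(proposal) - log_posterior(ensemble[j])
        if np.log(rng.uniform()) <= log_q:
            new_ensemble[j] = proposal
    return new_ensemble

# Example: six chains exploring a 2-D Gaussian log-posterior
log_post = lambda phi: -0.5 * np.sum(phi ** 2)
chains = np.random.uniform(-5.0, 5.0, size=(6, 2))
for _ in range(5000):
    chains = stretch_move_step(chains, log_post)
```

In a parallel run, the two halves of the ensemble can be updated simultaneously, each using the other half as its complementary set, which is why the cost per iteration scales well with the number of cores.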
In order to infer the parameter values from an MCMC sampling, we need to use chains that have converged. A widely used convergence test was developed by Gelman & Rubin (1992). This test compares the between-chain variance $B$ and the within-chain variance $W$ via the scaled potential factor

$$ \hat{R} = \sqrt{ \frac{ \frac{n-1}{n} W + \frac{1}{n} B }{ W } }, $$

where $n$ is the length of each chain. We consider that the chains have converged when $\hat{R} < 1.02$ for all the parameters.
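As an illustration, a minimal Python sketch of this statistic for a single parameter could look as follows; the function name `gelman_rubin` is a placeholder, and pyaneti's own implementation may differ in details (e.g., degrees-of-freedom corrections).

```python
import numpy as np

def gelman_rubin(chains):
    """Scaled potential factor R-hat for one parameter.

    chains : array of shape (L, n), L chains of length n.
    """
    L, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()      # within-chain variance
    B = n * chain_means.var(ddof=1)            # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

# Example: test the last 4000 iterations of each chain for one parameter
# (chains_history is a hypothetical array of shape (L, iterations))
# converged = gelman_rubin(chains_history[:, -4000:]) < 1.02
```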
Chains that have converged to a stable solution provide samples of the posterior distribution from which they were drawn. The frequencies of the chain values can be used to create the posterior distribution of the sampled parameters. A common way to display the sampling frequency is with a histogram, as shown in the previous Figure. The final marginal posterior distribution for each parameter is then used to define its credible interval.
The median and the 68% limits of the credible interval are commonly used to define the parameter's best estimate and its uncertainty (see, e.g., Hogg & Foreman-Mackey 2018). When the marginal posterior distribution is Gaussian, the median and the 68% limits of the credible interval correspond to the mean and standard deviation of a normal distribution. When the posterior distribution is skewed, the 68% limits are not symmetric with respect to the median, and they give a "first-order" idea of the shape of the marginal posterior distribution that describes a given parameter.
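As a usage sketch (assuming `flat_samples` holds the concatenated post-convergence samples of one parameter), the median and the 68% credible limits can be extracted with percentiles:

```python
import numpy as np

def summarize_posterior(flat_samples):
    """Median and asymmetric 68% credible-interval uncertainties from posterior samples."""
    lower, median, upper = np.percentile(flat_samples, [15.865, 50.0, 84.135])
    return median, median - lower, upper - median   # best estimate, -err, +err

# Example with a skewed posterior: the two uncertainties differ
samples = np.random.lognormal(mean=0.0, sigma=0.5, size=20000)
value, err_low, err_up = summarize_posterior(samples)
print(f"parameter = {value:.3f} -{err_low:.3f} +{err_up:.3f}")
```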