MCMC Settings

Overview

Markov Chain Monte Carlo (MCMC) is the computational engine that powers Bayesian inference in MixModeler. MCMC settings control how the algorithm explores the posterior distribution to generate samples. Proper configuration ensures accurate, efficient inference with reliable uncertainty quantification.

Core MCMC Concepts

What is MCMC?

MCMC generates samples from the posterior distribution by constructing a Markov chain that converges to the target distribution. Instead of calculating the posterior analytically (often impossible), MCMC explores the parameter space through intelligent random sampling.
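
To make the idea concrete, the accept/reject core of MCMC fits in a few lines. Below is a toy random-walk Metropolis sampler in Python (illustrative only: MixModeler itself uses the more sophisticated NUTS algorithm described under Technical Notes, and the standard-normal target here is a stand-in for a real posterior):

```python
import numpy as np

rng = np.random.default_rng(42)

def log_posterior(theta):
    # Stand-in target: a standard normal log-density (up to a constant).
    return -0.5 * theta**2

theta = 0.0                                        # arbitrary starting value
samples = []
for _ in range(5_000):
    proposal = theta + rng.normal(scale=1.0)       # random-walk proposal
    # Accept with probability min(1, p(proposal) / p(current)).
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)

print(np.mean(samples), np.std(samples))           # approach 0 and 1 as the chain converges
```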

Why Multiple Samples?

A single point estimate doesn't capture uncertainty. MCMC generates thousands of samples that collectively represent the full posterior distribution, allowing you to (see the sketch after this list):

  • Calculate means, medians, and modes

  • Estimate credible intervals

  • Compute probabilities of any hypothesis

  • Assess parameter correlations
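
Each of these is a one-line computation on the draws. A minimal NumPy sketch, using synthetic draws as a stand-in for real sampler output:

```python
import numpy as np

# Hypothetical posterior draws for one coefficient (stand-in for MCMC output).
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.8, scale=0.1, size=8_000)

print("mean:  ", samples.mean())
print("median:", np.median(samples))
lo, hi = np.percentile(samples, [2.5, 97.5])     # 95% credible interval
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
print("P(coef > 0):", (samples > 0).mean())      # probability of a hypothesis
```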

The Sampling Process

Initialization: Start chains at random or specified starting values

Warm-Up (Tuning): Algorithm adapts step sizes and explores parameter space (samples discarded)

Sampling: After warm-up, algorithm generates samples from the posterior (samples retained)

Convergence: Multiple chains should converge to the same distribution

MCMC Parameters

Chains

Number of independent sampling chains run in parallel.

Default: 4 chains

Range: 2-8 chains

How It Works: Each chain starts from a different random position and independently explores the posterior. Multiple chains allow convergence diagnosis by comparing whether independent explorations reach the same distribution.

When to Increase:

  • Convergence diagnostics show poor R-hat values

  • Complex models with many parameters

  • Highly correlated parameters

  • First-time model exploration

When to Decrease:

  • Well-behaved simple models

  • Memory constraints

  • Quick iterations during model development

Memory Impact: Each additional chain requires more RAM, but chains run in parallel, so adding chains increases memory use more than it increases runtime.

Draws (Samples)

Number of posterior samples to draw per chain after warm-up.

Default: 2,000 draws per chain

Range: 1,000-5,000 draws

Total Samples: Chains × Draws (e.g., 4 chains × 2,000 = 8,000 total samples)

How It Works: More draws mean more precise estimation of posterior distributions and statistics.

When to Increase:

  • Need high precision for tail probabilities

  • Calculating small credible intervals (e.g., 99%)

  • Final production models

  • Publishing or regulatory requirements

When to Decrease:

  • Exploratory analysis

  • Model development and iteration

  • Quick comparisons

  • Memory-constrained environments

Time Impact: Doubling draws approximately doubles computation time.

Tune (Warm-Up)

Number of initial samples per chain used for algorithm adaptation, then discarded.

Default: 1,000 tuning steps per chain

Range: 500-2,000 steps

How It Works: During tuning, the algorithm learns optimal step sizes and adapts to the posterior geometry. These samples are practice runs and don't contribute to final estimates.

When to Increase:

  • Complex posterior geometries

  • Models with many parameters (20+)

  • Convergence issues indicated by diagnostics

  • Highly correlated variables

When to Decrease:

  • Simple models with good convergence

  • Well-specified priors that guide sampling

  • Quick iterations

Best Practice: Tuning should be at least 50% of draws (e.g., 1,000 tune for 2,000 draws).

Target Accept

Acceptance probability target for the sampling algorithm (No-U-Turn Sampler).

Default: 0.95 (95%)

Range: 0.80-0.99

How It Works: Controls the step size of the sampler. Higher values mean smaller, more careful steps that better explore complex geometries but take longer.

When to Increase (0.95 → 0.99):

  • Divergent transitions occur

  • Complex posterior geometries

  • Highly correlated parameters

  • Need maximum accuracy

When to Decrease (0.95 → 0.85):

  • Simple, well-behaved models

  • Need faster sampling

  • Already achieving good convergence

Trade-off: Higher acceptance = more accurate but slower; lower acceptance = faster but may miss complex features.
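
These four settings correspond to standard sampler arguments in PyMC-style libraries. The sketch below is purely illustrative — a toy one-parameter model, not MixModeler's internal code — showing where each setting lands (the parameter name `beta` and the data are hypothetical):

```python
import pymc as pm

with pm.Model():
    beta = pm.Normal("beta", mu=0, sigma=1)                   # hypothetical coefficient
    pm.Normal("y", mu=beta, sigma=1, observed=[0.2, 0.5, 0.1])

    idata = pm.sample(
        draws=2000,          # Draws: retained posterior samples per chain
        tune=1000,           # Tune: warm-up steps per chain, discarded
        chains=4,            # Chains: independent parallel samplers
        cores=4,             # one CPU core per chain where available
        target_accept=0.95,  # Target Accept: NUTS step-size target
        random_seed=42,      # Random Seed: reproducible runs
    )
```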

Advanced Settings

Random Seed

Seed for random number generation.

Default: 42

Use Case: Ensures reproducibility. The same seed with identical settings will produce identical results.

When to Change: Generate different posterior samples for sensitivity analysis or ensemble methods.

Fast Inference Mode

Uses Variational Inference (SVI) instead of MCMC.

Speed: 10-20x faster than MCMC

Accuracy: Approximate posteriors (usually a good approximation, though variational methods tend to understate posterior uncertainty)

When to Use:

  • Rapid model iteration

  • Initial model exploration

  • Model comparison across many specifications

  • Limited computational resources

When Not to Use:

  • Final production models

  • Publishing or regulatory reporting

  • Models with complex posterior geometries

  • Need exact credible intervals

Activation: Toggle the "Fast Inference" checkbox in Advanced Settings before running the model.
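
In PyMC-style libraries, the closest counterpart is ADVI (automatic differentiation variational inference). A minimal sketch, again with a toy model rather than MixModeler's internals:

```python
import pymc as pm

with pm.Model():
    beta = pm.Normal("beta", mu=0, sigma=1)                   # hypothetical coefficient
    pm.Normal("y", mu=beta, sigma=1, observed=[0.2, 0.5, 0.1])

    approx = pm.fit(n=30_000, method="advi")   # optimize a variational approximation
    idata = approx.sample(2_000)               # draw samples from the fitted approximation
```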

Recommended Configurations

Quick Exploration

Fast iteration during model development.

Setting          Value    Rationale
Chains           2        Minimum for convergence check
Draws            1,000    Sufficient for initial estimates
Tune             500      Basic adaptation
Target Accept    0.90     Faster sampling
Fast Inference   ON       Maximum speed

Runtime: 30 seconds - 2 minutes

Standard Analysis

Balanced accuracy and speed for most use cases.

Setting          Value    Rationale
Chains           4        Good convergence diagnosis
Draws            2,000    Precise estimates
Tune             1,000    Full adaptation
Target Accept    0.95     High accuracy
Fast Inference   OFF      Full MCMC

Runtime: 2-5 minutes

Production Model

Maximum accuracy for final models.

Setting          Value    Rationale
Chains           6        Robust convergence
Draws            3,000    High precision
Tune             1,500    Extensive adaptation
Target Accept    0.98     Maximum accuracy
Fast Inference   OFF      Full MCMC

Runtime: 5-10 minutes

Complex Model

Challenging models with many variables or correlations.

Setting          Value    Rationale
Chains           8        Robust diagnosis
Draws            2,500    Sufficient samples
Tune             2,000    Extended adaptation
Target Accept    0.99     Careful exploration
Fast Inference   OFF      Full MCMC

Runtime: 8-15 minutes

Memory-Constrained

For limited RAM environments.

Setting          Value    Rationale
Chains           2        Reduce memory
Draws            1,500    Balance precision/memory
Tune             750      Adequate adaptation
Target Accept    0.93     Slightly faster
Fast Inference   ON       Reduce memory footprint

Runtime: 1-3 minutes

Configuring Settings

Access MCMC Settings

  1. In Model Builder, select Bayesian modeling method

  2. Click MCMC Settings or Advanced Settings

  3. Adjust parameters in the configuration panel

  4. Click Save Settings

Preset Configurations

MixModeler provides quick-access presets:

  • Fast: Quick exploration settings

  • Standard: Default balanced settings

  • High Quality: Production model settings

  • Custom: Manually configure all parameters

Select a preset or start from one and customize.

Settings Persistence

MCMC settings are saved with each model and persist across sessions. When you clone a model, settings are copied to the new model.

Interpreting Runtime

Time Estimates

Approximate runtime for typical MMM models (20-30 variables, 52-104 observations):

Configuration      Chains   Draws    Approximate Time
Fast Exploration   2        1,000    30-90 seconds
Standard           4        2,000    2-5 minutes
High Quality       6        3,000    5-10 minutes
Complex Model      8        2,500    8-15 minutes

Factors Affecting Speed

Number of Variables: More variables = longer runtime (roughly linear relationship)

Number of Observations: More data points = longer runtime (sub-linear relationship)

Model Complexity: Adstock transformations and additional parameters or priors increase the cost of each sampling step

Hardware: CPU speed and core count directly impact parallel chain execution

Target Accept: Higher values slow sampling (smaller steps mean more steps per draw)

Progress Monitoring

During MCMC sampling, a progress indicator shows:

  • Current chain being sampled

  • Percentage complete for each chain

  • Estimated time remaining

  • Memory usage

Performance Optimization

Parallel Processing

MCMC chains run in parallel across CPU cores:

  • 4 chains use 4 cores optimally

  • 8 chains benefit from 8+ core processors

  • Running more chains than cores still works; extra chains queue and run as cores free up

Memory Management

Total memory usage scales with: Chains × Draws × Variables

Example: 4 chains × 2,000 draws × 30 variables = 240,000 stored parameter values
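
As a back-of-envelope check, the retained draws alone occupy only a few megabytes at 8 bytes per float64 value; actual peak memory is considerably higher once sampler state and intermediate copies are counted, so treat this as a floor:

```python
chains, draws, variables = 4, 2_000, 30
stored_values = chains * draws * variables     # 240,000 stored values
print(f"{stored_values * 8 / 1e6:.1f} MB")     # ~1.9 MB for the raw draws alone
```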

Tips:

  • Reduce chains or draws if running out of memory

  • Close other applications during sampling

  • Use Fast Inference for memory-constrained systems

  • Upgrade RAM for routinely large models

Stopping Criteria

MCMC continues until all chains complete the specified draws. You cannot stop early without losing all progress for that run.

Best Practice: Start with quick settings, verify convergence, then run longer if needed.

Common Issues

Issue 1: Divergent Transitions

Symptom: Warning messages about divergences during sampling

Cause: Sampler struggled to explore regions of high curvature in posterior

Solutions:

  1. Increase target_accept from 0.95 to 0.98 (see the sketch after this list)

  2. Increase tuning steps from 1,000 to 1,500+

  3. Check for very strong priors that conflict with data

  4. Re-parameterize model (remove highly correlated variables)
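
If your workflow exposes the raw sampler output, divergences can be counted directly. A sketch assuming `idata` is the InferenceData object returned by a PyMC-style `pm.sample` call (as in the earlier sketch):

```python
# `idata` is assumed to come from a PyMC-style pm.sample call.
n_divergent = int(idata.sample_stats["diverging"].sum())
total = idata.posterior.sizes["chain"] * idata.posterior.sizes["draw"]
print(f"{n_divergent} divergences out of {total} retained draws")
```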

Issue 2: Poor R-hat Values

Symptom: R-hat > 1.05 in convergence diagnostics

Cause: Chains haven't converged to the same distribution

Solutions:

  1. Increase number of draws

  2. Increase tuning steps

  3. Add more chains

  4. Check for model misspecification

Issue 3: Low Effective Sample Size (ESS)

Symptom: ESS < 10% of total samples (a quick check is sketched after the solutions list)

Cause: High autocorrelation in samples (inefficient exploration)

Solutions:

  1. Increase draws to compensate

  2. Increase tuning for better step size adaptation

  3. Simplify model if possible

  4. Check for highly correlated parameters
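
A quick way to run that check, assuming the same PyMC-style `idata` object and the ArviZ library:

```python
import arviz as az

bulk_ess = az.ess(idata)                  # bulk ESS, one value per parameter
tail_ess = az.ess(idata, method="tail")   # tail ESS, relevant for interval endpoints
print(bulk_ess)
```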

Issue 4: Very Long Runtime

Symptom: Sampling takes much longer than expected

Cause: Complex model or inefficient sampling

Solutions:

  1. Enable Fast Inference mode for initial exploration

  2. Reduce target_accept from 0.95 to 0.90

  3. Check CPU usage (ensure parallel processing is working)

  4. Reduce number of variables

  5. Simplify adstock transformations

Issue 5: Out of Memory

Symptom: Browser crashes or memory error during sampling

Cause: Insufficient RAM for chain storage

Solutions:

  1. Reduce number of chains (8 → 4 → 2)

  2. Reduce draws per chain (2,000 → 1,500 → 1,000)

  3. Enable Fast Inference mode (uses less memory)

  4. Close other browser tabs and applications

  5. Use a machine with more RAM for large models

Best Practices

Start Conservative: Begin with standard settings (4 chains, 2,000 draws). Only adjust if diagnostics indicate issues.

Check Diagnostics First: Don't increase settings blindly. Look at R-hat, ESS, and divergences to guide adjustments.

Iterate Efficiently: Use Fast Inference during development, switch to full MCMC for final models.

Document Settings: Record which MCMC settings were used for each model version for reproducibility.

Balance Speed and Accuracy: More samples aren't always better. Once convergence is good, additional samples provide diminishing returns.

Monitor Resources: Watch memory usage and CPU utilization. Optimize settings based on your hardware capabilities.

Test Sensitivity: Run same model with different settings (e.g., 2 chains vs 6 chains) to ensure results are stable.

When to Accept Results

Your MCMC results are reliable when:

R-hat < 1.01 for all parameters (convergence achieved)

Bulk ESS > 400 and tail ESS > 400 for all parameters (sufficient independent samples)

Divergences = 0, or very few (< 1% of samples), indicating no exploration failures

Chains overlap in trace plots (visual convergence confirmation)

Posterior distributions are smooth (not jagged, and not multimodal without reason)

If these criteria aren't met, adjust settings and rerun before interpreting results.
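
If you have access to the raw sampler output, all of these checks can be scripted. A sketch with ArviZ, again assuming a PyMC-style `idata` object (the thresholds mirror the criteria above):

```python
import arviz as az

summary = az.summary(idata)                       # includes r_hat, ess_bulk, ess_tail
bad_rhat = summary[summary["r_hat"] > 1.01]       # parameters failing convergence
low_ess = summary[(summary["ess_bulk"] < 400) | (summary["ess_tail"] < 400)]
n_divergent = int(idata.sample_stats["diverging"].sum())

print(f"Parameters with r_hat > 1.01: {len(bad_rhat)}")
print(f"Parameters with low ESS:      {len(low_ess)}")
print(f"Divergent transitions:        {n_divergent}")

az.plot_trace(idata)                              # visual check: chains should overlap
```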

Technical Notes

Algorithm: MixModeler uses the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo that automatically tunes step sizes and trajectory lengths.

Adaptation: During tuning, NUTS learns the mass matrix (inverse covariance of parameters) to efficiently navigate the posterior geometry.

Warm-Up: Tuning samples are critical for efficiency but are discarded from final results. Never reduce tuning below 500 steps.

Target Accept: Controls the acceptance probability of proposals. NUTS dynamically adjusts to achieve this target during tuning.

Effective Sample Size: ESS accounts for autocorrelation. ESS of 400 from 8,000 samples means ~400 independent draws of information.
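
For a single parameter, the standard estimator is ESS ≈ N / (1 + 2·Σ ρ_t), summing the lag-t autocorrelations ρ_t of the chain: the stronger the autocorrelation, the further ESS falls below the raw draw count N.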


Next Steps: After configuring MCMC settings, learn about interpreting Credible Intervals from your posterior samples, or dive into Convergence Diagnostics to validate your results.
