MCMC Settings

Overview

Markov Chain Monte Carlo (MCMC) is the computational engine that powers Bayesian inference in MixModeler. MCMC settings control how the algorithm explores the posterior distribution to generate samples. Proper configuration ensures accurate, efficient inference with reliable uncertainty quantification.

Core MCMC Concepts

What is MCMC?

MCMC generates samples from the posterior distribution by constructing a Markov chain that converges to the target distribution. Instead of calculating the posterior analytically (often impossible), MCMC explores the parameter space through intelligent random sampling.
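
To make the idea concrete, the accept/reject core of MCMC fits in a few lines. Below is a toy random-walk Metropolis sampler in Python (illustrative only: MixModeler itself uses the more sophisticated NUTS algorithm described under Technical Notes, and the standard-normal target here is a stand-in for a real posterior):

```python
import numpy as np

rng = np.random.default_rng(42)

def log_posterior(theta):
    # Stand-in target: a standard normal log-density (up to a constant).
    return -0.5 * theta**2

theta = 0.0                                        # arbitrary starting value
samples = []
for _ in range(5_000):
    proposal = theta + rng.normal(scale=1.0)       # random-walk proposal
    # Accept with probability min(1, p(proposal) / p(current)).
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)

print(np.mean(samples), np.std(samples))           # approach 0 and 1 as the chain converges
```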

Why Multiple Samples?

A single point estimate doesn't capture uncertainty. MCMC generates thousands of samples that collectively represent the full posterior distribution, allowing you to (see the sketch after this list):

  • Calculate means, medians, and modes

  • Estimate credible intervals

  • Compute probabilities of any hypothesis

  • Assess parameter correlations
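
Each of these is a one-line computation on the draws. A minimal NumPy sketch, using synthetic draws as a stand-in for real sampler output:

```python
import numpy as np

# Hypothetical posterior draws for one coefficient (stand-in for MCMC output).
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.8, scale=0.1, size=8_000)

print("mean:  ", samples.mean())
print("median:", np.median(samples))
lo, hi = np.percentile(samples, [2.5, 97.5])     # 95% credible interval
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
print("P(coef > 0):", (samples > 0).mean())      # probability of a hypothesis
```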

The Sampling Process

Initialization: Start chains at random or specified starting values

Warm-Up (Tuning): Algorithm adapts step sizes and explores parameter space (samples discarded)

Sampling: After warm-up, algorithm generates samples from the posterior (samples retained)

Convergence: Multiple chains should converge to the same distribution

MCMC Parameters

Chains

Number of independent sampling chains run in parallel.

Default: 4 chains

Range: 2-8 chains

How It Works: Each chain starts from a different random position and independently explores the posterior. Multiple chains allow convergence diagnosis by comparing whether independent explorations reach the same distribution.

When to Increase:

  • Convergence diagnostics show poor R-hat values

  • Complex models with many parameters

  • Highly correlated parameters

  • First-time model exploration

When to Decrease:

  • Well-behaved simple models

  • Memory constraints

  • Quick iterations during model development

Memory Impact: Each additional chain requires more RAM, but chains run in parallel, so adding chains increases memory use more than it increases runtime.

Draws (Samples)

Number of posterior samples to draw per chain after warm-up.

Default: 2,000 draws per chain

Range: 1,000-5,000 draws

Total Samples: Chains × Draws (e.g., 4 chains × 2,000 = 8,000 total samples)

How It Works: More draws mean more precise estimation of posterior distributions and statistics.

When to Increase:

  • Need high precision for tail probabilities

  • Calculating small credible intervals (e.g., 99%)

  • Final production models

  • Publishing or regulatory requirements

When to Decrease:

  • Exploratory analysis

  • Model development and iteration

  • Quick comparisons

  • Memory-constrained environments

Time Impact: Doubling draws approximately doubles computation time.

Tune (Warm-Up)

Number of initial samples per chain used for algorithm adaptation, then discarded.

Default: 1,000 tuning steps per chain

Range: 500-2,000 steps

How It Works: During tuning, the algorithm learns optimal step sizes and adapts to the posterior geometry. These samples are practice runs and don't contribute to final estimates.

When to Increase:

  • Complex posterior geometries

  • Models with many parameters (20+)

  • Convergence issues indicated by diagnostics

  • Highly correlated variables

When to Decrease:

  • Simple models with good convergence

  • Well-specified priors that guide sampling

  • Quick iterations

Best Practice: Tuning should be at least 50% of draws (e.g., 1,000 tune for 2,000 draws).

Target Accept

Acceptance probability target for the sampling algorithm (No-U-Turn Sampler).

Default: 0.95 (95%)

Range: 0.80-0.99

How It Works: Controls the step size of the sampler. Higher values mean smaller, more careful steps that better explore complex geometries but take longer.

When to Increase (0.95 → 0.99):

  • Divergent transitions occur

  • Complex posterior geometries

  • Highly correlated parameters

  • Need maximum accuracy

When to Decrease (0.95 → 0.85):

  • Simple, well-behaved models

  • Need faster sampling

  • Already achieving good convergence

Trade-off: Higher acceptance = more accurate but slower; lower acceptance = faster but may miss complex features.
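
These four settings correspond to standard sampler arguments in PyMC-style libraries. The sketch below is purely illustrative — a toy one-parameter model, not MixModeler's internal code — showing where each setting lands (the parameter name `beta` and the data are hypothetical):

```python
import pymc as pm

with pm.Model():
    beta = pm.Normal("beta", mu=0, sigma=1)                   # hypothetical coefficient
    pm.Normal("y", mu=beta, sigma=1, observed=[0.2, 0.5, 0.1])

    idata = pm.sample(
        draws=2000,          # Draws: retained posterior samples per chain
        tune=1000,           # Tune: warm-up steps per chain, discarded
        chains=4,            # Chains: independent parallel samplers
        cores=4,             # one CPU core per chain where available
        target_accept=0.95,  # Target Accept: NUTS step-size target
        random_seed=42,      # Random Seed: reproducible runs
    )
```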

Advanced Settings

Random Seed

Seed for random number generation.

Default: 42

Use Case: Ensures reproducibility. The same seed with identical settings will produce identical results.

When to Change: Generate different posterior samples for sensitivity analysis or ensemble methods.

Fast Inference Mode

Uses Variational Inference (SVI) instead of MCMC.

Speed: 10-20x faster than MCMC

Accuracy: Approximate posteriors (usually a good approximation, though variational methods tend to understate posterior uncertainty)

When to Use:

  • Rapid model iteration

  • Initial model exploration

  • Model comparison across many specifications

  • Limited computational resources

When Not to Use:

  • Final production models

  • Publishing or regulatory reporting

  • Models with complex posterior geometries

  • Need exact credible intervals

Activation: Toggle the "Fast Inference" checkbox in Advanced Settings before running the model.
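
In PyMC-style libraries, the closest counterpart is ADVI (automatic differentiation variational inference). A minimal sketch, again with a toy model rather than MixModeler's internals:

```python
import pymc as pm

with pm.Model():
    beta = pm.Normal("beta", mu=0, sigma=1)                   # hypothetical coefficient
    pm.Normal("y", mu=beta, sigma=1, observed=[0.2, 0.5, 0.1])

    approx = pm.fit(n=30_000, method="advi")   # optimize a variational approximation
    idata = approx.sample(2_000)               # draw samples from the fitted approximation
```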

Recommended Configurations

Quick Exploration

Fast iteration during model development.

Setting          Value    Rationale
Chains           2        Minimum for convergence check
Draws            1,000    Sufficient for initial estimates
Tune             500      Basic adaptation
Target Accept    0.90     Faster sampling
Fast Inference   ON       Maximum speed

Runtime: 30 seconds - 2 minutes

Standard Analysis

Balanced accuracy and speed for most use cases.

Setting          Value    Rationale
Chains           4        Good convergence diagnosis
Draws            2,000    Precise estimates
Tune             1,000    Full adaptation
Target Accept    0.95     High accuracy
Fast Inference   OFF      Full MCMC

Runtime: 2-5 minutes

Production Model

Maximum accuracy for final models.

Setting          Value    Rationale
Chains           6        Robust convergence
Draws            3,000    High precision
Tune             1,500    Extensive adaptation
Target Accept    0.98     Maximum accuracy
Fast Inference   OFF      Full MCMC

Runtime: 5-10 minutes

Complex Model

Challenging models with many variables or correlations.

Setting          Value    Rationale
Chains           8        Robust diagnosis
Draws            2,500    Sufficient samples
Tune             2,000    Extended adaptation
Target Accept    0.99     Careful exploration
Fast Inference   OFF      Full MCMC

Runtime: 8-15 minutes

Memory-Constrained

For limited RAM environments.

Setting          Value    Rationale
Chains           2        Reduce memory
Draws            1,500    Balance precision/memory
Tune             750      Adequate adaptation
Target Accept    0.93     Slightly faster
Fast Inference   ON       Reduce memory footprint

Runtime: 1-3 minutes

Configuring Settings

Access MCMC Settings

  1. In Model Builder, select Bayesian modeling method

  2. Click MCMC Settings or Advanced Settings

  3. Adjust parameters in the configuration panel

  4. Click Save Settings

Preset Configurations

MixModeler provides quick-access presets:

  • Fast: Quick exploration settings

  • Standard: Default balanced settings

  • High Quality: Production model settings

  • Custom: Manually configure all parameters

Select a preset or start from one and customize.

Settings Persistence

MCMC settings are saved with each model and persist across sessions. When you clone a model, settings are copied to the new model.

Interpreting Runtime

Time Estimates

Approximate runtime for typical MMM models (20-30 variables, 52-104 observations):

Configuration      Chains   Draws    Approximate Time
Fast Exploration   2        1,000    30-90 seconds
Standard           4        2,000    2-5 minutes
High Quality       6        3,000    5-10 minutes
Complex Model      8        2,500    8-15 minutes

Factors Affecting Speed

Number of Variables: More variables = longer runtime (roughly linear relationship)

Number of Observations: More data points = longer runtime (sub-linear relationship)

Model Complexity: Adstock transformations and additional parameters or priors increase the cost of each sampling step

Hardware: CPU speed and core count directly impact parallel chain execution

Target Accept: Higher values slow sampling (smaller steps mean more steps per draw)

Progress Monitoring

During MCMC sampling, a progress indicator shows:

  • Current chain being sampled

  • Percentage complete for each chain

  • Estimated time remaining

  • Memory usage

Performance Optimization

Parallel Processing

MCMC chains run in parallel across CPU cores:

  • 4 chains use 4 cores optimally

  • 8 chains benefit from 8+ core processors

  • Running more chains than cores still works; extra chains queue and run as cores free up

Memory Management

Total memory usage scales with: Chains × Draws × Variables

Example: 4 chains × 2,000 draws × 30 variables = 240,000 stored parameter values
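
As a back-of-envelope check, the retained draws alone occupy only a few megabytes at 8 bytes per float64 value; actual peak memory is considerably higher once sampler state and intermediate copies are counted, so treat this as a floor:

```python
chains, draws, variables = 4, 2_000, 30
stored_values = chains * draws * variables     # 240,000 stored values
print(f"{stored_values * 8 / 1e6:.1f} MB")     # ~1.9 MB for the raw draws alone
```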

Tips:

  • Reduce chains or draws if running out of memory

  • Close other applications during sampling

  • Use Fast Inference for memory-constrained systems

  • Upgrade RAM for routinely large models

Stopping Criteria

MCMC continues until all chains complete the specified draws. You cannot stop early without losing all progress for that run.

Best Practice: Start with quick settings, verify convergence, then run longer if needed.

Common Issues

Issue 1: Divergent Transitions

Symptom: Warning messages about divergences during sampling

Cause: Sampler struggled to explore regions of high curvature in posterior

Solutions:

  1. Increase target_accept from 0.95 to 0.98 (see the sketch after this list)

  2. Increase tuning steps from 1,000 to 1,500+

  3. Check for very strong priors that conflict with data

  4. Re-parameterize model (remove highly correlated variables)
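
If your workflow exposes the raw sampler output, divergences can be counted directly. A sketch assuming `idata` is the InferenceData object returned by a PyMC-style `pm.sample` call (as in the earlier sketch):

```python
# `idata` is assumed to come from a PyMC-style pm.sample call.
n_divergent = int(idata.sample_stats["diverging"].sum())
total = idata.posterior.sizes["chain"] * idata.posterior.sizes["draw"]
print(f"{n_divergent} divergences out of {total} retained draws")
```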

Issue 2: Poor R-hat Values

Symptom: R-hat > 1.05 in convergence diagnostics

Cause: Chains haven't converged to the same distribution

Solutions:

  1. Increase number of draws

  2. Increase tuning steps

  3. Add more chains

  4. Check for model misspecification

Issue 3: Low Effective Sample Size (ESS)

Symptom: ESS < 10% of total samples (a quick check is sketched after the solutions list)

Cause: High autocorrelation in samples (inefficient exploration)

Solutions:

  1. Increase draws to compensate

  2. Increase tuning for better step size adaptation

  3. Simplify model if possible

  4. Check for highly correlated parameters
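
A quick way to run that check, assuming the same PyMC-style `idata` object and the ArviZ library:

```python
import arviz as az

bulk_ess = az.ess(idata)                  # bulk ESS, one value per parameter
tail_ess = az.ess(idata, method="tail")   # tail ESS, relevant for interval endpoints
print(bulk_ess)
```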

Issue 4: Very Long Runtime

Symptom: Sampling takes much longer than expected

Cause: Complex model or inefficient sampling

Solutions:

  1. Enable Fast Inference mode for initial exploration

  2. Reduce target_accept from 0.95 to 0.90

  3. Check CPU usage (ensure parallel processing is working)

  4. Reduce number of variables

  5. Simplify adstock transformations

Issue 5: Out of Memory

Symptom: Browser crashes or memory error during sampling

Cause: Insufficient RAM for chain storage

Solutions:

  1. Reduce number of chains (8 → 4 → 2)

  2. Reduce draws per chain (2,000 → 1,500 → 1,000)

  3. Enable Fast Inference mode (uses less memory)

  4. Close other browser tabs and applications

  5. Use a machine with more RAM for large models

Best Practices

Start Conservative: Begin with standard settings (4 chains, 2,000 draws). Only adjust if diagnostics indicate issues.

Check Diagnostics First: Don't increase settings blindly. Look at R-hat, ESS, and divergences to guide adjustments.

Iterate Efficiently: Use Fast Inference during development, switch to full MCMC for final models.

Document Settings: Record which MCMC settings were used for each model version for reproducibility.

Balance Speed and Accuracy: More samples aren't always better. Once convergence is good, additional samples provide diminishing returns.

Monitor Resources: Watch memory usage and CPU utilization. Optimize settings based on your hardware capabilities.

Test Sensitivity: Run same model with different settings (e.g., 2 chains vs 6 chains) to ensure results are stable.

When to Accept Results

Your MCMC results are reliable when:

R-hat < 1.01 for all parameters (convergence achieved)

Bulk ESS > 400 and tail ESS > 400 for all parameters (sufficient independent samples)

Divergences = 0, or very few (< 1% of samples), indicating no exploration failures

Chains overlap in trace plots (visual convergence confirmation)

Posterior distributions are smooth (not jagged, and not multimodal without reason)

If these criteria aren't met, adjust settings and rerun before interpreting results.
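
If you have access to the raw sampler output, all of these checks can be scripted. A sketch with ArviZ, again assuming a PyMC-style `idata` object (the thresholds mirror the criteria above):

```python
import arviz as az

summary = az.summary(idata)                       # includes r_hat, ess_bulk, ess_tail
bad_rhat = summary[summary["r_hat"] > 1.01]       # parameters failing convergence
low_ess = summary[(summary["ess_bulk"] < 400) | (summary["ess_tail"] < 400)]
n_divergent = int(idata.sample_stats["diverging"].sum())

print(f"Parameters with r_hat > 1.01: {len(bad_rhat)}")
print(f"Parameters with low ESS:      {len(low_ess)}")
print(f"Divergent transitions:        {n_divergent}")

az.plot_trace(idata)                              # visual check: chains should overlap
```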

Technical Notes

Algorithm: MixModeler uses the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo that automatically tunes step sizes and trajectory lengths.

Adaptation: During tuning, NUTS learns the mass matrix (inverse covariance of parameters) to efficiently navigate the posterior geometry.

Warm-Up: Tuning samples are critical for efficiency but are discarded from final results. Never reduce tuning below 500 steps.

Target Accept: Controls the acceptance probability of proposals. NUTS dynamically adjusts to achieve this target during tuning.

Effective Sample Size: ESS accounts for autocorrelation. ESS of 400 from 8,000 samples means ~400 independent draws of information.
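
For a single parameter, the standard estimator is ESS ≈ N / (1 + 2·Σ ρ_t), summing the lag-t autocorrelations ρ_t of the chain: the stronger the autocorrelation, the further ESS falls below the raw draw count N.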


Next Steps: After configuring MCMC settings, learn about interpreting Credible Intervals from your posterior samples, or dive into Convergence Diagnostics to validate your results.
