MCMC Settings
Overview
Markov Chain Monte Carlo (MCMC) is the computational engine that powers Bayesian inference in MixModeler. MCMC settings control how the algorithm explores the posterior distribution to generate samples. Proper configuration ensures accurate, efficient inference with reliable uncertainty quantification.
Core MCMC Concepts
What is MCMC?
MCMC generates samples from the posterior distribution by constructing a Markov chain that converges to the target distribution. Instead of calculating the posterior analytically (often impossible), MCMC explores the parameter space through intelligent random sampling.
Why Multiple Samples?
A single point estimate doesn't capture uncertainty. MCMC generates thousands of samples that collectively represent the full posterior distribution, allowing you to:
Calculate means, medians, and modes
Estimate credible intervals
Compute probabilities of any hypothesis
Assess parameter correlations
The Sampling Process
Initialization: Start chains at random or specified starting values
Warm-Up (Tuning): Algorithm adapts step sizes and explores parameter space (samples discarded)
Sampling: After warm-up, algorithm generates samples from the posterior (samples retained)
Convergence: Multiple chains should converge to the same distribution
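To make these phases concrete, here is what an equivalent run looks like in a PyMC-style Python API. This is an illustrative sketch, not MixModeler's internal code; PyMC is simply one widely used NUTS backend, and the toy model below stands in for a real MMM.

```python
import numpy as np
import pymc as pm

# Toy stand-in for a real MMM: estimate the mean of noisy weekly sales.
y = np.random.default_rng(42).normal(loc=3.0, scale=1.0, size=104)

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=y)

    idata = pm.sample(
        draws=2_000,         # posterior samples kept per chain
        tune=1_000,          # warm-up steps per chain, then discarded
        chains=4,            # independent chains for convergence checks
        target_accept=0.95,  # NUTS acceptance-probability target
        random_seed=42,      # seeded initialization for reproducibility
    )
```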
MCMC Parameters
Chains
Number of independent sampling chains run in parallel.
Default: 4 chains
Range: 2-8 chains
How It Works: Each chain starts from a different random position and independently explores the posterior. Multiple chains allow convergence diagnosis by comparing whether independent explorations reach the same distribution.
When to Increase:
Convergence diagnostics show poor R-hat values
Complex models with many parameters
Highly correlated parameters
First-time model exploration
When to Decrease:
Well-behaved simple models
Memory constraints
Quick iterations during model development
Memory Impact: Each additional chain requires more RAM. Because chains run in parallel rather than sequentially, adding chains increases memory usage more than it increases wall-clock time.
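Multiple chains are what make the convergence comparison possible. Continuing from the sampling sketch above, an ArviZ-style check (an assumption about tooling; MixModeler surfaces the same diagnostic in its UI) looks like:

```python
import arviz as az

# R-hat compares between-chain and within-chain variance; values near 1.0
# mean the independent chains have converged to the same distribution.
rhat = az.rhat(idata)

# Flag any parameter above the common 1.01 threshold.
flagged = {name: float(da.max()) for name, da in rhat.items()
           if float(da.max()) > 1.01}
print("Parameters needing attention:", flagged or "none")
```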
Draws (Samples)
Number of posterior samples to draw per chain after warm-up.
Default: 2,000 draws per chain
Range: 1,000-5,000 draws
Total Samples: Chains × Draws (e.g., 4 chains × 2,000 = 8,000 total samples)
How It Works: More draws mean more precise estimation of posterior distributions and statistics.
When to Increase:
Need high precision for tail probabilities
Calculating small credible intervals (e.g., 99%)
Final production models
Publishing or regulatory requirements
When to Decrease:
Exploratory analysis
Model development and iteration
Quick comparisons
Memory-constrained environments
Time Impact: Doubling draws approximately doubles computation time.
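The precision gain follows Monte Carlo error: the standard error of a posterior mean shrinks with the square root of the (effective) number of samples, so doubling draws doubles runtime but shrinks Monte Carlo error only by a factor of √2. A standalone sketch of the scaling:

```python
import numpy as np

posterior_sd = 1.0  # suppose a coefficient has posterior sd 1.0

for n in (1_000, 2_000, 8_000):
    # Monte Carlo standard error of the posterior mean is sd / sqrt(n);
    # real MCMC draws are autocorrelated, so substitute ESS for n.
    print(f"{n:>5} samples -> MCSE of the mean ≈ {posterior_sd / np.sqrt(n):.4f}")
```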
Tune (Warm-Up)
Number of initial samples per chain used for algorithm adaptation, then discarded.
Default: 1,000 tuning steps per chain
Range: 500-2,000 steps
How It Works: During tuning, the algorithm learns optimal step sizes and adapts to the posterior geometry. These samples are practice runs and don't contribute to final estimates.
When to Increase:
Complex posterior geometries
Models with many parameters (20+)
Convergence issues indicated by diagnostics
Highly correlated variables
When to Decrease:
Simple models with good convergence
Well-specified priors that guide sampling
Quick iterations
Best Practice: Tuning should be at least 50% of draws (e.g., 1,000 tune for 2,000 draws).
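The ratio is easy to check mechanically. The helper below is hypothetical (not a MixModeler API), just the 50% rule written out:

```python
def check_tune_ratio(tune: int, draws: int) -> None:
    """Warn if warm-up is under 50% of draws (hypothetical helper)."""
    if tune < 0.5 * draws:
        print(f"Warning: tune={tune} is below 50% of draws={draws}; "
              "consider increasing warm-up.")
    else:
        print(f"OK: tune={tune} is {tune / draws:.0%} of draws={draws}.")

check_tune_ratio(1_000, 2_000)  # OK: 50%
check_tune_ratio(500, 2_000)    # Warning
```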
Target Accept
Acceptance probability target for the sampling algorithm (No-U-Turn Sampler).
Default: 0.95 (95%)
Range: 0.80-0.99
How It Works: Controls the step size of the sampler. Higher values mean smaller, more careful steps that better explore complex geometries but take longer.
When to Increase (0.95 → 0.99):
Divergent transitions occur
Complex posterior geometries
Highly correlated parameters
Need maximum accuracy
When to Decrease (0.95 → 0.85):
Simple, well-behaved models
Need faster sampling
Already achieving good convergence
Trade-off: Higher acceptance = more accurate but slower; lower acceptance = faster but may miss complex features.
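In a PyMC-style workflow (illustrative; MixModeler exposes the same knob as Target Accept), the standard response to divergences is to raise the target and resample. Continuing from the sampling sketch above:

```python
# Count divergent transitions recorded during sampling.
n_div = int(idata.sample_stats["diverging"].sum())
print(f"{n_div} divergent transitions")

if n_div > 0:
    # Smaller, more careful steps: raise target_accept and resample.
    with model:
        idata = pm.sample(draws=2_000, tune=1_500, chains=4,
                          target_accept=0.98, random_seed=42)
```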
Advanced Settings
Random Seed
Seed for random number generation.
Default: 42
Use Case: Ensures reproducibility. The same seed with identical settings will produce identical results.
When to Change: Generate different posterior samples for sensitivity analysis or ensemble methods.
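The principle is the same as seeding any random number generator. A standalone NumPy illustration:

```python
import numpy as np

a = np.random.default_rng(42).normal(size=3)
b = np.random.default_rng(42).normal(size=3)
c = np.random.default_rng(7).normal(size=3)

print(np.array_equal(a, b))  # True: same seed, identical draws
print(np.array_equal(a, c))  # False: different seed, different draws
```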
Fast Inference Mode
Uses Stochastic Variational Inference (SVI) instead of MCMC.
Speed: 10-20x faster than MCMC
Accuracy: Approximate posteriors (usually a good approximation of central estimates, though variational methods can understate posterior uncertainty)
When to Use:
Rapid model iteration
Initial model exploration
Model comparison across many specifications
Limited computational resources
When Not to Use:
Final production models
Publishing or regulatory reporting
Models with complex posterior geometries
Need exact credible intervals
Activation: Toggle "Fast Inference" checkbox in Advanced Settings before running model.
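For a sense of what variational inference replaces, here is a PyMC ADVI fit (illustrative; MixModeler's SVI implementation may differ). Continuing with the model from the sampling sketch above:

```python
# Optimize a variational approximation instead of sampling.
with model:
    approx = pm.fit(n=30_000, method="advi")  # stochastic optimization
    idata_vi = approx.sample(2_000)           # draws from the fitted approximation
```

Mean-field approximations like ADVI tend to capture posterior means well but understate uncertainty, which is why the credible-interval caveat above applies.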
Recommended Configurations
Quick Exploration
Fast iteration during model development.
Chains: 2 (minimum for convergence check)
Draws: 1,000 (sufficient for initial estimates)
Tune: 500 (basic adaptation)
Target Accept: 0.90 (faster sampling)
Fast Inference: ON (maximum speed)
Runtime: 30 seconds - 2 minutes
Standard Analysis
Balanced accuracy and speed for most use cases.
Chains: 4 (good convergence diagnosis)
Draws: 2,000 (precise estimates)
Tune: 1,000 (full adaptation)
Target Accept: 0.95 (high accuracy)
Fast Inference: OFF (full MCMC)
Runtime: 2-5 minutes
Production Model
Maximum accuracy for final models.
Chains: 6 (robust convergence)
Draws: 3,000 (high precision)
Tune: 1,500 (extensive adaptation)
Target Accept: 0.98 (maximum accuracy)
Fast Inference: OFF (full MCMC)
Runtime: 5-10 minutes
Complex Model
Challenging models with many variables or correlations.
Chains: 8 (robust diagnosis)
Draws: 2,500 (sufficient samples)
Tune: 2,000 (extended adaptation)
Target Accept: 0.99 (careful exploration)
Fast Inference: OFF (full MCMC)
Runtime: 8-15 minutes
Memory-Constrained
For limited RAM environments.
Chains: 2 (reduce memory)
Draws: 1,500 (balance precision and memory)
Tune: 750 (adequate adaptation)
Target Accept: 0.93 (slightly faster)
Fast Inference: ON (reduce memory footprint)
Runtime: 1-3 minutes
Configuring Settings
Access MCMC Settings
In Model Builder, select Bayesian modeling method
Click MCMC Settings or Advanced Settings
Adjust parameters in the configuration panel
Click Save Settings
Preset Configurations
MixModeler provides quick-access presets:
Fast: Quick exploration settings
Standard: Default balanced settings
High Quality: Production model settings
Custom: Manually configure all parameters
Select a preset or start from one and customize.
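If you script model runs, the presets map naturally onto a settings dictionary. The names and structure below are illustrative, not a MixModeler API:

```python
# Hypothetical preset table mirroring the configurations above.
MCMC_PRESETS = {
    "fast":         dict(chains=2, draws=1_000, tune=500,   target_accept=0.90, fast_inference=True),
    "standard":     dict(chains=4, draws=2_000, tune=1_000, target_accept=0.95, fast_inference=False),
    "high_quality": dict(chains=6, draws=3_000, tune=1_500, target_accept=0.98, fast_inference=False),
}

# Start from a preset, then customize individual parameters.
settings = dict(MCMC_PRESETS["standard"], target_accept=0.99)
print(settings)
```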
Settings Persistence
MCMC settings are saved with each model and persist across sessions. When you clone a model, settings are copied to the new model.
Interpreting Runtime
Time Estimates
Approximate runtime for typical MMM models (20-30 variables, 52-104 observations):
Fast Exploration: 2 chains × 1,000 draws, 30-90 seconds
Standard: 4 chains × 2,000 draws, 2-5 minutes
High Quality: 6 chains × 3,000 draws, 5-10 minutes
Complex Model: 8 chains × 2,500 draws, 8-15 minutes
Factors Affecting Speed
Number of Variables: More variables = longer runtime (roughly linear relationship)
Number of Observations: More data points = longer runtime (sub-linear relationship)
Model Complexity: Adstock transformations and priors increase computational cost
Hardware: CPU speed and core count directly impact parallel chain execution
Target Accept: Higher values slow sampling (each step more computationally expensive)
Progress Monitoring
During MCMC sampling, a progress indicator shows:
Current chain being sampled
Percentage complete for each chain
Estimated time remaining
Memory usage
Performance Optimization
Parallel Processing
MCMC chains run in parallel across CPU cores:
4 chains use 4 cores optimally
8 chains benefit from 8+ core processors
Running more chains than cores still works, but chains are scheduled across the available cores, so the speedup is limited by core count
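In a PyMC-style call (illustrative), the chain-to-core mapping is explicit. Continuing from the sampling sketch above:

```python
# Four chains across four cores; if fewer cores are available,
# extra chains queue and run as cores free up.
with model:
    idata = pm.sample(draws=2_000, tune=1_000, chains=4, cores=4,
                      random_seed=42)
```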
Memory Management
Total memory usage scales with: Chains × Draws × Variables
Example: 4 chains × 2,000 draws × 30 variables ≈ 240,000 stored parameter values in memory
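A back-of-envelope estimate for the raw posterior array (float64 values; sampler statistics, gradients, and intermediate copies add substantial overhead on top of this):

```python
chains, draws, variables = 4, 2_000, 30
n_values = chains * draws * variables   # 240,000 stored values
bytes_f64 = n_values * 8                # 8 bytes per float64 value
print(f"{n_values:,} values ≈ {bytes_f64 / 1e6:.1f} MB")  # ≈ 1.9 MB raw
```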
Tips:
Reduce chains or draws if running out of memory
Close other applications during sampling
Use Fast Inference for memory-constrained systems
Upgrade RAM for routinely large models
Stopping Criteria
MCMC continues until all chains complete the specified draws. You cannot stop early without losing all progress for that run.
Best Practice: Start with quick settings, verify convergence, then run longer if needed.
Common Issues
Issue 1: Divergent Transitions
Symptom: Warning messages about divergences during sampling
Cause: Sampler struggled to explore regions of high curvature in the posterior
Solutions:
Increase target_accept from 0.95 to 0.98
Increase tuning steps from 1,000 to 1,500+
Check for very strong priors that conflict with data
Re-parameterize model (remove highly correlated variables)
Issue 2: Poor R-hat Values
Symptom: R-hat > 1.05 in convergence diagnostics
Cause: Chains haven't converged to the same distribution
Solutions:
Increase number of draws
Increase tuning steps
Add more chains
Check for model misspecification
Issue 3: Low Effective Sample Size (ESS)
Symptom: ESS < 10% of total samples
Cause: High autocorrelation in samples (inefficient exploration)
Solutions:
Increase draws to compensate
Increase tuning for better step size adaptation
Simplify model if possible
Check for highly correlated parameters
Issue 4: Very Long Runtime
Symptom: Sampling takes much longer than expected
Cause: Complex model or inefficient sampling
Solutions:
Enable Fast Inference mode for initial exploration
Reduce target_accept from 0.95 to 0.90
Check CPU usage (ensure parallel processing is working)
Reduce number of variables
Simplify adstock transformations
Issue 5: Out of Memory
Symptom: Browser crashes or memory error during sampling
Cause: Insufficient RAM for chain storage
Solutions:
Reduce number of chains (8 → 4 → 2)
Reduce draws per chain (2,000 → 1,500 → 1,000)
Enable Fast Inference mode (uses less memory)
Close other browser tabs and applications
Use a machine with more RAM for large models
Best Practices
Start Conservative: Begin with standard settings (4 chains, 2,000 draws). Only adjust if diagnostics indicate issues.
Check Diagnostics First: Don't increase settings blindly. Look at R-hat, ESS, and divergences to guide adjustments.
Iterate Efficiently: Use Fast Inference during development, switch to full MCMC for final models.
Document Settings: Record which MCMC settings were used for each model version for reproducibility.
Balance Speed and Accuracy: More samples aren't always better. Once convergence is good, additional samples provide diminishing returns.
Monitor Resources: Watch memory usage and CPU utilization. Optimize settings based on your hardware capabilities.
Test Sensitivity: Run the same model with different settings (e.g., 2 chains vs 6 chains) to ensure results are stable.
When to Accept Results
Your MCMC results are reliable when:
✓ R-hat < 1.01 for all parameters (convergence achieved)
✓ Bulk ESS > 400 and tail ESS > 400 for all parameters (sufficient independent samples)
✓ Divergences = 0 or very few (< 1% of samples) (no exploration failures)
✓ Chains overlap in trace plots (visual convergence confirmation)
✓ Posterior distributions are smooth (not jagged or multimodal without reason)
If these criteria aren't met, adjust settings and rerun before interpreting results.
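These checks can be scripted against the sampling output. Continuing from the sampling sketch above with ArviZ (an assumption about tooling; MixModeler reports the same diagnostics in its interface):

```python
import arviz as az

summary = az.summary(idata)  # per-parameter r_hat, ess_bulk, ess_tail, ...
n_div = int(idata.sample_stats["diverging"].sum())
total = idata.posterior.sizes["chain"] * idata.posterior.sizes["draw"]

ok = (
    (summary["r_hat"] < 1.01).all()
    and (summary["ess_bulk"] > 400).all()
    and (summary["ess_tail"] > 400).all()
    and n_div < 0.01 * total
)
az.plot_trace(idata)  # visual check: chains should overlap
print("Safe to interpret results" if ok else "Adjust settings and rerun")
```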
Technical Notes
Algorithm: MixModeler uses the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo that automatically tunes step sizes and trajectory lengths.
Adaptation: During tuning, NUTS learns the mass matrix (inverse covariance of parameters) to efficiently navigate the posterior geometry.
Warm-Up: Tuning samples are critical for efficiency but are discarded from final results. Never reduce tuning below 500 steps.
Target Accept: Controls the acceptance probability of proposals. NUTS dynamically adjusts to achieve this target during tuning.
Effective Sample Size: ESS accounts for autocorrelation. ESS of 400 from 8,000 samples means ~400 independent draws of information.
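For reference, the standard autocorrelation-based definition, where N is the total number of draws and ρ_t is the lag-t autocorrelation:

```latex
\mathrm{ESS} = \frac{N}{1 + 2\sum_{t=1}^{\infty} \rho_t}
```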
Next Steps: After configuring MCMC settings, learn about interpreting Credible Intervals from your posterior samples, or dive into Convergence Diagnostics to validate your results.