OLS vs Bayesian Selection
Understanding and Switching Between Modeling Approaches
Overview
MixModeler supports two statistical approaches: OLS (Ordinary Least Squares) and Bayesian modeling. You can switch between them anytime, and both use the same variables and model structure. The difference lies in how coefficients are estimated and what additional information you get.
Key Concept: Same model specification, different statistical frameworks
The Two Approaches
OLS (Ordinary Least Squares)
Statistical Framework: Frequentist
What it does:
- Estimates a single "best" coefficient for each variable 
- Minimizes sum of squared residuals 
- Provides point estimates only 
- Classical regression approach 
Output:
- Coefficient (β) 
- Standard error 
- T-statistic 
- P-value 
- 95% confidence interval 
- R² 
Computation:
- Fast (milliseconds) 
- Deterministic (same result every time) 
- No sampling needed 
When to use:
- Default starting point 
- Exploratory analysis 
- Quick iterations 
- When you don't have prior knowledge 
- Stakeholders expect traditional statistics 
Bayesian Modeling
Statistical Framework: Bayesian
What it does:
- Incorporates prior beliefs about coefficients 
- Uses MCMC sampling to estimate posterior distributions 
- Provides full probability distributions 
- Quantifies uncertainty 
Output:
- Posterior mean (similar to coefficient) 
- Posterior standard deviation 
- 95% credible interval 
- Full posterior distribution 
- R-hat (convergence diagnostic) 
- Effective sample size 
Computation:
- Slower (seconds to minutes) 
- Stochastic (small variations each run) 
- Requires MCMC sampling 
When to use:
- You have expert priors 
- Need uncertainty quantification 
- Want probabilistic statements 
- Final production models 
- Stakeholders understand Bayesian inference 
Switching Between OLS and Bayesian
How to Switch
In Model Library:
- Locate model in table 
- Click the Type button (shows "OLS" or "BAY") 
- Type toggles immediately 
- All pages update to reflect new type 
In Model Builder:
- Use the model type toggle at top 
- Interface updates immediately 
- Statistics shown change based on type 
Effect of switching:
- Model specification unchanged (same variables) 
- Results recalculated in new framework 
- Interface adapts to show relevant statistics 
- All other pages (Diagnostics, Decomposition) use selected type 
What Changes When You Switch
Interface changes:
OLS mode shows:
- Coefficient Type (Floating/Fixed) 
- Coefficient Value input 
- Standard error 
- T-statistic 
- P-value 
Bayesian mode shows:
- Prior Distribution dropdown 
- Prior Mean input 
- Prior Std Dev input 
- Posterior Mean 
- Posterior Std Dev 
Statistics reported:
OLS:
- Point estimates 
- Frequentist confidence intervals 
- P-values 
- F-statistics 
Bayesian:
- Posterior distributions 
- Credible intervals 
- Posterior probabilities 
- WAIC, LOO (model comparison metrics) 
OLS Mode Details
Coefficient Estimation
How it works:
- Minimizes sum of squared errors 
- Solves normal equations 
- Returns single best estimate per variable 
Assumptions:
- Linear relationship 
- Normally distributed errors 
- Homoscedastic errors 
- Independent observations 
- No perfect multicollinearity 
Advantages:
- Fast computation 
- Familiar to stakeholders 
- Standard in econometrics 
- Easy interpretation 
Limitations:
- No uncertainty quantification beyond std error 
- Sensitive to outliers 
- Can't incorporate prior knowledge 
- Point estimates only 
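The estimation steps above can be sketched with numpy. This is an illustrative implementation of textbook OLS (solving the normal equations directly), not MixModeler's internal code; the toy data and variable names are made up for the example.

```python
import numpy as np

# Toy data: KPI driven by two media variables plus noise.
rng = np.random.default_rng(42)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
beta_true = np.array([50.0, 3.0, -1.5])
y = X @ beta_true + rng.normal(0, 2.0, n)

# Solve the normal equations (X'X) beta = X'y for the point estimates.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Standard errors from the estimated error variance.
resid = y - X @ beta_hat
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(XtX)))

# R^2: share of KPI variance explained by the model.
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
print(beta_hat, se, r2)
```

Note the output is deterministic: rerunning with the same data always returns exactly the same coefficients.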
Fixed vs Floating Coefficients
Floating (Default):
- Coefficient estimated by regression 
- Model determines best value 
- Normal use case 
Fixed:
- You specify exact coefficient value 
- Useful for sensitivity analysis 
- Advanced feature 
- Example: "What if TV coefficient was exactly 1000?" 
How to use:
- Select "Fixed" in Coefficient Type 
- Enter desired value 
- Add variable 
- Regression estimates the other coefficients, given the fixed value 
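One common way a fixed coefficient can be handled (illustrative only; not necessarily MixModeler's exact implementation) is to subtract the fixed contribution from the KPI and regress the remainder on the free variables:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 80
tv = rng.uniform(0, 10, n)      # variable whose coefficient we fix
radio = rng.uniform(0, 5, n)    # variable left floating
y = 20 + 1000 * tv + 4 * radio + rng.normal(0, 3, n)

# "What if the TV coefficient was exactly 1000?" Remove that fixed
# contribution, then estimate the remaining coefficients as usual.
fixed_coef = 1000.0
y_adj = y - fixed_coef * tv
X = np.column_stack([np.ones(n), radio])
beta_free, *_ = np.linalg.lstsq(X, y_adj, rcond=None)
print(beta_free)  # intercept and radio coefficient, given TV fixed at 1000
```

This makes fixed coefficients useful for sensitivity analysis: vary `fixed_coef` and watch how the remaining estimates respond.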
Interpreting OLS Results
Coefficient:
- Units: Change in KPI per unit change in variable 
- Sign: Positive (increases KPI) or Negative (decreases KPI) 
- Magnitude: Strength of relationship 
T-statistic:
- Measures how many standard errors coefficient is from zero 
- |t| > 1.96: Significant at 95% confidence 
- |t| > 2.58: Significant at 99% confidence 
- Target: |t| > 2.0 
P-value:
- Probability of observing a coefficient at least this extreme if the true value is zero 
- < 0.05: Significant 
- < 0.01: Highly significant 
- > 0.10: Not significant, consider removing 
Confidence Interval:
- Range of plausible values for the true coefficient; over repeated samples, 95% of such intervals contain it 
- Narrow interval: Precise estimate 
- Wide interval: Uncertain estimate 
- Excludes zero: Variable is significant 
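The quantities above are all derived from the coefficient and its standard error. A minimal sketch, using the normal approximation from the Python standard library (the illustrative numbers are made up; exact OLS inference uses the t distribution with n − k degrees of freedom):

```python
from statistics import NormalDist

# Illustrative numbers, not output from any particular model.
coef, se = 520.0, 210.0

t_stat = coef / se                       # standard errors away from zero
# Two-sided p-value via the normal approximation (fine for large samples).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
ci_low, ci_high = coef - 1.96 * se, coef + 1.96 * se

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, 95% CI = [{ci_low:.0f}, {ci_high:.0f}]")
```

Here |t| > 1.96 and the interval excludes zero, so the variable would count as significant at the 95% level.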
Bayesian Mode Details
Prior Distributions
What are priors: Your belief about coefficient value BEFORE seeing the data
Why use priors:
- Incorporate expert knowledge 
- Regularize estimates (prevent overfitting) 
- Handle collinearity better 
- More realistic in small samples 
Available distributions:
Normal (Default):
- Symmetric around mean 
- Most common choice 
- Parameters: mean, std dev 
- Use when: No strong directional belief 
Student-t:
- Heavier tails than Normal 
- More robust to outliers 
- Parameters: mean, std dev, degrees of freedom 
- Use when: Expect occasional extreme values 
Laplace (Double Exponential):
- Sharper peak, heavier tails 
- Promotes sparsity 
- Parameters: mean, scale 
- Use when: Some coefficients should be near zero 
Horseshoe:
- Strong sparsity inducing 
- Shrinks small coefficients to zero 
- Keeps large coefficients 
- Use when: Many variables, few truly important 
Uniform:
- All values in range equally likely 
- Non-informative 
- Parameters: lower bound, upper bound 
- Use when: Know range but nothing else 
Half-Normal (Positive Only):
- Only positive values allowed 
- Normal distribution truncated at zero 
- Use when: Coefficient must be positive (e.g., marketing spend effect) 
Exponential (Positive/Negative):
- Decaying probability 
- Favors values near zero 
- Use when: Small effects expected 
Gamma/Inverse Gamma:
- Positive values only 
- Flexible shapes 
- Use when: Positive coefficients, specific shape needed 
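The shapes of several of these families can be compared by drawing samples with numpy (the scale choices below are arbitrary, picked only to illustrate the differences in support and tail behavior):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Draws from several of the prior families above.
priors = {
    "normal":      rng.normal(0, 1000, n),
    "student_t":   1000 * rng.standard_t(4, n),     # heavier tails than Normal
    "laplace":     rng.laplace(0, 700, n),          # sharper peak at zero
    "uniform":     rng.uniform(-2000, 2000, n),
    "half_normal": np.abs(rng.normal(0, 1000, n)),  # positive-only support
}

for name, draws in priors.items():
    print(f"{name:>11}: mean={draws.mean():8.1f}, P(>0)={np.mean(draws > 0):.2f}")
```

Note how Half-Normal places zero mass below zero, which is what enforces a positive-only coefficient.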
Setting Prior Parameters
Prior Mean:
- Your best guess for coefficient value 
- Example: "I think TV coefficient is around 500" 
- Set to 0 if no strong belief 
Prior Std Dev:
- Your uncertainty about the mean 
- Small std dev (e.g., 50): Strong belief, narrow prior 
- Large std dev (e.g., 1000): Weak belief, diffuse prior 
- Very large (e.g., 10000): Nearly non-informative 
Common approaches:
Weakly informative (recommended default):
- Prior mean = 0 
- Prior std dev = 1000 
- Allows data to dominate 
- Mild regularization 
Informative:
- Prior mean = expert estimate 
- Prior std dev = reasonable uncertainty 
- Use when you have strong domain knowledge 
- Example: Mean=500, Std=200 for TV based on previous studies 
Sign constraints:
- Use Half-Normal for positive-only 
- Use an Exponential with negative support for negative-only 
- Prevents nonsensical estimates 
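For a single coefficient with a Normal prior and an approximately Normal data estimate, the posterior mean is a precision-weighted average of the prior mean and the data estimate, which makes the effect of the prior std dev concrete. A self-contained sketch of this conjugate update (the numbers are illustrative):

```python
def posterior_mean(prior_mean, prior_sd, data_est, data_se):
    """Conjugate Normal-Normal update for a single coefficient."""
    w_prior = 1 / prior_sd**2   # precision of the prior belief
    w_data = 1 / data_se**2     # precision of the data estimate
    return (w_prior * prior_mean + w_data * data_est) / (w_prior + w_data)

# Data alone says the coefficient is 800 (std error 150).
# Weakly informative prior (mean 0, sd 1000): the data dominates.
print(posterior_mean(0, 1000, 800, 150))   # ~782, barely shrunk
# Informative prior (mean 500, sd 200): pulled noticeably toward 500.
print(posterior_mean(500, 200, 800, 150))  # ~692
```

This is why a very large prior std dev behaves as nearly non-informative: its precision is tiny, so the data term wins.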
Running Bayesian Inference
Important: Switching to Bayesian mode doesn't automatically run inference
Process:
- Switch model to Bayesian in Model Library 
- Configure priors for variables in Model Builder 
- Navigate to Bayesian Model Interface 
- Click "Run Inference" 
- Wait for MCMC sampling (30 seconds to 5 minutes) 
- Review convergence diagnostics 
- Results now available throughout MixModeler 
MCMC Settings:
- Chains: 4 (default, recommended) 
- Iterations: 2000 (default) 
- Warmup: 1000 (discarded) 
- Thinning: 1 (keep every sample) 
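To make the chains/iterations/warmup settings concrete, here is a minimal random-walk Metropolis sampler for a toy one-parameter model. This illustrates what MCMC sampling does in general; it is not MixModeler's sampler, and the toy model (estimating a mean with known noise) is made up for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(5.0, 2.0, size=50)   # toy observations

def log_post(theta):
    # Normal(0, 10) prior plus Normal likelihood with known sd = 2.
    lp = -0.5 * (theta / 10) ** 2
    lp += -0.5 * np.sum(((data - theta) / 2.0) ** 2)
    return lp

chains, iters, warmup = 4, 2000, 1000   # the default settings above
draws = np.empty((chains, iters - warmup))
for c in range(chains):
    theta, cur_lp = 0.0, log_post(0.0)
    for i in range(iters):
        prop = theta + rng.normal(0, 0.5)       # random-walk proposal
        prop_lp = log_post(prop)
        if np.log(rng.uniform()) < prop_lp - cur_lp:
            theta, cur_lp = prop, prop_lp       # accept the proposal
        if i >= warmup:                         # discard warmup draws
            draws[c, i - warmup] = theta
print(draws.mean(), draws.std())
```

The warmup draws are thrown away because the chains start far from the posterior; only the remaining samples feed the posterior summaries. This is also why results vary slightly between runs.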
Interpreting Bayesian Results
Posterior Mean:
- Average of posterior distribution 
- Similar interpretation to OLS coefficient 
- "Best estimate" given data and priors 
Posterior Std Dev:
- Uncertainty in coefficient estimate 
- Similar to standard error in OLS 
- Smaller = more certain 
95% Credible Interval:
- Interpretation: "95% probability true coefficient is in this range" 
- Different from confidence interval (frequentist concept) 
- Excludes zero: Strong evidence variable matters 
R-hat (Gelman-Rubin):
- Convergence diagnostic 
- < 1.01: Excellent convergence 
- < 1.05: Acceptable convergence 
- > 1.10: Poor convergence, rerun with more iterations 
Effective Sample Size (ESS):
- Number of independent samples 
- > 1000: Good 
- > 400: Acceptable 
- < 100: Poor, rerun with more iterations 
Posterior Probability:
- P(coefficient > 0) for positive effect 
- P(coefficient < 0) for negative effect 
- > 95%: Strong evidence 
- > 99%: Very strong evidence 
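All of these summaries are computed from the post-warmup chains. A sketch with synthetic chains standing in for real MCMC output (the numbers are illustrative; the R-hat below is the classic Gelman-Rubin statistic, which some tools refine with split chains):

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-in for MCMC output: 4 chains x 1000 post-warmup draws of one coefficient.
chains = rng.normal(480.0, 120.0, size=(4, 1000))

post_mean = chains.mean()                       # "best estimate"
post_sd = chains.std(ddof=1)                    # uncertainty
ci_low, ci_high = np.percentile(chains, [2.5, 97.5])  # 95% credible interval
p_positive = np.mean(chains > 0)                # P(coefficient > 0)

# Gelman-Rubin R-hat: compares between-chain and within-chain variance.
m, n = chains.shape
chain_means = chains.mean(axis=1)
B = n * chain_means.var(ddof=1)                 # between-chain variance
W = chains.var(axis=1, ddof=1).mean()           # within-chain variance
var_hat = (n - 1) / n * W + B / n
r_hat = np.sqrt(var_hat / W)
print(post_mean, (ci_low, ci_high), p_positive, r_hat)
```

Well-mixed chains agree with each other, so B stays small relative to W and R-hat lands near 1; chains stuck in different regions inflate B and push R-hat above the 1.05 threshold.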
Comparison Table
| Aspect | OLS | Bayesian |
| --- | --- | --- |
| Speed | Fast (milliseconds) | Slower (seconds to minutes) |
| Output | Point estimates | Full distributions |
| Uncertainty | Confidence intervals | Credible intervals |
| Priors | None | Incorporated |
| Interpretation | Coefficients, p-values | Posterior probabilities |
| Computation | Deterministic | Stochastic (MCMC) |
| Small samples | Can be unstable | More robust with priors |
| Multicollinearity | Problematic | Better handling with priors |
| Default choice | Yes | No (requires more setup) |
| Stakeholder familiarity | High | Low to moderate |
When to Use Each
Use OLS When:
✅ Starting model development
- Quick iterations needed 
- Exploring variable combinations 
- Testing hypotheses rapidly 
✅ Simple models
- Few variables 
- Large sample size 
- Low multicollinearity 
✅ Stakeholder requirements
- Expect traditional statistics 
- Unfamiliar with Bayesian methods 
- P-values and t-stats are standard 
✅ No prior knowledge
- First time modeling this problem 
- No historical data or expert input 
- Want data to speak for itself 
Use Bayesian When:
✅ You have prior knowledge
- Historical models 
- Expert domain knowledge 
- Theoretical constraints (e.g., positive effects) 
✅ Need uncertainty quantification
- Risk assessment required 
- Confidence bounds for forecasts 
- Probabilistic statements needed 
✅ Complex models
- Many variables 
- High multicollinearity 
- Small sample size (priors provide regularization) 
✅ Final production models
- After exploratory OLS phase 
- For optimization and decision-making 
- When full uncertainty assessment valuable 
Workflow: OLS to Bayesian
Recommended approach for most projects:
Phase 1: OLS Exploration (Days 1-2)
- Build models with OLS 
- Test variable combinations rapidly 
- Find best specification 
- Identify stable, significant variables 
- Final OLS model: R² = 78%, all variables significant 
Phase 2: Bayesian Refinement (Day 3)
- Switch final model to Bayesian 
- Set weakly informative priors (mean=0, std=1000) 
- Add sign constraints for marketing (Half-Normal positive) 
- Run Bayesian inference 
- Check convergence (R-hat < 1.01) 
Phase 3: Bayesian Analysis (Day 4)
- Review posterior distributions 
- Calculate posterior probabilities 
- Generate uncertainty-aware forecasts 
- Run optimization with uncertainty 
- Present results with credible intervals 
Benefits of this workflow:
- Fast exploration with OLS 
- Robust final estimates with Bayesian 
- Best of both worlds 
- Stakeholder-friendly progression 
Common Questions
Can I switch back and forth?
Yes! Switch anytime without losing work.
- OLS results stored separately 
- Bayesian results stored separately 
- Switch to compare approaches 
- No data loss 
Do I need to rerun inference when I switch?
Switching TO Bayesian: Yes, run inference in Bayesian Model Interface
Switching TO OLS: No, OLS results already available
After adding/removing variables: Yes, rerun inference (Bayesian) or model refits automatically (OLS)
Will my model export with both results?
Yes, if you've run both:
- Export captures current model type 
- Both OLS and Bayesian results included in Excel 
- Can reimport and switch between them 
Which is "better"?
No universal answer. Depends on:
OLS advantages:
- Faster 
- Simpler 
- More familiar 
- Standard in industry 
Bayesian advantages:
- More flexible 
- Better uncertainty quantification 
- Handles complexity better 
- More theoretically principled 
Practical answer: Start with OLS, switch to Bayesian for final model if needed
Troubleshooting
"Bayesian feature not available"
Cause: Free tier doesn't include Bayesian
Solution:
- Upgrade to Professional or Enterprise 
- Click upgrade link in dialog 
- Or continue with OLS 
Bayesian results missing after switch
Cause: Haven't run MCMC inference yet
Solution:
- Navigate to Bayesian Model Interface 
- Click "Run Inference" 
- Wait for completion 
- Results now available 
OLS and Bayesian give very different results
Possible causes:
- Strong priors pulling estimates 
- Convergence issues in Bayesian 
- Small sample size 
Solutions:
- Check R-hat (should be < 1.05) 
- Use weaker priors (larger std dev) 
- Increase MCMC iterations 
- Compare with prior predictive checks 
Can't switch to Bayesian
Cause: Model has fixed coefficients (OLS feature)
Solution:
- Remove fixed coefficients 
- Set all to "Floating" 
- Then switch to Bayesian 
Key Takeaways
- OLS is faster and simpler - great for exploration 
- Bayesian provides uncertainty quantification and prior incorporation 
- Switch anytime without losing work 
- Recommended workflow: OLS exploration → Bayesian refinement 
- Must run MCMC inference after switching to Bayesian 
- Both use same model specification (variables) 
- Export includes results from both methods if run 
- Choose based on project needs and stakeholder requirements 