Model Comparison
Evaluate Model Improvements and Variable Impact Side-by-Side
Overview
Model Comparison allows you to analyze two models simultaneously, comparing coefficients, statistical significance, and variable presence. This feature is essential for evaluating model iterations, understanding variable impact, and making data-driven decisions about model specifications.
Purpose: Compare two model versions to understand differences, improvements, and variable effects
Access: Model Library → Select exactly 2 models → Click "Compare Models"
When to Use Model Comparison
Evaluating Model Iterations
Scenario: You've built v1 and v2 of your model
Questions to answer:
- Did R² improve in v2? 
- Which variables changed significantly? 
- Are new variables contributing meaningfully? 
- Did existing coefficients shift (multicollinearity indicator)? 
Testing Variable Additions
Scenario: You added a new marketing channel or a control variable
Questions to answer:
- Did the new variable improve model fit? 
- Did it change coefficients of existing variables? 
- Is it statistically significant? 
- Is the magnitude reasonable? 
Comparing Transformation Strategies
Scenario: Model A uses raw variables, while Model B uses curve/adstock transformations
Questions to answer:
- Do transformations improve fit (R² comparison)? 
- How do transformed variable coefficients differ? 
- Are transformed variables more significant? 
- Does business interpretation change? 
Seasonal vs. Full-Year Models
Scenario: Comparing a Q4-only model to a full-year model
Questions to answer:
- Do channel effects differ seasonally? 
- Which variables become more/less important? 
- Is there seasonal variation in coefficients? 
- Should you model seasons separately? 
How to Compare Models
Step 1: Select Two Models
In the Model Library table:
- Check the box next to first model 
- Check the box next to second model 
- Leave all other models unchecked 
Important: You must select exactly 2 models
- If 0, 1, or 3+ models are selected, the Compare button is disabled 
- Guidance is shown if the wrong number of models is selected 
Step 2: Click "Compare Models" Button
The button is located in the top action bar.
Clicking it opens a comparison dialog or panel showing the side-by-side analysis.
Step 3: Review Comparison Table
The comparison interface displays variables in rows with columns for each model:
Columns shown:
- Variable Name: Name of independent variable 
- Model 1 Coefficient: β value in first model 
- Model 1 T-Stat: Statistical significance in first model 
- Model 2 Coefficient: β value in second model 
- Model 2 T-Stat: Statistical significance in second model 
- Coefficient Change (%): Percentage change between models 
- T-Stat Change: Difference in significance 
- In Model 1: ✓ if variable present 
- In Model 2: ✓ if variable present 
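If you want to reproduce this comparison table outside the tool, a minimal pandas sketch follows; the variables, coefficients, and t-stats are purely illustrative stand-ins for your two fitted models:

```python
import pandas as pd

# Coefficients and t-stats for two models; values are purely illustrative
# and would normally come from the two fitted models being compared.
m1 = pd.DataFrame({"coef": [1200.0, 850.0], "t_stat": [4.2, 3.1]},
                  index=["TV", "Digital"])
m2 = pd.DataFrame({"coef": [1150.0, 820.0, 400.0], "t_stat": [4.0, 2.9, 2.3]},
                  index=["TV", "Digital", "Radio"])

# An outer join keeps variables that appear in only one of the models.
comparison = m1.join(m2, how="outer", lsuffix="_m1", rsuffix="_m2")
comparison["coef_change_pct"] = ((comparison["coef_m2"] - comparison["coef_m1"])
                                 / comparison["coef_m1"] * 100)
comparison["t_stat_change"] = comparison["t_stat_m2"] - comparison["t_stat_m1"]
comparison["in_model_1"] = comparison["coef_m1"].notna()
comparison["in_model_2"] = comparison["coef_m2"].notna()
print(comparison)
```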
Understanding the Comparison
Coefficient Changes
Percentage change calculation:
Change = ((Model2_Coef - Model1_Coef) / Model1_Coef) × 100%
Interpretation:
Small changes (<10%):
- ✓ Good sign: coefficients are stable
- ✓ Model structure is robust
- ✓ Variables not materially affected

Moderate changes (10-30%):
- ⚠️ Some structural change
- ⚠️ Review any new variables added
- ⚠️ Check for multicollinearity

Large changes (>30%):
- 🚨 Significant structural shift
- 🚨 Likely multicollinearity introduced
- 🚨 May indicate model instability
- 🚨 Investigate variable relationships

Sign flip (+ to − or vice versa):
- 🚨 Red flag: serious issue
- 🚨 Check for perfect multicollinearity
- 🚨 Review variable definitions
- 🚨 Validate data quality
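These bands are easy to encode as a simple triage rule. A small sketch using the thresholds above (the function name and example values are hypothetical):

```python
def flag_coefficient_change(coef_m1: float, coef_m2: float) -> str:
    """Triage a coefficient shift using the bands described above."""
    if coef_m1 * coef_m2 < 0:
        return "sign flip - red flag, check multicollinearity and data quality"
    change_pct = abs((coef_m2 - coef_m1) / coef_m1) * 100
    if change_pct < 10:
        return f"stable ({change_pct:.1f}% change)"
    if change_pct <= 30:
        return f"moderate shift ({change_pct:.1f}%) - review new variables, check VIF"
    return f"large shift ({change_pct:.1f}%) - investigate model stability"

print(flag_coefficient_change(1200.0, 1150.0))  # stable (4.2% change)
print(flag_coefficient_change(1200.0, -300.0))  # sign flip - red flag, ...
```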
T-Statistic Changes
The t-statistic measures the statistical reliability of a coefficient estimate: it is the coefficient divided by its standard error.
Interpretation:
T-stat increased:
- ✓ Variable became more significant
- ✓ Improved model specification
- ✓ Better data fit

T-stat decreased slightly:
- ⚠️ Variable is less reliable
- ⚠️ May indicate multicollinearity
- ⚠️ Check VIF statistics

T-stat dropped below 2.0:
- 🚨 Variable no longer significant
- 🚨 Consider removing it from the model
- 🚨 May be redundant with new variables

T-stat flipped sign (e.g., positive to negative):
- 🚨 The coefficient's sign flipped: a serious structural issue
- 🚨 Check for multicollinearity and data problems
- 🚨 Investigate immediately
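For reference, the computation behind the t-stat columns is just this ratio; the values below are illustrative:

```python
def t_stat(coefficient: float, std_error: float) -> float:
    """t-statistic = coefficient / standard error."""
    return coefficient / std_error

t = t_stat(1150.0, 290.0)              # illustrative values
print(f"t = {t:.2f}")                  # t = 3.97
print(f"significant: {abs(t) > 2.0}")  # True under the |t| > 2 rule of thumb
```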
Variable Presence
Variable in Model 1 but not Model 2:
- Variable was removed 
- Check if removal improved R² 
- Confirm removal was intentional 
Variable in Model 2 but not Model 1:
- New variable added 
- Review its coefficient and significance 
- Assess contribution to R² 
Variable in both models:
- Compare coefficients and t-stats 
- Look for stability or concerning changes 
- Validate business interpretation consistency 
Comparison Scenarios
Scenario 1: Adding New Variable
Setup:
- Model 1: Sales ~ TV + Digital + Seasonality 
- Model 2: Sales ~ TV + Digital + Radio + Seasonality 
What to look for:
Radio coefficient:
- Is it positive? (More radio → more sales) 
- Is it significant? (T-stat > 2.0) 
- Is magnitude reasonable? (Compared to other channels) 
Existing variable changes:
- Did TV/Digital coefficients change much? 
- <10% change: Good, model stable 
- >30% change: Multicollinearity warning 
Model fit:
- Did R² increase? 
- >2% increase: Meaningful improvement 
- <1% increase: Radio adds little value 
Decision:
- Keep Radio if significant, reasonable coefficient, and stable model 
- Remove Radio if non-significant or causes instability 
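The same check can be reproduced outside the tool with ordinary least squares, for example via statsmodels; the sketch below uses synthetic data, and all column names and coefficients are illustrative:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic weekly data standing in for your modeling dataset
# (all names and values are illustrative).
rng = np.random.default_rng(0)
n = 104
df = pd.DataFrame({
    "TV": rng.uniform(0, 100, n),
    "Digital": rng.uniform(0, 80, n),
    "Radio": rng.uniform(0, 40, n),
    "Seasonality": np.sin(np.arange(n) * 2 * np.pi / 52),
})
df["Sales"] = (500 + 12 * df.TV + 9 * df.Digital + 5 * df.Radio
               + 200 * df.Seasonality + rng.normal(0, 50, n))

m1 = smf.ols("Sales ~ TV + Digital + Seasonality", data=df).fit()
m2 = smf.ols("Sales ~ TV + Digital + Radio + Seasonality", data=df).fit()

print(f"R²: {m1.rsquared:.3f} -> {m2.rsquared:.3f}")
print(f"Radio coef: {m2.params['Radio']:.1f}, t-stat: {m2.tvalues['Radio']:.1f}")

# Stability of existing coefficients (<10% is the 'stable' band above)
for var in ("TV", "Digital"):
    pct = (m2.params[var] - m1.params[var]) / m1.params[var] * 100
    print(f"{var} coefficient change: {pct:+.1f}%")
```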
Scenario 2: Applying Transformation
Setup:
- Model 1: Sales ~ TV_Spend + Digital_Spend 
- Model 2: Sales ~ TV_Spend|ICP_ATAN + Digital_Spend|ADBUG_ATAN 
What to look for:
R² improvement:
- Curves should increase R² significantly 
- +5-15% R²: Curves add substantial value 
- <3% R²: Curves may be unnecessary 
Coefficient interpretation:
- Curved variable coefficients not directly comparable to raw 
- Focus on significance and sign 
- Use decomposition for ROI comparison 
Business sense:
- Do curve shapes match channel behavior? 
- ICP for brand TV makes sense (threshold effects) 
- ADBUG for performance digital makes sense (immediate diminishing returns) 
Decision:
- Keep curves if significant R² improvement and business logic sound 
- Revert to linear if minimal improvement 
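The exact ICP_ATAN and ADBUG_ATAN definitions are applied inside the platform; as a generic illustration of the shape such curves produce, here is a simple arctangent saturation sketch (the function and its scale parameter are assumptions, not the platform's formulas):

```python
import numpy as np

def atan_saturation(spend: np.ndarray, scale: float) -> np.ndarray:
    """Generic arctangent saturation: output flattens as spend grows.

    An illustrative diminishing-returns curve, not the platform's
    ICP_ATAN/ADBUG_ATAN definition - those are applied inside the tool.
    """
    return np.arctan(spend / scale)

spend = np.array([0.0, 10.0, 50.0, 100.0, 500.0])
print(atan_saturation(spend, scale=50.0))
# Output grows quickly at low spend, then flattens toward pi/2 ≈ 1.57
```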
Scenario 3: Simplifying Model
Setup:
- Model 1: Sales ~ TV + Digital + Radio + Print + Outdoor + 10 controls 
- Model 2: Sales ~ TV + Digital + Radio + Key_Controls (removed weak variables) 
What to look for:
R² change:
- <3% decrease: Simplification successful 
- >5% decrease: Removed variables were important 
Remaining variable stability:
- Did coefficients of kept variables change much? 
- Stable: Good, removed variables were truly redundant 
- Large changes: Removed variables were suppressing/confounding 
Significance improvement:
- Did t-stats of remaining variables increase? 
- Removing multicollinear variables often improves significance 
Decision:
- Keep simplified model if R² loss < 3% and coefficients more stable 
- Revert if R² drops too much or signs flip 
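Because the simplified model is nested inside the full one, a partial F-test gives a formal read on whether the dropped variables mattered. A statsmodels sketch on synthetic data (all names and values are illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Illustrative data; Print and Outdoor intentionally contribute nothing here.
rng = np.random.default_rng(1)
n = 104
df = pd.DataFrame(rng.uniform(0, 100, (n, 5)),
                  columns=["TV", "Digital", "Radio", "Print", "Outdoor"])
df["Sales"] = (400 + 10 * df.TV + 8 * df.Digital + 4 * df.Radio
               + rng.normal(0, 60, n))

full = smf.ols("Sales ~ TV + Digital + Radio + Print + Outdoor", data=df).fit()
small = smf.ols("Sales ~ TV + Digital + Radio", data=df).fit()

print(f"R²: full {full.rsquared:.3f} vs simplified {small.rsquared:.3f}")
# Partial F-test: a large p-value says the dropped variables add nothing.
print(anova_lm(small, full))
```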
Scenario 4: Seasonal Split
Setup:
- Model 1: Full year model 
- Model 2: Q4-only model (filtered observations) 
What to look for:
Coefficient differences:
- Do channel effects differ seasonally? 
- TV coefficient higher in Q4: Seasonal lift from holiday messaging 
- Digital stable: Less seasonal variation 
Significance changes:
- Are some channels only significant in Q4? 
- Indicates seasonal channel strategy needed 
R² comparison:
- Not directly comparable (different data) 
- Focus on coefficient patterns 
Decision:
- If large seasonal differences, model seasons separately 
- If coefficients similar, full-year model sufficient 
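Outside the tool, a seasonal split is just a row filter applied before fitting. A sketch assuming weekly data with a DatetimeIndex (names and effect sizes are illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Two years of illustrative weekly data with a stronger Q4 TV effect.
rng = np.random.default_rng(2)
idx = pd.date_range("2022-01-03", periods=104, freq="W-MON")
df = pd.DataFrame({"TV": rng.uniform(0, 100, 104)}, index=idx)
is_q4 = df.index.quarter == 4
df["Sales"] = 300 + np.where(is_q4, 15, 9) * df.TV + rng.normal(0, 40, 104)

full_year = smf.ols("Sales ~ TV", data=df).fit()
q4_only = smf.ols("Sales ~ TV", data=df[is_q4]).fit()
print(f"TV coef, full year: {full_year.params['TV']:.1f}")
print(f"TV coef, Q4 only:   {q4_only.params['TV']:.1f}")
```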
Best Practices
Before Comparing
Ensure models are comparable:
- Same KPI (dependent variable) 
- Similar time periods (unless testing seasonality) 
- Same data source 
- Similar specification (not comparing apples to oranges) 
Know what changed:
- Document differences between models 
- List variables added/removed 
- Note transformation changes 
- Record motivation for changes 
During Comparison
Focus on key metrics:
- Overall fit: R² change 
- Coefficient stability: <10% change is good 
- Significance: T-stats > 2.0 
- Sign consistency: Coefficients should keep same sign 
- Business logic: Changes should make sense 
Look for red flags:
- Sign flips 
- Coefficient changes of 50% or more 
- T-stats dropping below 2.0 
- Unreasonable magnitudes 
After Comparison
Document findings:
- Which model is better and why 
- Key differences and implications 
- Decision rationale (keep v2, revert to v1, etc.) 
- Next steps for further improvement 
Take action:
- Keep better performing model 
- Delete inferior version 
- If results unclear, run diagnostics on both 
- Consider hybrid approach (some changes from v2, not all) 
Interpreting Results
Model 2 is Better If:
- ✅ R² increased by >2%
- ✅ Coefficients remain stable (<10% change)
- ✅ New variables are significant
- ✅ Signs remain consistent
- ✅ Business interpretation improved
- ✅ Diagnostics improved (lower VIF, better residuals)
Model 1 is Better If:
- ✅ R² barely changed or decreased
- ✅ Coefficients became unstable (>30% changes)
- ✅ New variables are non-significant
- ✅ Signs flipped unexpectedly
- ✅ Model became harder to interpret
- ✅ Diagnostics worsened
Need More Investigation If:
- ⚠️ R² improved but coefficients are unstable
- ⚠️ R² decreased but coefficients are more significant
- ⚠️ Mixed results (some variables better, some worse)
- ⚠️ Unexpected patterns that don't match theory
Next steps: Run full diagnostics, check VIF, review data quality
Comparison Limitations
What Comparison Doesn't Show
- ❌ Diagnostic quality: must be checked separately (normality, autocorrelation, heteroscedasticity)
- ❌ Multicollinearity: use VIF testing in Model Builder
- ❌ Residual patterns: check the diagnostic plots
- ❌ Out-of-sample performance: the comparison uses the same data for both models
- ❌ Decomposition results: how variables contribute to the KPI over time
When to Use Other Tools
- For multicollinearity: Variable Testing → VIF Analysis
- For residual quality: Model Diagnostics
- For contribution analysis: Decomposition Analysis
- For causality testing: Variable Testing → Granger Causality
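For the multicollinearity check specifically, VIF can also be computed directly with statsmodels; a sketch over an illustrative predictor table:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Illustrative predictors; Digital and Search are deliberately correlated
# so that their VIFs come out elevated.
rng = np.random.default_rng(3)
X = pd.DataFrame({"TV": rng.uniform(0, 100, 104),
                  "Digital": rng.uniform(0, 80, 104)})
X["Search"] = 0.9 * X["Digital"] + rng.normal(0, 5, 104)

Xc = sm.add_constant(X)  # VIF should be computed with an intercept included
for i, col in enumerate(Xc.columns):
    if col != "const":
        print(f"{col}: VIF = {variance_inflation_factor(Xc.values, i):.1f}")
# Common rules of thumb flag VIF > 5 (strict) or > 10 (lenient).
```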
Key Takeaways
- Compare exactly 2 models side-by-side to evaluate changes 
- Focus on R² improvement, coefficient stability, and significance 
- Coefficient changes <10% indicate stable model structure 
- Sign flips and >30% changes are red flags requiring investigation 
- Use comparison to validate model iterations and variable additions 
- Document findings and decision rationale for each comparison 
- Combine with diagnostics and VIF testing for comprehensive evaluation 