Model Comparison

Evaluate Model Improvements and Variable Impact Side-by-Side

Overview

Model Comparison allows you to analyze two models simultaneously, comparing coefficients, statistical significance, and variable presence. This feature is essential for evaluating model iterations, understanding variable impact, and making data-driven decisions about model specifications.

Purpose: Compare two model versions to understand differences, improvements, and variable effects

Access: Model Library → Select exactly 2 models → Click "Compare Models"

When to Use Model Comparison

Evaluating Model Iterations

Scenario: You've built v1 and v2 of your model

Questions to answer:

  • Did R² improve in v2?

  • Which variables changed significantly?

  • Are new variables contributing meaningfully?

  • Did existing coefficients shift (multicollinearity indicator)?

Testing Variable Additions

Scenario: You added a new marketing channel or control variable

Questions to answer:

  • Did the new variable improve model fit?

  • Did it change coefficients of existing variables?

  • Is it statistically significant?

  • Is the magnitude reasonable?

Comparing Transformation Strategies

Scenario: Model A uses raw variables, Model B uses curves/adstock

Questions to answer:

  • Do transformations improve fit (R² comparison)?

  • How do transformed variable coefficients differ?

  • Are transformed variables more significant?

  • Does business interpretation change?

Seasonal vs. Full-Year Models

Scenario: Comparing Q4-only model to full-year model

Questions to answer:

  • Do channel effects differ seasonally?

  • Which variables become more/less important?

  • Is there seasonal variation in coefficients?

  • Should you model seasons separately?

How to Compare Models

Step 1: Select Two Models

In Model Library table:

  1. Check the box next to the first model

  2. Check the box next to the second model

  3. Leave all other models unchecked

Important: Must select exactly 2 models

  • If 0, 1, or 3+ models are selected, the Compare button is disabled

  • Instructions are shown if the wrong number is selected

Step 2: Click "Compare Models" Button

Button located in the top action bar.

A comparison dialog or panel opens showing side-by-side analysis.

Step 3: Review Comparison Table

The comparison interface displays variables in rows with columns for each model (see the sketch after the column list):

Columns shown:

  • Variable Name: Name of independent variable

  • Model 1 Coefficient: β value in first model

  • Model 1 T-Stat: Statistical significance in first model

  • Model 2 Coefficient: β value in second model

  • Model 2 T-Stat: Statistical significance in second model

  • Coefficient Change (%): Percentage change between models

  • T-Stat Change: Difference in significance

  • In Model 1: ✓ if variable present

  • In Model 2: ✓ if variable present
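
To make the mechanics concrete, here is a minimal sketch of how such a side-by-side table could be assembled with pandas. The model outputs, variable names, and numbers are hypothetical; in the application, this table is generated for you.

```python
import pandas as pd

# Hypothetical {variable: (coefficient, t_stat)} output from two fitted models
model_1 = {"TV": (1250.0, 4.2), "Digital": (980.0, 3.8), "Seasonality": (410.0, 2.6)}
model_2 = {"TV": (1190.0, 4.5), "Digital": (1010.0, 4.0),
           "Radio": (520.0, 2.3), "Seasonality": (405.0, 2.7)}

rows = []
for var in sorted(set(model_1) | set(model_2)):
    c1, t1 = model_1.get(var, (None, None))
    c2, t2 = model_2.get(var, (None, None))
    in_both = var in model_1 and var in model_2
    rows.append({
        "Variable": var,
        "M1 Coef": c1, "M1 T-Stat": t1,
        "M2 Coef": c2, "M2 T-Stat": t2,
        # Change columns are only defined when the variable appears in both models
        "Coef Change (%)": round((c2 - c1) / c1 * 100, 1) if in_both else None,
        "T-Stat Change": round(t2 - t1, 2) if in_both else None,
        "In M1": var in model_1, "In M2": var in model_2,
    })

print(pd.DataFrame(rows).to_string(index=False))
```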

Understanding the Comparison

Coefficient Changes

Percentage change calculation: Change = ((Model2_Coef - Model1_Coef) / Model1_Coef) × 100%

Note: this measure is unstable when the Model 1 coefficient is near zero, so compare absolute changes in that case.

Interpretation:

Small changes (<10%):

  ✓ Good sign: coefficients stable

  ✓ Model structure robust

  ✓ Variables not significantly affected

Moderate changes (10-30%):

  ⚠️ Some structural change

  ⚠️ Review new variables added

  ⚠️ Check for multicollinearity

Large changes (>30%):

  🚨 Significant structural shift

  🚨 Likely multicollinearity introduced

  🚨 May indicate model instability

  🚨 Investigate variable relationships

Sign flip (+ to - or vice versa):

  🚨 Red flag: serious issue

  🚨 Check for perfect multicollinearity

  🚨 Review variable definitions

  🚨 Validate data quality
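
As a sketch, the formula and thresholds above can be wrapped in a small helper; the cut-offs simply mirror the guidance in this section.

```python
def classify_coef_change(coef_1: float, coef_2: float) -> str:
    """Classify how a shared variable's coefficient shifted between two models."""
    if coef_1 * coef_2 < 0:
        return "sign flip: check for perfect multicollinearity and data quality"
    if coef_1 == 0:
        return "undefined: Model 1 coefficient is zero, compare absolute change"
    pct = abs((coef_2 - coef_1) / coef_1) * 100
    if pct < 10:
        return f"stable ({pct:.1f}% change): model structure robust"
    if pct <= 30:
        return f"moderate ({pct:.1f}% change): review new variables, check multicollinearity"
    return f"large ({pct:.1f}% change): possible multicollinearity or instability"

print(classify_coef_change(1250.0, 1190.0))  # stable (4.8% change): ...
```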

T-Statistic Changes

T-stat measures statistical reliability of coefficient estimate.

Interpretation:

T-stat increased:

  ✓ Variable became more significant

  ✓ Improved model specification

  ✓ Better data fit

T-stat decreased slightly:

  ⚠️ Variable less reliable

  ⚠️ May be multicollinearity

  ⚠️ Check VIF statistics

T-stat dropped below 2.0:

  🚨 Variable no longer significant

  🚨 Consider removing from model

  🚨 May be redundant with new variables

T-stat became negative (from positive):

  🚨 Serious structural issue: the coefficient's sign flipped (a t-stat carries its coefficient's sign)

  🚨 Investigate immediately
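
A minimal helper applying these rules, assuming the t > 2.0 significance rule of thumb used throughout this guide:

```python
def flag_tstat_change(t_1: float, t_2: float, threshold: float = 2.0) -> str:
    """Flag how a variable's t-statistic moved between two model versions."""
    if t_1 > 0 and t_2 < 0:
        return "sign change: serious structural issue, investigate immediately"
    if abs(t_1) >= threshold and abs(t_2) < threshold:
        return "no longer significant: consider removing, may be redundant"
    if abs(t_2) > abs(t_1):
        return "more significant: improved specification"
    return "slightly less reliable: check VIF statistics"

print(flag_tstat_change(2.4, 1.6))  # no longer significant: ...
```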

Variable Presence

Variable in Model 1 but not Model 2:

  • Variable was removed

  • Check if removal improved R²

  • Confirm removal was intentional

Variable in Model 2 but not Model 1:

  • New variable added

  • Review its coefficient and significance

  • Assess contribution to R²

Variable in both models:

  • Compare coefficients and t-stats

  • Look for stability or concerning changes

  • Validate business interpretation consistency
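
These presence checks are simple set comparisons; a sketch with illustrative variable names:

```python
vars_1 = {"TV", "Digital", "Seasonality"}           # variables in Model 1
vars_2 = {"TV", "Digital", "Radio", "Seasonality"}  # variables in Model 2

removed = vars_1 - vars_2  # in Model 1 only: confirm removal was intentional
added = vars_2 - vars_1    # in Model 2 only: review coefficient and significance
shared = vars_1 & vars_2   # in both: compare coefficients and t-stats

print("Removed:", removed, "| Added:", added, "| Shared:", shared)
```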

Comparison Scenarios

Scenario 1: Adding New Variable

Setup:

  • Model 1: Sales ~ TV + Digital + Seasonality

  • Model 2: Sales ~ TV + Digital + Radio + Seasonality

What to look for:

Radio coefficient:

  • Is it positive? (More radio → more sales)

  • Is it significant? (T-stat > 2.0)

  • Is magnitude reasonable? (Compared to other channels)

Existing variable changes:

  • Did TV/Digital coefficients change much?

  • <10% change: Good, model stable

  • >30% change: Multicollinearity warning

Model fit:

  • Did R² increase?

  • >2% increase: Meaningful improvement

  • <1% increase: Radio adds little value

Decision:

  • Keep Radio if significant, reasonable coefficient, and stable model

  • Remove Radio if non-significant or causes instability
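
For illustration, here is how this check looks with statsmodels on simulated weekly data. The tool reports these numbers directly; the variable names, coefficients, and data below are invented for the sketch.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 104  # two years of weekly observations
df = pd.DataFrame({
    "TV": rng.uniform(0, 100, n),
    "Digital": rng.uniform(0, 80, n),
    "Radio": rng.uniform(0, 40, n),
})
df["Sales"] = 50 + 1.2 * df["TV"] + 0.9 * df["Digital"] + 0.5 * df["Radio"] + rng.normal(0, 15, n)

# Model 1 without Radio, Model 2 with Radio
m1 = sm.OLS(df["Sales"], sm.add_constant(df[["TV", "Digital"]])).fit()
m2 = sm.OLS(df["Sales"], sm.add_constant(df[["TV", "Digital", "Radio"]])).fit()

print(f"R² change: {m2.rsquared - m1.rsquared:+.3f}")  # meaningful if > +0.02
print(f"Radio t-stat: {m2.tvalues['Radio']:.2f}")      # significant if > 2.0
tv_pct = (m2.params["TV"] - m1.params["TV"]) / m1.params["TV"] * 100
print(f"TV coefficient change: {tv_pct:+.1f}%")        # stable if within ±10%
```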

Scenario 2: Applying Transformation

Setup:

  • Model 1: Sales ~ TV_Spend + Digital_Spend

  • Model 2: Sales ~ TV_Spend|ICP_ATAN + Digital_Spend|ADBUG_ATAN

What to look for:

R² improvement:

  • Curves should increase R² significantly

  • +5-15% R²: Curves add substantial value

  • <3% R²: Curves may be unnecessary

Coefficient interpretation:

  • Curved variable coefficients not directly comparable to raw

  • Focus on significance and sign

  • Use decomposition for ROI comparison

Business sense:

  • Do curve shapes match channel behavior?

  • ICP for brand TV makes sense (threshold effects)

  • ADBUG for performance digital makes sense (immediate diminishing returns)

Decision:

  • Keep curves if significant R² improvement and business logic sound

  • Revert to linear if minimal improvement
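
The exact functional forms behind ICP_ATAN and ADBUG_ATAN are specific to the tool; the sketch below uses a generic arctangent saturation curve purely to illustrate the diminishing-returns idea.

```python
import numpy as np

def atan_saturation(spend: np.ndarray, scale: float) -> np.ndarray:
    """Generic arctangent diminishing-returns curve (illustrative only, not the
    tool's exact ICP_ATAN/ADBUG_ATAN forms). Output lies in [0, 1): response
    rises quickly at low spend and flattens as spend grows."""
    return np.arctan(spend / scale) / (np.pi / 2)

spend = np.array([0.0, 10.0, 50.0, 100.0, 500.0])
print(atan_saturation(spend, scale=50.0).round(3))
# -> [0.    0.126 0.5   0.705 0.937]: each extra dollar buys less response
```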

Scenario 3: Simplifying Model

Setup:

  • Model 1: Sales ~ TV + Digital + Radio + Print + Outdoor + 10 controls

  • Model 2: Sales ~ TV + Digital + Radio + Key_Controls (removed weak variables)

What to look for:

R² change:

  • <3% decrease: Simplification successful

  • >5% decrease: Removed variables were important

Remaining variable stability:

  • Did coefficients of kept variables change much?

  • Stable: Good, removed variables were truly redundant

  • Large changes: Removed variables were suppressing/confounding

Significance improvement:

  • Did t-stats of remaining variables increase?

  • Removing multicollinear variables often improves significance

Decision:

  • Keep simplified model if R² loss < 3% and coefficients more stable

  • Revert if R² drops too much or signs flip
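
When weighing a small R² loss against a leaner specification, adjusted R² is a useful companion metric because it penalizes parameter count; a minimal helper with illustrative numbers:

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R² penalizes extra predictors, so a leaner model can come out
    ahead even when raw R² drops slightly."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

print(round(adjusted_r2(0.87, 104, 15), 3))  # full model, 15 predictors -> 0.848
print(round(adjusted_r2(0.86, 104, 6), 3))   # simplified model, 6 predictors -> 0.851
```

Here the simplified model loses a point of raw R² yet wins on adjusted R², supporting the simplification.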

Scenario 4: Seasonal Split

Setup:

  • Model 1: Full year model

  • Model 2: Q4-only model (filtered observations)

What to look for:

Coefficient differences:

  • Do channel effects differ seasonally?

  • TV coefficient higher in Q4: Seasonal lift from holiday messaging

  • Digital stable: Less seasonal variation

Significance changes:

  • Are some channels only significant in Q4?

  • Indicates seasonal channel strategy needed

R² comparison:

  • Not directly comparable (different data)

  • Focus on coefficient patterns

Decision:

  • If large seasonal differences, model seasons separately

  • If coefficients similar, full-year model sufficient
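
Outside the tool, producing the Q4-only observation set is a one-line filter; a pandas sketch assuming a hypothetical weekly dataset with a week_start date column:

```python
import pandas as pd

# Hypothetical weekly modeling dataset
df = pd.DataFrame({
    "week_start": pd.date_range("2023-01-02", periods=52, freq="W-MON"),
    "Sales": range(52),
})

q4 = df[df["week_start"].dt.quarter == 4]  # keep October-December weeks only
print(f"Full year: {len(df)} weeks | Q4 only: {len(q4)} weeks")
```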

Best Practices

Before Comparing

Ensure models are comparable:

  • Same KPI (dependent variable)

  • Similar time periods (unless testing seasonality)

  • Same data source

  • Similar specification (not comparing apples to oranges)

Know what changed:

  • Document differences between models

  • List variables added/removed

  • Note transformation changes

  • Record motivation for changes

During Comparison

Focus on key metrics:

  1. Overall fit: R² change

  2. Coefficient stability: <10% change is good

  3. Significance: T-stats > 2.0

  4. Sign consistency: Coefficients should keep same sign

  5. Business logic: Changes should make sense

Look for red flags:

  • Sign flips

  • Coefficient changes of 50% or more

  • T-stats dropping below 2.0

  • Unreasonable magnitudes

After Comparison

Document findings:

  • Which model is better and why

  • Key differences and implications

  • Decision rationale (keep v2, revert to v1, etc.)

  • Next steps for further improvement

Take action:

  • Keep better performing model

  • Delete inferior version

  • If results unclear, run diagnostics on both

  • Consider hybrid approach (some changes from v2, not all)

Interpreting Results

Model 2 is Better If:

✅ R² increased by >2%

✅ Coefficients remain stable (<10% change)

✅ New variables are significant

✅ Signs remain consistent

✅ Business interpretation improved

✅ Diagnostics improved (lower VIF, better residuals)

Model 1 is Better If:

✅ R² barely changed or decreased

✅ Coefficients became unstable (>30% changes)

✅ New variables non-significant

✅ Signs flipped unexpectedly

✅ Model became harder to interpret

✅ Diagnostics worsened

Need More Investigation If:

⚠️ R² improved but coefficients unstable

⚠️ R² decreased but coefficients more significant

⚠️ Mixed results (some variables better, some worse)

⚠️ Unexpected patterns that don't match theory

Next steps: Run full diagnostics, check VIF, review data quality

Comparison Limitations

What Comparison Doesn't Show

❌ Diagnostic quality: Must check separately (normality, autocorrelation, heteroscedasticity)

❌ Multicollinearity: Use VIF testing in Model Builder

❌ Residual patterns: Check diagnostic plots

❌ Out-of-sample performance: Comparison uses the same data for both models

❌ Decomposition results: How variables contribute to the KPI over time

When to Use Other Tools

For multicollinearity: Variable Testing → VIF Analysis

For residual quality: Model Diagnostics

For contribution analysis: Decomposition Analysis

For causality testing: Variable Testing → Granger Causality
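
For reference, the VIF check can also be reproduced outside the tool with statsmodels; a sketch on deliberately collinear, illustrative data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = pd.DataFrame({"TV": rng.uniform(0, 100, 104), "Digital": rng.uniform(0, 80, 104)})
X["Radio"] = 0.8 * X["TV"] + rng.normal(0, 5, 104)  # engineered to be collinear with TV

exog = sm.add_constant(X)
for i, col in enumerate(exog.columns):
    if col != "const":  # skip the intercept
        print(f"{col}: VIF = {variance_inflation_factor(exog.values, i):.1f}")
# TV and Radio show very high VIFs, flagging the engineered collinearity
```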

Key Takeaways

  • Compare exactly 2 models side-by-side to evaluate changes

  • Focus on R² improvement, coefficient stability, and significance

  • Coefficient changes <10% indicate stable model structure

  • Sign flips and >30% changes are red flags requiring investigation

  • Use comparison to validate model iterations and variable additions

  • Document findings and decision rationale for each comparison

  • Combine with diagnostics and VIF testing for comprehensive evaluation
