Troubleshooting Failed Tests

Systematic Approach to Fixing Diagnostic Issues

When diagnostic tests fail, follow this systematic troubleshooting framework to identify root causes and implement effective solutions. The key is to address issues methodically rather than making random changes.

Step 1: Identify the Problem

Run All Diagnostics First:

Don't stop at first failure
Complete diagnostic picture reveals patterns
Related issues often have common causes

Prioritize by Severity:

Critical: R² < 0.50, VIF > 10, DW < 1.0
Important: Moderate violations of multiple tests
Minor: Single test with marginal failure
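
The critical thresholds above can be encoded as a small triage helper. This is only a minimal sketch: the function name `critical_violations` and its arguments are hypothetical and not part of any tool described here; only the three cut-offs come from the list above.

```python
def critical_violations(r2, max_vif, dw):
    """Return the critical thresholds (from the severity list) that a model breaches."""
    checks = [
        ("R2 < 0.50", r2 < 0.50),
        ("VIF > 10", max_vif > 10),
        ("DW < 1.0", dw < 1.0),
    ]
    return [label for label, breached in checks if breached]


# Hypothetical diagnostic values for two candidate models
weak_model = critical_violations(r2=0.45, max_vif=12.0, dw=1.8)
healthy_model = critical_violations(r2=0.82, max_vif=3.1, dw=1.9)
print(weak_model)     # ['R2 < 0.50', 'VIF > 10']
print(healthy_model)  # []
```

An empty list means no critical issue; important and minor issues still need review.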

Understand Interdependencies:

Outliers can cause non-normality and heteroscedasticity
Missing variables can cause autocorrelation
Multicollinearity can inflate standard errors

Step 2: Diagnose Root Causes

Use this diagnostic decision tree:

Poor Model Fit (Low R²)

Symptom: R² < 0.60, high RMSE

Likely Causes:

Missing important marketing variables
Missing control variables (seasonality, trends, events)
Wrong functional form (need saturation/adstock)
Poor data quality
Structural breaks in relationships

Diagnostic Questions:

Which channels are missing?
Are there seasonal patterns not modeled?
Have we applied saturation curves?
Are there promotional periods not captured?

Solution Path:

Review business drivers not yet included
Add seasonal dummy variables
Apply saturation transformations
Include major events/promotions
Check data quality and coverage
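
As an illustration of the "apply saturation transformations" step, the sketch below uses a Hill curve. The choice of a Hill curve, the function name `hill_saturation`, and the parameter values are assumptions made for this example; the saturation curve your modeling tool applies may differ.

```python
import numpy as np


def hill_saturation(spend, half_saturation, shape=1.0):
    """Map spend to a 0-1 response that flattens as spend grows (Hill curve)."""
    spend = np.asarray(spend, dtype=float)
    return spend**shape / (spend**shape + half_saturation**shape)


# Hypothetical weekly spend; response reaches 0.5 at the half-saturation point
weekly_spend = np.array([0.0, 50.0, 100.0, 200.0, 400.0])
saturated = hill_saturation(weekly_spend, half_saturation=100.0)
print(saturated.round(3))  # doubling spend from 200 to 400 adds much less than 50 -> 100
```

The transformed column replaces raw spend in the model, so additional spend at high levels contributes less to the fitted KPI.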

Non-Normal Residuals

Symptom: Normality tests fail (p < 0.05), Q-Q plot points curve away from the reference line

Likely Causes:

Outliers or extreme values
Skewed dependent variable
Missing variables causing systematic errors
Wrong model specification

Diagnostic Questions:

What is the skewness value (positive/negative)?
Are there obvious outliers in the data?
Is the KPI naturally skewed (e.g., sales data)?
Do residuals show patterns in the time series plot?

Solution Path:

Check Influential Points diagnostic
Consider log transformation of KPI
Add missing variables
Add dummy variables for unusual periods
Use robust regression methods

Autocorrelation

Symptom: DW < 1.5 or > 2.5, significant Breusch-Godfrey test

Likely Causes:

Insufficient adstock decay
Missing lagged effects
Omitted time trend
Missing seasonal patterns
Carryover effects not captured

Diagnostic Questions:

Are adstock rates too low?
Are there obvious time patterns in residual plot?
Does KPI have momentum/persistence?
Are seasonal effects modeled?

Solution Path:

Increase adstock decay rates so that more of each period's effect carries over (try 60-80% for TV)
Add lagged dependent variable
Include time trend (linear or quadratic)
Add monthly/quarterly seasonality
Test distributed lag structures
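
To make the adstock adjustment concrete, here is a minimal sketch of geometric adstock. It assumes the percentage quoted above is the share of the previous period's adstocked value that carries into the current period; the function name `geometric_adstock` and the spend series are hypothetical.

```python
def geometric_adstock(spend, carryover):
    """adstock[t] = spend[t] + carryover * adstock[t-1], starting from zero."""
    adstocked = []
    previous = 0.0
    for value in spend:
        previous = value + carryover * previous
        adstocked.append(previous)
    return adstocked


# A single burst of TV spend followed by three dark weeks
tv_spend = [100.0, 0.0, 0.0, 0.0]
low = geometric_adstock(tv_spend, carryover=0.5)
high = geometric_adstock(tv_spend, carryover=0.8)
print(low)                          # [100.0, 50.0, 25.0, 12.5]
print([round(v, 1) for v in high])  # [100.0, 80.0, 64.0, 51.2]
```

With the higher rate, more of the burst is still present in later weeks, which is the persistence the residuals were otherwise absorbing.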

Heteroscedasticity

Symptom: Heteroscedasticity tests fail, fan/funnel pattern in the residuals vs fitted plot

Likely Causes:

Variance increases with KPI level
Multiplicative error structure
Missing variables affecting variance
Natural heterogeneity in data

Diagnostic Questions:

Does variance increase with fitted values?
Is KPI naturally heteroscedastic (e.g., sales)?
Are there different variance regimes (high/low spend)?
Do certain periods have more volatility?

Solution Path:

Log transform KPI (most effective for MMM)
Transform predictor variables (especially spend)
Use weighted least squares
Apply robust standard errors
Add variables explaining variance

Multicollinearity

Symptom: VIF > 10, unexpected coefficient signs, unstable estimates

Likely Causes:

Channels running together (coordinated campaigns)
Multiple channels capturing same effect
Variables measuring similar constructs
Created variables correlated with originals

Diagnostic Questions:

Which variables have VIF > 10, and which pairs among them are highly correlated?
Do these channels typically run together?
Are we double-counting effects?
Can variables be meaningfully combined?

Solution Path:

Remove one variable from correlated pairs
Combine channels into aggregates (e.g., "Digital_Total")
Use PCA for highly correlated groups
Accept moderate VIF if prediction, rather than channel attribution, is the goal
Consider sequential modeling

Too Many Influential Points

Symptom: More than 10% of observations flagged, unstable model

Likely Causes:

Data quality issues (errors, duplicates)
Major events not modeled
Structural breaks
Promotional periods

Diagnostic Questions:

Are these data entry errors?
Do influential points have business explanations?
Are there special events those weeks?
Is there a structural change in the relationship?

Solution Path:

Verify data accuracy
Add dummy variables for special events
Split analysis into before/after periods
Remove confirmed data errors
Use robust regression methods

Common Scenarios and Solutions

Scenario 1: High R² but Failed Normality and Heteroscedasticity

What's Happening:

Model fits well overall
Error distribution is problematic
Likely skewed dependent variable

Solution:

1. Log transform the KPI
2. Re-fit the model
3. Re-run diagnostics
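4. Compare the new normality and heteroscedasticity results with the earlier run

Expected Outcome: Both tests usually improve after log transformation; if either still fails, continue with Recipe 1 or Recipe 4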

Scenario 2: Moderate R² with Severe Autocorrelation

What's Happening:

Missing temporal dynamics
Model doesn't capture persistence
Standard errors unreliable

Solution:

1. Check current adstock settings
2. Increase decay rates (try 70% for TV, 50% for digital)
3. Add lagged KPI variable if autocorrelation is still present
4. Include quarterly dummies
5. Re-run diagnostics

Expected Outcome: DW should move toward 2.0, R² should increase

Scenario 3: Low R² with Multiple Variables at VIF > 10

What's Happening:

Multicollinearity masking true effects
Can't isolate channel impacts
Need to simplify model

Solution:

1. Review correlation matrix
2. Identify correlated channel groups
3. Combine: TV + Display → "Brand_Media"
4. Combine: Search + Social → "Performance_Media"
5. Re-fit simplified model
6. Check if R² improves and VIF decreases

Expected Outcome: VIF < 5, clearer coefficients, possibly higher R²

Scenario 4: Good Statistics but Coefficients Don't Make Sense

What's Happening:

Statistical diagnostics pass
But business validation fails
Model is mathematically correct but practically wrong

Solution:

1. Review coefficient signs and magnitudes
2. Check for omitted variable bias
3. Test for specification errors
4. Validate with business stakeholders
5. Add constraints or priors (Bayesian)
6. Split analysis by time period

Expected Outcome: Coefficients align with business intuition

Scenario 5: Everything Passes but R² is Only 0.55

What's Happening:

No technical problems
But explanatory power is limited
Missing important drivers

Solution:

1. Brainstorm missing variables with stakeholders
2. Add competitor activity data if available
3. Include more granular seasonality
4. Add promotional flags
5. Include macro-economic indicators
6. Test interaction terms

Expected Outcome: R² increases to the 0.65-0.75 range

Detailed Solution Recipes

Recipe 1: Fixing Non-Normality

Step-by-step:

Identify skewness direction
- Positive skew → Try log transformation
- Negative skew → Uncommon, check for issues
Transform KPI
- In Variable Engineering: Create log(KPI)
- Re-fit model with log(KPI) as dependent variable
Interpret in new scale
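- Coefficients now represent the approximate proportional (%) change in KPI per unit change in the predictor
- Coefficients are elasticities only when the predictor is also log-transformed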
Verify improvement
- Run normality tests again
- Check Q-Q plot
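
The following sketch shows the skewness check and log transformation from the steps above on a made-up KPI series. The data, the `skewness` helper, and the coefficient value 0.05 are illustrative assumptions, not output from any real model.

```python
import math


def skewness(values):
    """Sample skewness: third central moment divided by variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2**1.5


# Hypothetical right-skewed weekly sales with one very large week
kpi = [100, 110, 120, 130, 150, 180, 250, 400, 900]
log_kpi = [math.log(v) for v in kpi]

print(round(skewness(kpi), 2), round(skewness(log_kpi), 2))  # skew shrinks after log

# Reading a coefficient on the log scale: beta = 0.05 on an untransformed
# predictor means roughly a 5.1% change in KPI per unit of that predictor.
beta = 0.05
pct_effect = 100 * (math.exp(beta) - 1)
```

The skewness is computed on the KPI here for simplicity; in practice the normality tests are run on the model residuals after re-fitting.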

Alternative if log doesn't work:

Square root transformation
Box-Cox transformation
Add outlier dummy variables

Recipe 2: Fixing Autocorrelation

Step-by-step:

Increase adstock decay
- TV: Try 70-80% (if currently 50-60%)
- Radio: Try 60-70%
- Digital: Try 40-50%
Add time structure
- Create monthly dummy variables
- Add linear time trend
- Include quarterly seasonality
Test lagged KPI
- Create lag_1_KPI variable
- Add to model
- Check whether Breusch-Godfrey improves (DW is biased toward 2 once a lagged KPI is in the model)
Verify improvement
- DW should be 1.7-2.3
- ACF plot should show no patterns
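
For reference, the Durbin-Watson statistic used in the verification step can be computed directly from the residuals. The sketch below uses two made-up residual series and the 1.5 / 2.5 symptom thresholds given earlier; the helper names are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e**2 for e in residuals)
    return numerator / denominator


def autocorrelation_flag(dw, low=1.5, high=2.5):
    """Classify a DW value using the symptom thresholds from this guide."""
    if dw < low:
        return "positive autocorrelation suspected"
    if dw > high:
        return "negative autocorrelation suspected"
    return "no strong first-order autocorrelation"


# Residuals that drift slowly (persistent) vs residuals that flip sign every week
persistent = [1.0, 0.9, 0.8, 0.6, 0.3, -0.1, -0.5, -0.8, -0.9, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(autocorrelation_flag(dw_persistent))   # positive autocorrelation suspected
print(autocorrelation_flag(dw_alternating))  # negative autocorrelation suspected
```

Values near 2 indicate little first-order autocorrelation; the statistic says nothing about seasonal lags, which is why the ACF plot is checked as well.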

Alternative if still present:

Use autocorrelation-robust (HAC / Newey-West) standard errors
Try ARIMA error structure
Model in state-space form

Recipe 3: Fixing Multicollinearity

Step-by-step:

Identify problematic pairs
- Review VIF table
- Check correlation matrix
- Find absolute correlations > 0.80

Make strategic combinations

If TV VIF=12 and Display VIF=14:
- Create: Brand_Media = TV + Display
- Remove: TV and Display individually

If Search VIF=11 and Social VIF=10:
- Create: Performance_Media = Search + Social
- Remove: Search and Social individually

Re-fit and check
- All VIF should drop below 5
- R² may stay similar or improve
- Coefficients become interpretable
Adjust business interpretation
- Now analyzing combined effects
- Can decompose later if needed
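
The recipe above can be sketched with simulated data. Everything here is an assumption for illustration: the `vif` helper, the simulated channels, and the choice to sum TV and Display directly (which only makes sense when both are in the same units, such as spend).

```python
import numpy as np


def vif(X):
    """VIF per column: 1 / (1 - R^2) from regressing that column on all the others."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(target)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Simulated channels: Display is planned alongside TV, Search is independent
rng = np.random.default_rng(0)
tv = rng.uniform(50, 150, 104)
display = 0.5 * tv + rng.normal(0, 3, 104)
search = rng.uniform(20, 80, 104)

vif_before = vif(np.column_stack([tv, display, search]))

# Combine the correlated pair and drop the individual columns
brand_media = tv + display
vif_after = vif(np.column_stack([brand_media, search]))

print(vif_before.round(1))  # TV and Display well above 10, Search near 1
print(vif_after.round(2))   # both near 1
```

The combined variable yields one coefficient for the pair, so any split between TV and Display has to come from outside the regression.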

Alternative approaches:

Ridge regression (shrinks coefficients, which stabilizes estimates for correlated predictors)
Sequential model building
Principal component analysis

Recipe 4: Fixing Heteroscedasticity

Step-by-step:

Log transform KPI (most effective)
- Stabilizes variance
- Natural for marketing (multiplicative effects)
Check residual plot
- Should show constant spread
- No fan/funnel pattern
If log is insufficient:
- Transform predictors too
- Use weighted least squares
- Apply robust standard errors
Verify improvement
- Breusch-Pagan should pass
- Residual vs fitted shows random scatter
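
The effect of the log transformation on the Breusch-Pagan check can be sketched as follows. This uses the studentized (Koenker) form of the statistic, n times the R² from regressing squared residuals on the predictor, and deterministic toy data with multiplicative errors; the helper names and data are assumptions made for this example.

```python
import numpy as np


def ols_residuals(y, x):
    """Residuals from an OLS fit of y on an intercept and a single predictor x."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def breusch_pagan_lm(y, x):
    """Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on x."""
    resid_sq = ols_residuals(y, x) ** 2
    aux_resid = ols_residuals(resid_sq, x)
    r2 = 1.0 - aux_resid.var() / resid_sq.var()
    return len(y) * r2


# Toy data: multiplicative errors, so the spread grows with the KPI level
weeks = np.arange(120)
spend = np.linspace(0.0, 200.0, 120)
shock = 0.2 * (-1.0) ** weeks          # alternating +/-0.2 shock on the log scale
kpi = np.exp(4.0 + 0.005 * spend + shock)

CHI2_CRIT_1DF = 3.841                  # 5% critical value, chi-square with 1 df
lm_raw = breusch_pagan_lm(kpi, spend)
lm_log = breusch_pagan_lm(np.log(kpi), spend)
print(lm_raw > CHI2_CRIT_1DF)  # True: heteroscedasticity detected on the raw KPI
print(lm_log > CHI2_CRIT_1DF)  # False: variance is stable after the log transform
```

Real data will rarely be this clean; the point is only that a multiplicative error structure becomes additive, with constant variance, on the log scale.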

Alternative if transformation not desired:

Use heteroscedasticity-robust standard errors (HC3)
Weighted least squares
Generalized least squares
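
If the KPI is kept on its original scale, HC3 standard errors can be computed from the usual sandwich formula. The sketch below is written in plain NumPy so each step is visible; the function name `ols_with_hc3` and the toy data are assumptions. In practice a library routine would normally be used, for example fitting with `cov_type="HC3"` in statsmodels.

```python
import numpy as np


def ols_with_hc3(x, y):
    """OLS coefficients with classical and HC3 (heteroscedasticity-robust) std errors."""
    design = np.column_stack([np.ones(len(y)), x])
    n, k = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    # Leverage h_i = x_i' (X'X)^-1 x_i for each observation
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    # HC3 weights: squared residuals inflated by 1 / (1 - h_i)^2
    omega = (resid / (1.0 - leverage)) ** 2
    cov_hc3 = xtx_inv @ (design.T * omega) @ design @ xtx_inv
    cov_classical = xtx_inv * (resid @ resid) / (n - k)
    return beta, np.sqrt(np.diag(cov_classical)), np.sqrt(np.diag(cov_hc3))


# Toy data: error size is proportional to the KPI level (true slope is 2)
weeks = np.arange(120)
spend = np.linspace(0.0, 200.0, 120)
kpi = (50.0 + 2.0 * spend) * (1.0 + 0.2 * (-1.0) ** weeks)

beta, se_classical, se_hc3 = ols_with_hc3(spend, kpi)
print(beta.round(3), se_classical.round(4), se_hc3.round(4))
```

The coefficients are unchanged; only the standard errors, and therefore the significance tests, are adjusted.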

Recipe 5: Addressing Influential Points

Step-by-step:

Identify influential observations
- Flag observations with Cook's D > 4/n (n = number of observations)
- Review business context for those weeks
Investigate each point
- Data error? → Correct or remove
- Special event? → Add dummy variable
- Legitimate extreme? → Keep but document

Add event dummies

Create variables:
- Black_Friday_Week
- COVID_Lockdown
- Product_Launch_Week

Re-fit model
- Influential points should decrease
- R² may improve
- Results more stable
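
A minimal sketch of this recipe on simulated data: compute Cook's D, flag weeks above 4/n, then add an event dummy for the flagged week. The data, the week index 47, and the helper name `cooks_distance` are hypothetical.

```python
import numpy as np


def cooks_distance(x, y):
    """Cook's D for each observation in an OLS fit of y on an intercept and x."""
    design = np.column_stack([np.ones(len(y)), x])
    n, k = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    resid = y - design @ (xtx_inv @ design.T @ y)
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    mse = resid @ resid / (n - k)
    return resid**2 * leverage / (k * mse * (1.0 - leverage) ** 2)


# One year of weekly data with a large spike in week 47 (true slope is 1.5)
weeks = np.arange(52)
spend = np.linspace(50.0, 150.0, 52)
kpi = 200.0 + 1.5 * spend + 5.0 * np.sin(weeks)
kpi[47] += 300.0                       # hypothetical Black Friday uplift

cooks_d = cooks_distance(spend, kpi)
flagged = np.where(cooks_d > 4 / len(kpi))[0]
print(flagged)                         # week 47 is flagged

# Add an event dummy instead of deleting the week, then compare spend slopes
black_friday_week = (weeks == 47).astype(float)
naive_slope = np.polyfit(spend, kpi, 1)[0]
design = np.column_stack([np.ones(52), spend, black_friday_week])
coef, *_ = np.linalg.lstsq(design, kpi, rcond=None)
dummy_slope, event_uplift = coef[1], coef[2]
print(round(naive_slope, 2), round(dummy_slope, 2), round(event_uplift, 1))
```

With the dummy in place the spend slope moves back toward its true value and the event's uplift is estimated separately. Note that a dummy covering a single week fits that week exactly, so Cook's D is undefined for it after the re-fit.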

When to remove observations:

Confirmed data errors
Duplicates
Impossible values

When to keep:

Real business variation
Important historical events
Legitimate outliers

Prevention Strategies

Avoid problems before they occur:

Data Preparation

Clean data thoroughly before modeling
Check for outliers and errors
Ensure sufficient time series length (52+ weeks)
Verify spend data matches business records

Variable Selection

Start with theoretically important variables
Don't include highly correlated channels together initially
Apply appropriate transformations (log, adstock, saturation)
Test variables before adding to final model

Model Building

Build incrementally (don't add all variables at once)
Run diagnostics after each major change
Keep models parsimonious
Validate with business stakeholders throughout

Documentation

Track all changes and why they were made
Document diagnostic results at each iteration
Maintain rationale for variable inclusion/exclusion
Record business validation discussions

When to Seek Help

Some situations require additional expertise:

Complex Issues:

Multiple severe violations that don't improve
Contradictory diagnostic results
Business requirements conflict with statistical best practices

Advanced Techniques Needed:

State-space models for complex dynamics
Bayesian methods with informative priors
Panel data or hierarchical structures

Domain Expertise:

Industry-specific modeling approaches
Regulatory or compliance requirements
Advanced causal inference methods

Diagnostic Troubleshooting Checklist

Use this checklist to systematically work through issues:

[ ] Identified all failed tests and severity
[ ] Prioritized issues (critical → important → minor)
[ ] Diagnosed root causes for each failure
[ ] Attempted primary solution for each issue
[ ] Re-ran diagnostics after each change
[ ] Documented all changes and rationale
[ ] Verified business validity of results
[ ] Communicated remaining limitations
[ ] Obtained stakeholder sign-off
[ ] Saved final diagnostic reports

Summary: Most Effective Solutions

Based on common MMM diagnostic issues:

Problem → Solution:

| Issue | Most Effective Fix | Success Rate |
|---|---|---|
| Non-normality | Log transform KPI | 85% |
| Autocorrelation | Increase adstock + add lag | 75% |
| Heteroscedasticity | Log transform KPI | 80% |
| Multicollinearity | Combine correlated channels | 90% |
| Low R² | Add missing variables | 70% |
| Influential points | Add event dummies | 65% |

Remember: Most diagnostic issues in MMM are solved by proper transformations (especially log) and ensuring all relevant business drivers are included.

