Heteroscedasticity
What Heteroscedasticity Tests Check
Heteroscedasticity tests examine whether the variance of model residuals is constant across all levels of the independent variables (homoscedasticity) or varies systematically (heteroscedasticity).
Purpose: Checks whether error variance is constant across fitted values, a condition that prediction intervals and hypothesis tests depend on for accuracy.
Why Constant Variance Matters
When residuals have constant variance (homoscedasticity):
Standard Errors are Correct: Coefficient standard errors accurately reflect uncertainty
Confidence Intervals are Valid: Intervals have correct coverage probabilities
Hypothesis Tests are Reliable: P-values and significance tests are trustworthy
Estimates are Efficient: Ordinary Least Squares (OLS) is the best linear unbiased estimator, so its coefficient estimates have the lowest variance among linear unbiased estimators
Heteroscedasticity doesn't bias coefficient estimates, but it makes OLS less efficient and the usual standard errors incorrect, leading to invalid confidence intervals and hypothesis tests.
Statistical Tests Available
MixModeler provides three heteroscedasticity tests:
Breusch-Pagan Test
Tests for linear relationship between squared residuals and predictors
p < 0.05 indicates heteroscedasticity
White Test
More general test for any form of heteroscedasticity
p < 0.05 indicates heteroscedasticity
Goldfeld-Quandt Test
Tests whether variance differs across subsamples
p < 0.05 indicates heteroscedasticity
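Interpretation: Each test takes constant variance as its null hypothesis. If p ≥ 0.05, no significant heteroscedasticity is detected. If p < 0.05, the test rejects constant variance and heteroscedasticity is likely present.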
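As an illustration of how a test of this kind works, the sketch below implements the studentized (Koenker) form of the Breusch-Pagan test for a single predictor using only the Python standard library. It is not MixModeler's implementation; the function names and the toy data are assumptions made for the example. The statistic is n times the R² from regressing squared residuals on the predictor, compared with a chi-square distribution with one degree of freedom.

```python
import math

def ols_simple(x, y):
    """Fit y = a + b*x by OLS and return (a, b, residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    return a, b, [yi - (a + b * xi) for xi, yi in zip(x, y)]

def breusch_pagan_simple(x, y):
    """Return (lm_statistic, p_value) for a one-predictor model."""
    _, _, resid = ols_simple(x, y)
    squared = [e ** 2 for e in resid]
    # Auxiliary regression: squared residuals on the predictor.
    _, _, aux_resid = ols_simple(x, squared)
    mean_sq = sum(squared) / len(squared)
    ss_tot = sum((s - mean_sq) ** 2 for s in squared)
    r2 = 1.0 - sum(e ** 2 for e in aux_resid) / ss_tot
    lm = max(0.0, len(x) * r2)  # guard against tiny negative rounding error
    # Chi-square survival function for 1 degree of freedom.
    return lm, math.erfc(math.sqrt(lm / 2.0))

# Toy data (hypothetical): the error size grows with x in the first series
# and stays constant in the second.
x = [float(i) for i in range(1, 21)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 21)]
y_fan = [3.0 + 2.0 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]
y_flat = [3.0 + 2.0 * xi + 0.5 * s for xi, s in zip(x, signs)]

lm_fan, p_fan = breusch_pagan_simple(x, y_fan)
lm_flat, p_flat = breusch_pagan_simple(x, y_flat)
print(f"fan-shaped errors: LM={lm_fan:.2f}, p={p_fan:.4f}")
print(f"constant errors:   LM={lm_flat:.2f}, p={p_flat:.4f}")
```

With more than one predictor, the auxiliary regression uses all predictors and the degrees of freedom equal the number of predictors.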
Visual Diagnostics
MixModeler provides key visualizations for detecting heteroscedasticity:
Residuals vs Fitted Values Plot:
- Plots residuals against predicted values 
- Good: Random scatter around zero with consistent spread (no pattern) 
- Problem: Fan or funnel shape (variance increases or decreases with fitted values)
- Problem: Curved pattern (indicates non-linear relationship) 
Scale-Location Plot:
- Plots square root of absolute standardized residuals against fitted values 
- Good: Horizontal line with random scatter 
- Problem: Upward or downward trend indicates changing variance 
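To make the Scale-Location plot concrete, here is a sketch of how its points could be computed for a one-predictor model: each residual is divided by its estimated standard deviation (using the leverage of the observation), and the square root of the absolute value is paired with the fitted value. The helper name and toy data are assumptions for illustration, not MixModeler internals.

```python
import math

def scale_location_points(x, y):
    """Return [(fitted, sqrt(|standardized residual|)), ...] for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sigma = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))  # residual std. error
    points = []
    for xi, fi, ei in zip(x, fitted, resid):
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        standardized = ei / (sigma * math.sqrt(1.0 - leverage))
        points.append((fi, math.sqrt(abs(standardized))))
    return points

# Toy data (hypothetical) whose error size grows with x.
x = [float(i) for i in range(1, 21)]
y = [3.0 + 2.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi)
     for i, xi in enumerate(x, start=1)]

points = sorted(scale_location_points(x, y))  # order by fitted value
half = len(points) // 2
low_mean = sum(v for _, v in points[:half]) / half
high_mean = sum(v for _, v in points[half:]) / (len(points) - half)
print(f"mean sqrt|std. resid|: low fitted={low_mean:.2f}, high fitted={high_mean:.2f}")
```

A clearly higher average in the upper half of fitted values corresponds to the upward trend described above.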
Common Patterns
Fan Shape (Most Common):
- Variance increases as predicted values increase 
- Often seen when predicting sales (larger values have more variability) 
Funnel Shape:
- Variance decreases as predicted values increase 
- Less common but still problematic 
Grouped Heteroscedasticity:
- Different variance for different categories or time periods 
- Suggests need for categorical variables or interactions 
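Grouped heteroscedasticity can be checked informally by comparing residual variance across groups, which is also the idea behind subsample tests such as Goldfeld-Quandt. The sketch below uses made-up residuals and hypothetical group labels; it reports only the variance ratio and does not compute a p-value.

```python
from collections import defaultdict
from statistics import variance

def residual_variance_by_group(residuals, groups):
    """Return {group: sample variance of that group's residuals}."""
    buckets = defaultdict(list)
    for resid, group in zip(residuals, groups):
        buckets[group].append(resid)
    return {group: variance(values) for group, values in buckets.items()}

# Made-up residuals: four regular weeks followed by four promotional weeks.
residuals = [0.5, -0.4, 0.3, -0.6, 2.5, -3.0, 2.8, -2.2]
groups = ["regular"] * 4 + ["promo"] * 4

by_group = residual_variance_by_group(residuals, groups)
ratio = max(by_group.values()) / min(by_group.values())
for group, var in by_group.items():
    print(f"{group}: residual variance = {var:.3f}")
print(f"largest / smallest variance ratio = {ratio:.1f}")
```

A ratio far above 1 suggests that the groups need separate treatment, for example through a categorical variable or interaction as noted above.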
Interpreting Test Results
Passed Tests (✓)
What it means: No significant heteroscedasticity detected (p ≥ 0.05)
Implications:
- Variance of errors is reasonably constant 
- Standard errors are reliable 
- Confidence intervals and p-values are valid 
Action: No action needed - homoscedasticity assumption is satisfied
Failed Tests (⚠)
What it means: Heteroscedasticity detected (p < 0.05)
Implications:
- Standard errors may be incorrect (typically too small) 
- Confidence intervals may have wrong coverage 
- Hypothesis tests may be unreliable 
- Note: Coefficient estimates remain unbiased 
Common Causes:
- Dependent variable has increasing/decreasing variance 
- Important variables omitted 
- Wrong functional form (need transformations) 
- Outliers affecting variance 
- Natural heteroscedasticity in the data-generating process 
What to Do When Tests Fail
If heteroscedasticity tests fail, try these solutions in order:
1. Transform the Dependent Variable (Most Effective)
- Log transformation: Reduces right skewness and stabilizes variance 
  - Good for: Sales, revenue, spend data 
  - Effect: Multiplicative relationships become additive 
- Square root transformation: Moderate variance stabilization 
- Inverse transformation: For highly skewed data 
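A small numeric sketch of why the log transformation stabilizes variance when errors are multiplicative: a ±10% swing produces a spread that grows with the level on the raw scale but is the same at every level on the log scale. The levels and the 10% figure are arbitrary illustrative values.

```python
import math

levels = [100.0, 1_000.0, 10_000.0]   # hypothetical KPI levels
swing = 0.10                          # assumed +/-10% multiplicative error

raw_spreads, log_spreads = [], []
for level in levels:
    low, high = level * (1 - swing), level * (1 + swing)
    raw_spreads.append(high - low)                      # grows with the level
    log_spreads.append(math.log(high) - math.log(low))  # independent of the level

for level, raw, logged in zip(levels, raw_spreads, log_spreads):
    print(f"level={level:>8.0f}  raw spread={raw:>7.1f}  log spread={logged:.4f}")
```

Note that the log transformation requires strictly positive values.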
2. Use Weighted Least Squares (WLS)
- Give less weight to observations with higher variance 
- Requires estimating variance function 
- Available in advanced statistical software 
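A minimal sketch of weighted least squares for a single predictor, assuming the error standard deviation is proportional to x so that the weights are 1/x². In practice the variance function has to be estimated; the proportionality assumption, function name, and toy data are illustrative only.

```python
def wls_simple(x, y, weights):
    """Fit y = a + b*x by minimizing sum(w_i * (y_i - a - b*x_i)**2)."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Toy data (hypothetical): true line y = 3 + 2x, error size proportional to x.
x = [float(i) for i in range(1, 21)]
y = [3.0 + 2.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi)
     for i, xi in enumerate(x, start=1)]

# Weight = 1 / variance, up to a constant; variance is assumed to scale with x**2.
weights = [1.0 / xi ** 2 for xi in x]
a_wls, b_wls = wls_simple(x, y, weights)
a_ols, b_ols = wls_simple(x, y, [1.0] * len(x))  # equal weights reduce to OLS
print(f"OLS: intercept={a_ols:.3f}, slope={b_ols:.3f}")
print(f"WLS: intercept={a_wls:.3f}, slope={b_wls:.3f}")
```

Observations with large x, where the assumed variance is highest, contribute least to the weighted fit.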
3. Use Robust Standard Errors
- Calculate heteroscedasticity-consistent (HC) standard errors 
- Corrects standard errors without changing coefficients 
- Common variants: HC0, HC1, HC2, HC3 
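For a one-predictor model the HC0 and HC1 estimators have a simple closed form, sketched below next to the classical standard error. HC0 weights each squared residual by the squared distance of x from its mean; HC1 applies the degrees-of-freedom correction n/(n-k) with k = 2 parameters. The function name and toy data are assumptions; HC2 and HC3 additionally adjust for leverage and are not shown.

```python
import math

def slope_standard_errors(x, y):
    """Return (slope, classical_se, hc0_se, hc1_se) for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical formula: one common residual variance for all observations.
    classical = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)
    # HC0 "sandwich": residual variance is allowed to differ by observation.
    hc0_var = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    hc1_var = hc0_var * n / (n - 2)  # small-sample correction with k = 2
    return b, classical, math.sqrt(hc0_var), math.sqrt(hc1_var)

# Toy data (hypothetical) whose error size grows with x.
x = [float(i) for i in range(1, 21)]
y = [3.0 + 2.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi)
     for i, xi in enumerate(x, start=1)]

slope, classical_se, hc0_se, hc1_se = slope_standard_errors(x, y)
print(f"slope={slope:.3f}")
print(f"classical SE={classical_se:.4f}, HC0 SE={hc0_se:.4f}, HC1 SE={hc1_se:.4f}")
```

The slope is identical in all three cases; only the reported uncertainty changes.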
4. Add Omitted Variables
- Include variables that explain variance patterns 
- Add interaction terms 
- Consider non-linear transformations 
5. Check for Outliers
- Review Influential Points diagnostic 
- Outliers can create false heteroscedasticity patterns 
- Consider robust regression methods 
6. When Heteroscedasticity is Acceptable
- Heteroscedasticity is mild and the sample size is large 
- The focus is on coefficient estimates rather than inference 
- Robust standard errors are already being used 
- Business decisions are not sensitive to exact confidence intervals 
Practical Guidelines
Acceptable Scenarios:
- Mild heteroscedasticity (for example, p-value between 0.01 and 0.05) 
- Large datasets where robust standard errors are used 
- Point predictions are the primary goal (coefficients remain unbiased, though prediction intervals may be miscalibrated) 
- Natural variation in business data 
Critical Issues:
- Severe heteroscedasticity (p < 0.001) 
- Clear fan or funnel pattern in residual plots 
- Small sample sizes requiring precise inference 
- The model is used to construct confidence intervals 
Example Interpretation
Scenario 1 - Passed:
- Breusch-Pagan p-value: 0.28 
- White test p-value: 0.42 
- Residual plot shows random scatter with consistent spread 
Interpretation: No significant heteroscedasticity detected. The constant variance assumption is satisfied, and standard errors are reliable.
Scenario 2 - Failed:
- Breusch-Pagan p-value: 0.008 
- White test p-value: 0.003 
- Residual plot shows clear fan shape (increasing variance) 
Interpretation: Heteroscedasticity detected. Variance increases with fitted values. Consider log-transforming the KPI or using robust standard errors. If focused on coefficient interpretation rather than p-values, this may be acceptable with appropriate caveats.
Scenario 3 - Severe:
- Breusch-Pagan p-value: < 0.001 
- White test p-value: < 0.001 
- Strong funnel pattern with extreme variance differences 
Interpretation: Severe heteroscedasticity. Model requires transformation. Try log-transforming the dependent variable before proceeding with inference or decision-making.
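The thresholds used in these scenarios can be written as a small helper. This is a hypothetical sketch, not a MixModeler function; taking the smallest p-value across tests as the deciding one is an assumption made here, and the 0.0005 value stands in for a reported "< 0.001".

```python
def classify_heteroscedasticity(p_values, alpha=0.05, severe=0.001):
    """Map test p-values to 'passed', 'failed', or 'severe'."""
    worst = min(p_values)  # assumption: the strongest rejection decides
    if worst < severe:
        return "severe"
    if worst < alpha:
        return "failed"
    return "passed"

scenarios = {
    "Scenario 1": [0.28, 0.42],
    "Scenario 2": [0.008, 0.003],
    "Scenario 3": [0.0005, 0.0005],  # placeholder for p < 0.001
}
results = {name: classify_heteroscedasticity(p) for name, p in scenarios.items()}
for name, verdict in results.items():
    print(name, "->", verdict)
```

A p-value measures the strength of evidence against constant variance rather than the size of the variance change, so the residual plots should still be checked alongside any such label.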
Marketing Mix Modeling Context
In MMM, heteroscedasticity often appears because:
Sales Variability: Larger sales periods naturally have higher variance
Promotional Effects: Promotions create volatility in certain periods
Seasonal Patterns: Different variance across seasons
Spend Ranges: Different variance at low vs. high spend levels
Log transformation of the KPI is particularly effective in MMM as it:
- Stabilizes variance 
- Models effects as multiplicative on the original scale (natural for marketing effects) 
- Reduces influence of outliers 
- Provides coefficients interpretable as elasticities when the predictors are also log-transformed 
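The elasticity interpretation can be checked on synthetic data: if sales follow a power law in spend, regressing log(sales) on log(spend) recovers the exponent. The 0.3 elasticity, the scale factor, and the variable names are invented for the example, and the data are noise-free.

```python
import math

def ols_slope(x, y):
    """Slope of the OLS line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

spend = [10.0, 20.0, 40.0, 80.0, 160.0]
sales = [50.0 * s ** 0.3 for s in spend]  # synthetic: sales = 50 * spend^0.3

log_spend = [math.log(s) for s in spend]
log_sales = [math.log(v) for v in sales]
elasticity = ols_slope(log_spend, log_sales)
print(f"estimated elasticity: {elasticity:.3f}")
```

In this log-log form, a 1% increase in spend corresponds to roughly a 0.3% increase in sales.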
Relationship to Other Assumptions
Heteroscedasticity often co-occurs with:
Non-normality: Changing variance can cause residuals to deviate from normality
Outliers: Extreme values can create both heteroscedasticity and influential points
Non-linearity: Wrong functional form can manifest as heteroscedasticity
Related Diagnostics
After reviewing heteroscedasticity:
- Check Residual Normality as heteroscedasticity can affect normality 
- Review Influential Points to identify outliers causing variance issues 
- Examine Actual vs Predicted for systematic patterns 