Heteroscedasticity
What Heteroscedasticity Tests Check
Heteroscedasticity tests examine whether the variance of model residuals is constant across all levels of the independent variables (homoscedasticity) or varies systematically (heteroscedasticity).
Purpose: Tests whether the error variance is constant across fitted values, so that standard errors, hypothesis tests, and prediction intervals can be trusted.
Why Constant Variance Matters
When residuals have constant variance (homoscedasticity):
Standard Errors are Correct: Coefficient standard errors accurately reflect uncertainty
Confidence Intervals are Valid: Intervals have correct coverage probabilities
Hypothesis Tests are Reliable: P-values and significance tests are trustworthy
Estimates are Efficient: Ordinary Least Squares (OLS) has the smallest variance among linear unbiased estimators, provided the other OLS assumptions also hold
Heteroscedasticity doesn't bias coefficient estimates, but it makes the usual OLS standard errors incorrect, which in turn invalidates confidence intervals and hypothesis tests. OLS is also no longer the most efficient estimator.
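The effect on standard errors follows from the standard textbook variance of the OLS estimator. The notation below is assumed for illustration and is not taken from MixModeler: X is the design matrix, y the dependent variable, and Ω = diag(σ_1², …, σ_n²) the covariance matrix of independent errors.

```latex
\hat{\beta} = (X^\top X)^{-1} X^\top y,
\qquad
\mathbb{E}[\hat{\beta} \mid X] = \beta

\operatorname{Var}(\hat{\beta} \mid X)
  = (X^\top X)^{-1} \, X^\top \Omega X \, (X^\top X)^{-1}

\Omega = \sigma^2 I
\;\Longrightarrow\;
\operatorname{Var}(\hat{\beta} \mid X) = \sigma^2 (X^\top X)^{-1}
```

The expectation does not involve Ω, which is why the coefficients stay unbiased. The usual standard errors rely on the last line, which holds only when every σ_i² equals the same σ².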
Statistical Tests Available
MixModeler provides three heteroscedasticity tests:
Breusch-Pagan Test
Tests for linear relationship between squared residuals and predictors
p < 0.05 indicates heteroscedasticity
White Test
More general test for any form of heteroscedasticity
p < 0.05 indicates heteroscedasticity
Goldfeld-Quandt Test
Tests whether residual variance differs between two subsamples of the data (for example, observations with low vs. high values of a predictor)
p < 0.05 indicates heteroscedasticity
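Interpretation: If p-value ≥ 0.05, the test finds no significant evidence of heteroscedasticity (this does not prove that the variance is constant). If p < 0.05, the test rejects constant variance and heteroscedasticity is treated as present.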
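As a rough illustration of what a Breusch-Pagan-style test computes, the sketch below regresses squared OLS residuals on the predictor and uses n × R² as the test statistic (the studentized variant). This is not MixModeler's implementation. The simulated data, the function names, and the restriction to a single predictor (so that the chi-square reference has 1 degree of freedom) are all assumptions made for this example.

```python
import math
import numpy as np

def ols_residuals(X, y):
    """Return OLS residuals for design matrix X (constant column included)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def breusch_pagan(X, resid):
    """Studentized Breusch-Pagan LM statistic: n * R^2 of the auxiliary
    regression of squared residuals on the predictors in X."""
    u2 = resid ** 2
    aux = ols_residuals(X, u2)
    r2 = 1.0 - (aux @ aux) / np.sum((u2 - u2.mean()) ** 2)
    return max(len(u2) * r2, 0.0)

def chi2_sf_1df(stat):
    """P(chi-square with 1 df > stat). Valid only with one non-constant predictor."""
    return math.erfc(math.sqrt(stat / 2.0))

# Simulated data (hypothetical): same mean structure, different error structure.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y_const = 2.0 + 3.0 * x + rng.normal(0.0, 2.0, size=n)  # constant error spread
y_fan = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)         # spread grows with x

lm_const = breusch_pagan(X, ols_residuals(X, y_const))
lm_fan = breusch_pagan(X, ols_residuals(X, y_fan))
p_const, p_fan = chi2_sf_1df(lm_const), chi2_sf_1df(lm_fan)

print(f"constant variance: LM={lm_const:.2f}, p={p_const:.3f}")
print(f"fan-shaped errors: LM={lm_fan:.2f}, p={p_fan:.3g}")
```

For real work with several predictors, a library routine is a better choice than this hand-rolled version. In Python, statsmodels provides het_breuschpagan, het_white, and het_goldfeldquandt in statsmodels.stats.diagnostic.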
Visual Diagnostics
MixModeler provides key visualizations for detecting heteroscedasticity:
Residuals vs Fitted Values Plot:
Plots residuals against predicted values
Good: Random scatter around zero with consistent spread (no pattern)
Problem: Fan or funnel shape (spread of residuals increases or decreases with fitted values)
Problem: Curved pattern (indicates non-linear relationship)
Scale-Location Plot:
Plots square root of absolute standardized residuals against fitted values
Good: Horizontal line with random scatter
Problem: Upward or downward trend indicates changing variance
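The quantities behind both plots can be computed directly from an OLS fit. The sketch below is an illustration under assumptions: the data are simulated, the function name scale_location is hypothetical, and residuals are standardized with the usual leverage correction. It returns the x and y values of a Scale-Location plot and summarizes the trend with a correlation instead of drawing the plot.

```python
import numpy as np

def scale_location(X, y):
    """Return fitted values and sqrt(|standardized residuals|) for an OLS fit."""
    n, k = X.shape
    Q, R = np.linalg.qr(X)                    # reduced QR of the design matrix
    beta = np.linalg.solve(R, Q.T @ y)
    fitted = X @ beta
    resid = y - fitted
    leverage = np.sum(Q ** 2, axis=1)         # diagonal of the hat matrix
    sigma = np.sqrt(resid @ resid / (n - k))  # residual standard error
    standardized = resid / (sigma * np.sqrt(1.0 - leverage))
    return fitted, np.sqrt(np.abs(standardized))

# Simulated fan-shaped data (hypothetical): error spread grows with x.
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

fitted, spread = scale_location(X, y)
trend = np.corrcoef(fitted, spread)[0, 1]
print(f"correlation between fitted values and spread: {trend:.2f}")
```

A correlation near zero corresponds to the flat, "good" pattern. A clearly positive or negative value corresponds to the upward or downward trend described above. The two returned arrays can be passed to any plotting library to draw the plot itself.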
Common Patterns
Fan Shape (Most Common):
Variance increases as predicted values increase
Often seen when predicting sales (larger values have more variability)
Funnel Shape:
Variance decreases as predicted values increase
Less common but still problematic
Grouped Heteroscedasticity:
Different variance for different categories or time periods
Suggests a need for categorical variables or interaction terms
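One simple way to put a number on these patterns is a Goldfeld-Quandt-style variance ratio: sort the observations, drop a central band, fit OLS separately to the lower and upper segments, and compare residual variances. The sketch below uses simulated data and a hypothetical function name. It reports only the ratio, since the p-value would additionally require an F distribution.

```python
import numpy as np

def goldfeld_quandt_ratio(X, y, order_by, drop_frac=0.2):
    """Residual variance of the upper segment divided by that of the lower
    segment, after sorting by `order_by` and dropping the central rows."""
    idx = np.argsort(order_by)
    X, y = X[idx], y[idx]
    n, k = X.shape
    seg = (n - int(n * drop_frac)) // 2   # rows in each outer segment

    def resid_var(Xs, ys):
        beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        r = ys - Xs @ beta
        return (r @ r) / (len(ys) - k)

    return resid_var(X[n - seg:], y[n - seg:]) / resid_var(X[:seg], y[:seg])

# Simulated data (hypothetical): one fan-shaped and one funnel-shaped series.
rng = np.random.default_rng(2)
n = 400
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y_fan = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)              # spread grows
y_funnel = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * (11.0 - x))  # spread shrinks

ratio_fan = goldfeld_quandt_ratio(X, y_fan, order_by=x)
ratio_funnel = goldfeld_quandt_ratio(X, y_funnel, order_by=x)
print(f"fan: {ratio_fan:.2f}, funnel: {ratio_funnel:.2f}")
```

A ratio well above 1 matches the fan pattern, and a ratio well below 1 matches the funnel pattern. For grouped heteroscedasticity, the same comparison can be made between categories or time periods instead of sorted segments.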
Interpreting Test Results
Passed Tests (✓)
What it means: No significant heteroscedasticity detected (p ≥ 0.05)
Implications:
Variance of errors is reasonably constant
Standard errors are reliable
Confidence intervals and p-values are valid
Action: No action needed - homoscedasticity assumption is satisfied
Failed Tests (⚠)
What it means: Heteroscedasticity detected (p < 0.05)
Implications:
Standard errors may be incorrect (typically too small)
Confidence intervals may have wrong coverage
Hypothesis tests may be unreliable
Note: Coefficient estimates remain unbiased
Common Causes:
Variability of the dependent variable grows or shrinks with its level
Important variables omitted
Wrong functional form (need transformations)
Outliers affecting variance
Natural heteroscedasticity in the data-generating process
What to Do When Tests Fail
If heteroscedasticity tests fail, try these solutions in order:
1. Transform the Dependent Variable (Most Effective)
Log transformation: Reduces right skewness and stabilizes variance
Good for: Sales, revenue, spend data
Effect: Multiplicative relationships become additive
Square root transformation: Moderate variance stabilization
Inverse transformation: For highly skewed data
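The variance-stabilizing effect of a log transformation is easiest to see on data with multiplicative errors. In the sketch below the data-generating process and the spread_ratio helper are assumptions chosen for illustration. The helper compares the residual standard deviation in the upper half of the predictor range with that in the lower half, before and after taking logs of the dependent variable.

```python
import numpy as np

def spread_ratio(x, y):
    """SD of OLS residuals where x is above its median, divided by the SD
    of residuals where x is at or below its median."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    upper = x > np.median(x)
    return resid[upper].std() / resid[~upper].std()

# Simulated data (hypothetical): errors multiply the mean instead of adding to it.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 2.0, size=500)
y = np.exp(1.0 + 1.0 * x + rng.normal(0.0, 0.3, size=500))

ratio_raw = spread_ratio(x, y)          # model fitted to y
ratio_log = spread_ratio(x, np.log(y))  # model fitted to log(y)
print(f"raw scale: {ratio_raw:.2f}, log scale: {ratio_log:.2f}")
```

On the raw scale the residual spread grows with x. On the log scale the multiplicative error becomes additive and the ratio falls close to 1. The log requires a strictly positive dependent variable.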
2. Use Weighted Least Squares (WLS)
Give less weight to observations with higher variance
Requires estimating variance function
Available in general-purpose statistical software such as R or Python's statsmodels
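A minimal WLS sketch follows. It assumes the variance function is already known: error variance proportional to x², so the weights are 1/x². In practice that function has to be estimated. The wls helper and the simulated data are hypothetical.

```python
import numpy as np

def wls(X, y, weights):
    """Weighted least squares: minimize sum(weights * residual**2).
    Equivalent to OLS after scaling each row by sqrt(weight)."""
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Simulated data (hypothetical): true intercept 2, true slope 3,
# error standard deviation proportional to x.
rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

# Assumed variance function: Var(error) proportional to x**2.
weights = 1.0 / x ** 2
beta_wls = wls(X, y, weights)
beta_ols = wls(X, y, np.ones(n))  # equal weights reproduce plain OLS

print("OLS:", np.round(beta_ols, 3))
print("WLS:", np.round(beta_wls, 3))
```

Both estimators target the same coefficients. WLS gains efficiency by down-weighting the noisy observations, and its standard errors are valid only if the weights reflect the true variance pattern. In statsmodels the equivalent model class is statsmodels.api.WLS.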
3. Use Robust Standard Errors
Calculate heteroscedasticity-consistent (HC) standard errors
Corrects standard errors without changing coefficients
Common variants: HC0, HC1, HC2, HC3
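The HC variants differ only in how each squared residual is rescaled before it enters the "sandwich" covariance formula. The sketch below follows the standard textbook definitions and is not MixModeler's code. The function name and the simulated data are assumptions.

```python
import numpy as np

def robust_standard_errors(X, y):
    """OLS coefficients plus classical and HC0-HC3 standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Diagonal of the hat matrix X (X'X)^-1 X'
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    def sandwich(omega):
        # (X'X)^-1 X' diag(omega) X (X'X)^-1
        meat = X.T @ (X * omega[:, None])
        return np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

    e2 = resid ** 2
    return beta, {
        "classical": np.sqrt(np.diag(XtX_inv) * e2.sum() / (n - k)),
        "HC0": sandwich(e2),                          # raw squared residuals
        "HC1": sandwich(e2 * n / (n - k)),            # degrees-of-freedom fix
        "HC2": sandwich(e2 / (1.0 - leverage)),       # leverage-adjusted
        "HC3": sandwich(e2 / (1.0 - leverage) ** 2),  # stronger adjustment
    }

# Simulated fan-shaped data (hypothetical).
rng = np.random.default_rng(6)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

beta, se = robust_standard_errors(X, y)
for name, values in se.items():
    print(f"{name:>9}: {np.round(values, 4)}")
```

The coefficients are the ordinary OLS estimates in every case. Only the standard errors change. In statsmodels the same variants can be requested when fitting, for example with fit(cov_type="HC3").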
4. Add Omitted Variables
Include variables that explain variance patterns
Add interaction terms
Consider non-linear transformations
5. Check for Outliers
Review the Influential Points diagnostic
Outliers can create false heteroscedasticity patterns
Consider robust regression methods
6. When Heteroscedasticity is Acceptable
Heteroscedasticity is mild and the sample size is large
The focus is on coefficient estimates rather than inference
Robust standard errors are already being used
Business decisions are not sensitive to exact confidence intervals
Practical Guidelines
Acceptable Scenarios:
Mild heteroscedasticity (p-value between 0.01 and 0.05)
Large datasets where robust standard errors are used
Point predictions are the primary goal (coefficients remain unbiased, although prediction intervals are still affected)
Natural variation in business data
Critical Issues:
Severe heteroscedasticity (p < 0.001)
Clear fan or funnel pattern in residual plots
Small sample sizes requiring precise inference
The model is used to construct confidence intervals
Example Interpretation
Scenario 1 - Passed:
Breusch-Pagan p-value: 0.28
White test p-value: 0.42
Residual plot shows random scatter with consistent spread
Interpretation: No significant heteroscedasticity detected. The constant variance assumption is satisfied, and standard errors are reliable.
Scenario 2 - Failed:
Breusch-Pagan p-value: 0.008
White test p-value: 0.003
Residual plot shows clear fan shape (increasing variance)
Interpretation: Heteroscedasticity detected. Variance increases with fitted values. Consider log-transforming the KPI or using robust standard errors. If focused on coefficient interpretation rather than p-values, this may be acceptable with appropriate caveats.
Scenario 3 - Severe:
Breusch-Pagan p-value: < 0.001
White test p-value: < 0.001
Strong funnel pattern with extreme variance differences
Interpretation: Severe heteroscedasticity. The model needs a remedy before it is used for inference or decision-making. Log-transforming the dependent variable is the first thing to try.
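The decision rule used in these scenarios can be written as a small helper. The function names below are hypothetical and are not part of MixModeler. The thresholds (0.05 and 0.001) are the ones quoted on this page, and taking the most severe individual result as the overall status is an assumption made for this sketch.

```python
SEVERITY = ["passed", "failed", "severe"]  # ordered from least to most severe

def classify_p_value(p, alpha=0.05, severe=0.001):
    """Map a single test's p-value to a status label."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p >= alpha:
        return "passed"
    if p < severe:
        return "severe"
    return "failed"

def overall_status(p_values):
    """Assumed rule: the overall status is the most severe individual status."""
    statuses = [classify_p_value(p) for p in p_values.values()]
    return max(statuses, key=SEVERITY.index)

scenario_1 = {"Breusch-Pagan": 0.28, "White": 0.42}
scenario_2 = {"Breusch-Pagan": 0.008, "White": 0.003}
# Scenario 3 only reports "< 0.001"; 0.0005 is an arbitrary stand-in value.
scenario_3 = {"Breusch-Pagan": 0.0005, "White": 0.0005}

for name, scenario in [("1", scenario_1), ("2", scenario_2), ("3", scenario_3)]:
    print(f"Scenario {name}: {overall_status(scenario)}")
```

A label of this kind is only a starting point. The residual plots and the sample size should inform the final judgement, as the scenarios above illustrate.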
Marketing Mix Modeling Context
In MMM, heteroscedasticity often appears because:
Sales Variability: Larger sales periods naturally have higher variance
Promotional Effects: Promotions create volatility in certain periods
Seasonal Patterns: Different variance across seasons
Spend Ranges: Different variance at low vs. high spend levels
Log transformation of the KPI is particularly effective in MMM as it:
Stabilizes variance
Models effects as multiplicative on the original scale (natural for marketing effects)
Reduces influence of outliers
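Provides coefficients interpretable as elasticities when the predictor is also log-transformed (otherwise as approximate percentage changes in the KPI per unit of the predictor)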
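The elasticity interpretation can be checked on simulated data. The sketch below assumes a log-log specification, with both the KPI and spend log-transformed. The spend range, the true elasticity of 0.3, and all variable names are assumptions for illustration.

```python
import numpy as np

# Simulated data (hypothetical): sales respond to spend with a constant
# elasticity, and the noise is multiplicative.
rng = np.random.default_rng(5)
n = 500
true_elasticity = 0.3
spend = rng.uniform(10.0, 100.0, size=n)
sales = 50.0 * spend ** true_elasticity * np.exp(rng.normal(0.0, 0.2, size=n))

# Log-log regression: log(sales) = intercept + elasticity * log(spend) + error
X = np.column_stack([np.ones(n), np.log(spend)])
beta, *_ = np.linalg.lstsq(X, np.log(sales), rcond=None)
estimated_elasticity = beta[1]

print(f"estimated elasticity: {estimated_elasticity:.3f}")
```

The slope estimates the percentage change in sales associated with a 1% change in spend. If only the KPI is logged and spend enters in raw units, the slope is instead read as an approximate proportional change in sales per unit of spend.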
Relationship to Other Assumptions
Heteroscedasticity often co-occurs with:
Non-normality: Changing variance can cause residuals to deviate from normality
Outliers: Extreme values can create both heteroscedasticity and influential points
Non-linearity: Wrong functional form can manifest as heteroscedasticity
Related Diagnostics
After reviewing heteroscedasticity:
Check Residual Normality as heteroscedasticity can affect normality
Review Influential Points to identify outliers causing variance issues
Examine Actual vs Predicted for systematic patterns