Heteroscedasticity

What Heteroscedasticity Tests Check

Heteroscedasticity tests examine whether the variance of model residuals is constant across all levels of the independent variables (homoscedasticity) or varies systematically (heteroscedasticity).

Purpose: Tests whether error variance is constant across fitted values, a condition that prediction intervals and hypothesis tests rely on to be accurate.

Why Constant Variance Matters

When residuals have constant variance (homoscedasticity):

Standard Errors are Correct: Coefficient standard errors accurately reflect uncertainty

Confidence Intervals are Valid: Intervals have correct coverage probabilities

Hypothesis Tests are Reliable: P-values and significance tests are trustworthy

Estimates are Efficient: Ordinary Least Squares (OLS) has the smallest variance among linear unbiased estimators

Heteroscedasticity doesn't bias coefficient estimates, but it makes standard errors incorrect, leading to invalid confidence intervals and hypothesis tests.
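To illustrate why the standard errors go wrong, the sketch below compares the true sampling variability of an OLS slope with what the classical formula reports on average. It assumes a single predictor and a hypothetical, known error pattern (error standard deviation proportional to x); the design and variable names are illustrative and are not taken from MixModeler.

```python
# Hypothetical design: one predictor on an even grid, error SD proportional to x.
n = 101
x = [10 * i / (n - 1) for i in range(n)]
sigma = [0.5 * xi for xi in x]  # assumed error standard deviations

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# True sampling variance of the OLS slope when Var(e_i) = sigma_i^2.
true_var = sum((xi - x_bar) ** 2 * s ** 2 for xi, s in zip(x, sigma)) / sxx ** 2

# Average value of the classical estimate s^2 / Sxx:
# E[RSS] = sum(sigma_i^2 * (1 - h_i)), with leverage h_i = 1/n + (x_i - x_bar)^2 / Sxx.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
expected_s2 = sum(s ** 2 * (1 - h) for s, h in zip(sigma, leverage)) / (n - 2)
classical_var = expected_s2 / sxx

ratio = (true_var / classical_var) ** 0.5
print(f"true SE / classical SE = {ratio:.2f}")
```

A ratio above 1 means the classical standard error understates the real uncertainty of the slope under this error pattern, even though the slope estimate itself stays unbiased.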

Statistical Tests Available

MixModeler provides three heteroscedasticity tests:

| Test | Description | Interpretation |
| --- | --- | --- |
| Breusch-Pagan Test | Tests for linear relationship between squared residuals and predictors | p < 0.05 indicates heteroscedasticity |
| White Test | More general test for any form of heteroscedasticity | p < 0.05 indicates heteroscedasticity |
| Goldfeld-Quandt Test | Tests whether variance differs across subsamples | p < 0.05 indicates heteroscedasticity |

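Interpretation: Each test takes constant variance as its null hypothesis. If p ≥ 0.05, the test finds no significant evidence of heteroscedasticity (this does not prove the variance is constant). If p < 0.05, the null hypothesis is rejected and heteroscedasticity is indicated.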
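The mechanics of the Breusch-Pagan test can be sketched for a single predictor using only the Python standard library. This is the studentized (n times R-squared) variant; whether MixModeler uses this exact variant is not stated here, and the function names and simulated data are illustrative assumptions. Libraries such as statsmodels provide multi-predictor versions (`het_breuschpagan`, `het_white`, `het_goldfeldquandt`).

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, resid

def breusch_pagan(x, y):
    """Studentized Breusch-Pagan test for one predictor.

    Squared residuals are regressed on x; LM = n * R^2 is compared with a
    chi-square distribution with 1 degree of freedom.
    """
    _, _, resid = fit_line(x, y)
    sq = [e * e for e in resid]
    _, _, aux_resid = fit_line(x, sq)
    sq_bar = sum(sq) / len(sq)
    tss = sum((s - sq_bar) ** 2 for s in sq)
    r2 = 1 - sum(e * e for e in aux_resid) / tss
    lm = max(0.0, len(x) * r2)
    p_value = math.erfc(math.sqrt(lm / 2))  # chi-square(1) upper tail
    return lm, p_value

# Simulated data (hypothetical): same mean line, two different error patterns.
rng = random.Random(42)
x = [0.05 * i for i in range(1, 201)]
y_const = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]      # constant error SD
y_fan = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]  # error SD grows with x

lm_const, p_const = breusch_pagan(x, y_const)
lm_fan, p_fan = breusch_pagan(x, y_fan)
print(f"constant variance: LM={lm_const:.2f}, p={p_const:.4f}")
print(f"fan-shaped errors: LM={lm_fan:.2f}, p={p_fan:.2e}")
```

The fan-shaped series should produce a very small p-value, while the constant-variance series will usually, but not always, stay above 0.05.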

Visual Diagnostics

MixModeler provides key visualizations for detecting heteroscedasticity:

Residuals vs Fitted Values Plot:

Plots residuals against predicted values
Good: Random scatter around zero with consistent spread (no pattern)
Problem: Fan or funnel shape (spread of residuals widens or narrows as fitted values increase)
Problem: Curved pattern (indicates non-linear relationship)

Scale-Location Plot:

Plots square root of absolute standardized residuals against fitted values
Good: Horizontal line with random scatter
Problem: Upward or downward trend indicates changing variance
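The values behind a Scale-Location plot can be computed directly. The sketch below assumes a single predictor and simulated data; the function name and the crude lower-half versus upper-half comparison are illustrative choices, not MixModeler's implementation. The returned points can be passed to any plotting library.

```python
import math
import random

def scale_location(x, y):
    """Return (fitted, sqrt(|standardized residual|)) pairs for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))  # residual standard error
    points = []
    for xi, fi, e in zip(x, fitted, resid):
        h = 1 / n + (xi - x_bar) ** 2 / sxx  # leverage of this observation
        std_resid = e / (s * math.sqrt(1 - h))
        points.append((fi, math.sqrt(abs(std_resid))))
    return points

# Simulated fan-shaped data (hypothetical): error SD grows with x.
rng = random.Random(1)
x = [0.05 * i for i in range(1, 201)]
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

points = sorted(scale_location(x, y))  # sort by fitted value
half = len(points) // 2
low_mean = sum(v for _, v in points[:half]) / half
high_mean = sum(v for _, v in points[half:]) / (len(points) - half)
print(f"low fitted values: {low_mean:.2f}, high fitted values: {high_mean:.2f}")
```

A clearly higher average in the upper half corresponds to the upward trend described above.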

Common Patterns

Fan Shape (Most Common):

Variance increases as predicted values increase
Often seen when predicting sales (larger values have more variability)

Funnel Shape:

Variance decreases as predicted values increase
Less common but still problematic

Grouped Heteroscedasticity:

Different variance for different categories or time periods
Suggests need for categorical variables or interactions
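Differences in variance between groups or ranges can be checked by comparing residual variance across subsamples, which is the idea behind the Goldfeld-Quandt test. The sketch below computes only the variance ratio for a single predictor; the p-value is omitted because the Python standard library has no F distribution. The function names, the number of dropped middle observations, and the simulated data are assumptions for illustration.

```python
import random

def residual_variance(x, y):
    """Residual variance, RSS / (n - 2), from an OLS fit of y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return rss / (n - 2)

def goldfeld_quandt_ratio(x, y, middle_drop):
    """Residual variance of the upper subsample divided by that of the lower one.

    Observations are sorted by x and `middle_drop` central observations are
    left out so the two subsamples are clearly separated.
    """
    pairs = sorted(zip(x, y))
    k = (len(pairs) - middle_drop) // 2  # size of each outer subsample
    low, high = pairs[:k], pairs[len(pairs) - k:]
    var_low = residual_variance([p[0] for p in low], [p[1] for p in low])
    var_high = residual_variance([p[0] for p in high], [p[1] for p in high])
    return var_high / var_low

# Simulated data (hypothetical): error SD grows with x.
rng = random.Random(5)
x = [10 * i / 300 for i in range(1, 301)]
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

gq_ratio = goldfeld_quandt_ratio(x, y, middle_drop=100)
print(f"variance ratio (upper / lower): {gq_ratio:.1f}")
```

A ratio far from 1 points to different variance in the two subsamples; a formal test compares the ratio with an F distribution.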

Interpreting Test Results

Passed Tests (✓)

What it means: No significant heteroscedasticity detected (p ≥ 0.05)

Implications:

Variance of errors is reasonably constant
Standard errors are reliable
Confidence intervals and p-values are valid

Action: No action needed - homoscedasticity assumption is satisfied

Failed Tests (⚠)

What it means: Heteroscedasticity detected (p < 0.05)

Implications:

Standard errors may be incorrect (often too small, though they can also be too large)
Confidence intervals may have wrong coverage
Hypothesis tests may be unreliable
Note: Coefficient estimates remain unbiased

Common Causes:

Dependent variable has increasing/decreasing variance
Important variables omitted
Wrong functional form (need transformations)
Outliers affecting variance
Natural heteroscedasticity in the data-generating process

What to Do When Tests Fail

If heteroscedasticity tests fail, try these solutions in order:

1. Transform the Dependent Variable (Most Effective)

Log transformation: Reduces right skewness and stabilizes variance
- Good for: Sales, revenue, spend data
- Effect: Multiplicative relationships become additive
Square root transformation: Moderate variance stabilization
Inverse transformation: For highly skewed data
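A small simulation shows the stabilizing effect of the log transformation. It assumes a hypothetical KPI with multiplicative noise, so the error scales with the level of the KPI; the helper name and the "upper half versus lower half" spread ratio are illustrative choices rather than a formal test.

```python
import math
import random

def residual_spread_ratio(x, y):
    """Fit y = a + b*x by OLS and return the RMS residual in the upper half
    of x divided by the RMS residual in the lower half (x must be sorted)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    half = n // 2

    def rms(values):
        return math.sqrt(sum(v * v for v in values) / len(values))

    return rms(resid[half:]) / rms(resid[:half])

rng = random.Random(7)
x = [0.01 * i for i in range(1, 1001)]
# Hypothetical KPI: exponential in x with multiplicative (proportional) noise.
y = [math.exp(2 + 0.15 * xi + rng.gauss(0, 0.15)) for xi in x]

raw_ratio = residual_spread_ratio(x, y)
log_ratio = residual_spread_ratio(x, [math.log(yi) for yi in y])
print(f"raw scale: {raw_ratio:.2f}, log scale: {log_ratio:.2f}")
```

On the raw scale the residual spread is much larger at high values of x (the straight-line fit also misses some curvature there); after the log transformation the ratio should be close to 1. Note that the log requires strictly positive values.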

2. Use Weighted Least Squares (WLS)

Give less weight to observations with higher variance
Requires estimating variance function
Available in advanced statistical software
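Weighted Least Squares for a single predictor can be sketched as follows. The weights here assume the error variance is known to be proportional to x squared, which is a simplification; in practice the variance function has to be estimated first. The function name and data are hypothetical.

```python
import random

def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b*x with weights w_i (ideally 1 / Var(e_i))."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Simulated data (hypothetical): true line y = 2 + 3x, error SD proportional to x.
rng = random.Random(3)
x = [0.05 * i for i in range(1, 201)]
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

weights = [1 / xi ** 2 for xi in x]  # assumes Var(e_i) is proportional to x_i^2
a_wls, b_wls = weighted_fit(x, y, weights)
print(f"WLS intercept: {a_wls:.2f}, slope: {b_wls:.2f}")
```

Observations with small x, where the noise is low, receive the largest weights, so they contribute most to the fitted line.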

3. Use Robust Standard Errors

Calculate heteroscedasticity-consistent (HC) standard errors
Corrects standard errors without changing coefficients
Common variants: HC0, HC1, HC2, HC3
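For a single predictor, the HC0 and HC1 standard errors of the slope have a simple closed form, sketched below next to the classical standard error. The function name and simulated data are assumptions for illustration; HC2 and HC3 additionally rescale each squared residual by its leverage and are not shown.

```python
import math
import random

def slope_standard_errors(x, y):
    """Classical, HC0, and HC1 standard errors of the slope in y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical formula: one pooled residual variance for all observations.
    classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # Sandwich estimator: each squared residual stands in for its own error variance.
    hc0 = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    hc1 = hc0 * math.sqrt(n / (n - 2))  # degrees-of-freedom correction
    return {"slope": b, "classical": classical, "HC0": hc0, "HC1": hc1}

# Simulated data (hypothetical): error SD grows with the square of x.
rng = random.Random(11)
x = [0.002 * i for i in range(1, 5001)]
y = [2 + 3 * xi + rng.gauss(0, 0.1 * xi ** 2) for xi in x]

result = slope_standard_errors(x, y)
print({k: round(v, 4) for k, v in result.items()})
```

The slope is identical under all three methods; only the reported uncertainty changes. With this error pattern the robust standard errors come out larger than the classical one.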

4. Add Omitted Variables

Include variables that explain variance patterns
Add interaction terms
Consider non-linear transformations

5. Check for Outliers

Review Influential Points diagnostic
Outliers can create false heteroscedasticity patterns
Consider robust regression methods

6. When Heteroscedasticity is Acceptable

Heteroscedasticity is mild and the sample size is large
The focus is on coefficient estimates rather than inference
Robust standard errors are already being used
Business decisions are not sensitive to exact confidence intervals

Practical Guidelines

Acceptable Scenarios:

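Mild heteroscedasticity (for example, a p-value between 0.01 and 0.05 with no strong pattern in the residual plots)
Large datasets where robust standard errors are used
Point predictions are the primary goal (coefficients remain unbiased, though prediction intervals are still affected)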
Natural variation in business data

Critical Issues:

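Severe heteroscedasticity (p < 0.001, confirmed by residual plots, since large samples can produce very small p-values even for mild departures)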
Clear fan or funnel pattern in residual plots
Small sample sizes requiring precise inference
Using model for confidence interval construction

Example Interpretation

Scenario 1 - Passed:

Breusch-Pagan p-value: 0.28
White test p-value: 0.42
Residual plot shows random scatter with consistent spread

Interpretation: No significant heteroscedasticity detected. The constant variance assumption is satisfied, and standard errors are reliable.

Scenario 2 - Failed:

Breusch-Pagan p-value: 0.008
White test p-value: 0.003
Residual plot shows clear fan shape (increasing variance)

Interpretation: Heteroscedasticity detected. Variance increases with fitted values. Consider log-transforming the KPI or using robust standard errors. If focused on coefficient interpretation rather than p-values, this may be acceptable with appropriate caveats.

Scenario 3 - Severe:

Breusch-Pagan p-value: < 0.001
White test p-value: < 0.001
Strong funnel pattern with extreme variance differences

Interpretation: Severe heteroscedasticity. Model requires transformation. Try log-transforming the dependent variable before proceeding with inference or decision-making.
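The thresholds used in the three scenarios can be written as a small helper. Letting the smallest p-value across tests drive the label is an assumption made for this sketch, not a documented MixModeler rule, and residual plots should still be checked alongside the label.

```python
def classify(p_values, alpha=0.05, severe=0.001):
    """Label a set of heteroscedasticity test p-values using the thresholds above."""
    smallest = min(p_values)  # assumption: the strongest rejection drives the label
    if smallest < severe:
        return "severe"
    if smallest < alpha:
        return "failed"
    return "passed"

print(classify([0.28, 0.42]))      # Scenario 1
print(classify([0.008, 0.003]))    # Scenario 2
print(classify([0.0005, 0.0004]))  # hypothetical values below 0.001, as in Scenario 3
```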

Marketing Mix Modeling Context

In MMM, heteroscedasticity often appears because:

Sales Variability: Larger sales periods naturally have higher variance

Promotional Effects: Promotions create volatility in certain periods

Seasonal Patterns: Different variance across seasons

Spend Ranges: Different variance at low vs. high spend levels

Log transformation of the KPI is particularly effective in MMM as it:

Stabilizes variance
Makes relationships multiplicative (natural for marketing effects)
Reduces influence of outliers
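Provides coefficients interpretable as elasticities when the predictors are also log-transformed (otherwise as approximate percentage changes in the KPI per unit change in the predictor)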
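The elasticity interpretation holds when the predictor is log-transformed as well (a log-log specification, assumed here). The sketch below uses made-up, noise-free spend and KPI values generated with an elasticity of 0.3 and recovers that value as the slope of log KPI on log spend.

```python
import math

# Hypothetical, noise-free response: KPI = 100 * spend^0.3.
spend = [10, 20, 40, 80, 160, 320]
kpi = [100 * s ** 0.3 for s in spend]

log_s = [math.log(s) for s in spend]
log_k = [math.log(k) for k in kpi]
n = len(spend)
s_bar, k_bar = sum(log_s) / n, sum(log_k) / n

# OLS slope of log(KPI) on log(spend) is the elasticity.
elasticity = (
    sum((s - s_bar) * (k - k_bar) for s, k in zip(log_s, log_k))
    / sum((s - s_bar) ** 2 for s in log_s)
)
print(f"estimated elasticity: {elasticity:.3f}")

# A 1% increase in spend changes the KPI by roughly `elasticity` percent.
pct_change = (1.01 ** elasticity - 1) * 100
print(f"KPI change for +1% spend: {pct_change:.3f}%")
```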

Relationship to Other Assumptions

Heteroscedasticity often co-occurs with:

Non-normality: Changing variance can cause residuals to deviate from normality

Outliers: Extreme values can create both heteroscedasticity and influential points

Non-linearity: Wrong functional form can manifest as heteroscedasticity

After reviewing heteroscedasticity:

Check Residual Normality as heteroscedasticity can affect normality
Review Influential Points to identify outliers causing variance issues
Examine Actual vs Predicted for systematic patterns


