Actual vs Predicted

What Actual vs Predicted Analysis Shows

Actual vs Predicted analysis evaluates how well your model fits the data by comparing the observed values of your KPI with the values predicted by the model. This provides a comprehensive view of model performance and prediction accuracy.

Purpose: Evaluates overall model fit and prediction accuracy by comparing actual KPI values with model predictions.

Why Model Fit Matters

Good model fit indicates:

Reliable Predictions: The model accurately captures the relationship between marketing and outcomes

Complete Specification: Important drivers have been included

Correct Functional Form: Relationships are properly modeled (linear, saturation, adstock)

Trustworthy Attribution: Decomposition results will be credible

Poor fit suggests missing variables, wrong transformations, or fundamental model misspecification.

Key Performance Metrics

MixModeler calculates four primary metrics to assess fit:

R-Squared (R²)

Definition: Proportion of variance in the KPI explained by the model

Range: 0 to 1 (or 0% to 100%)

Interpretation:

| R² Range | Model Quality | Interpretation |
| --- | --- | --- |
| > 0.80 | Excellent | Model explains >80% of variation |
| 0.70 - 0.80 | Good | Acceptable for most business applications |
| 0.50 - 0.70 | Moderate | Room for improvement, use with caution |
| < 0.50 | Poor | Significant work needed |

Formula: R² = 1 - (SS_residual / SS_total)

Note: Higher R² is better, but very high R² (>0.95) may indicate overfitting
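
To make the formula concrete, here is a minimal sketch in Python (the actual and predicted values are hypothetical; MixModeler computes this for you):

```python
import numpy as np

# Hypothetical actual and predicted KPI values
actual = np.array([52_000, 48_500, 61_200, 49_800, 55_400])
predicted = np.array([50_900, 49_700, 59_800, 51_200, 54_100])

ss_residual = np.sum((actual - predicted) ** 2)
ss_total = np.sum((actual - actual.mean()) ** 2)
r_squared = 1 - ss_residual / ss_total
print(f"R-squared: {r_squared:.3f}")
```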

Adjusted R-Squared

Definition: R² adjusted for the number of predictors in the model

Purpose: Penalizes models for including unnecessary variables

Advantage: Better for comparing models with different numbers of variables

Relationship: Always ≤ R², with a larger gap when many weak predictors are included

Use: Prefer adjusted R² when comparing alternative model specifications
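
Adjusted R² can be derived from R², the number of observations n, and the number of predictors k; a short sketch with illustrative values:

```python
# Adjusted R² penalizes extra predictors: with R² = 0.82,
# n = 104 weekly observations, and k = 8 predictors (all hypothetical)
r_squared, n, k = 0.82, 104, 8
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
print(f"Adjusted R-squared: {adj_r_squared:.3f}")  # ~0.805, always <= R²
```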

Root Mean Squared Error (RMSE)

Definition: Square root of the average squared prediction error

Formula: RMSE = √(Σ(actual - predicted)² / n)

Units: Same units as your KPI (e.g., dollars, units sold)

Interpretation:

  • Lower values indicate better fit

  • Measures typical prediction error

  • Compare across models (lower is better)

  • Context-dependent (RMSE of 1000 is good if KPI averages 100,000)

Relative RMSE: RMSE / mean(KPI) × 100 gives percentage error
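
A minimal sketch of RMSE and relative RMSE, using the same hypothetical values as above:

```python
import numpy as np

actual = np.array([52_000, 48_500, 61_200, 49_800, 55_400])
predicted = np.array([50_900, 49_700, 59_800, 51_200, 54_100])

rmse = np.sqrt(np.mean((actual - predicted) ** 2))
relative_rmse = rmse / actual.mean() * 100  # percentage of mean KPI
print(f"RMSE: {rmse:,.0f} ({relative_rmse:.1f}% of mean KPI)")
```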

Mean Absolute Error (MAE)

Definition: Average absolute prediction error

Formula: MAE = Σ|actual - predicted| / n

Units: Same units as your KPI

Interpretation:

  • Lower values indicate better fit

  • More robust to outliers than RMSE

  • Easier to interpret (average error magnitude)

Comparison with RMSE:

  • RMSE penalizes large errors more heavily

  • MAE treats all errors equally

  • If RMSE >> MAE, model has some large errors
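
The RMSE/MAE comparison is easy to check directly; a sketch with hypothetical values:

```python
import numpy as np

actual = np.array([52_000, 48_500, 61_200, 49_800, 55_400])
predicted = np.array([50_900, 49_700, 59_800, 51_200, 54_100])

errors = actual - predicted
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))
# A ratio well above 1 means a few large errors dominate RMSE
print(f"MAE: {mae:,.0f}  RMSE: {rmse:,.0f}  RMSE/MAE: {rmse / mae:.2f}")
```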

Visual Diagnostics

MixModeler provides two key visualizations:

Scatter Plot (Actual vs Predicted)

X-axis: Predicted values

Y-axis: Actual values

Reference line: 45-degree diagonal (perfect predictions)

Good fit:

  • Points cluster tightly around diagonal line

  • No systematic deviation from line

  • Even spread across the range

Poor fit:

  • Points scattered far from diagonal

  • Systematic pattern (curved, clusters)

  • Wider spread at certain ranges
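
If you want to reproduce this diagnostic outside MixModeler, a minimal matplotlib sketch with hypothetical data:

```python
import numpy as np
import matplotlib.pyplot as plt

actual = np.array([52_000, 48_500, 61_200, 49_800, 55_400])
predicted = np.array([50_900, 49_700, 59_800, 51_200, 54_100])

fig, ax = plt.subplots()
ax.scatter(predicted, actual)
lims = [min(predicted.min(), actual.min()), max(predicted.max(), actual.max())]
ax.plot(lims, lims, linestyle="--")  # 45-degree reference line
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
ax.set_title("Actual vs Predicted")
plt.show()
```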

Time Series Plot

X-axis: Time (observation index or date)

Y-axis: KPI values

Two lines: Actual (solid) and Predicted (dashed)

Good fit:

  • Lines track closely throughout time period

  • Model captures peaks and troughs

  • No systematic over/under-prediction

Poor fit:

  • Large gaps between lines

  • Predicted line misses major movements

  • Consistent over or under-prediction
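
A similar sketch for the time series view, again with hypothetical data:

```python
import pandas as pd
import matplotlib.pyplot as plt

weeks = pd.date_range("2024-01-07", periods=5, freq="W")
actual = [52_000, 48_500, 61_200, 49_800, 55_400]
predicted = [50_900, 49_700, 59_800, 51_200, 54_100]

fig, ax = plt.subplots()
ax.plot(weeks, actual, label="Actual")                       # solid line
ax.plot(weeks, predicted, linestyle="--", label="Predicted")
ax.set_xlabel("Week")
ax.set_ylabel("KPI")
ax.legend()
plt.show()
```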

Interpreting Test Results

Strong Model Fit (✓)

Characteristics:

  • R² > 0.70

  • Low RMSE relative to KPI scale

  • Scatter plot points close to diagonal

  • Time series lines track well

Implications:

  • Model captures the key relationships

  • Predictions are reliable

  • Attribution is trustworthy

  • Can proceed with optimization

Action: Model is ready for business use

Moderate Model Fit (⚠)

Characteristics:

  • R² between 0.50-0.70

  • Moderate RMSE

  • Some scatter around diagonal

  • Time series generally tracks but with gaps

Implications:

  • Model captures main effects but misses some variation

  • Predictions are reasonable but not precise

  • Attribution gives directional insights

  • Consider improvements before optimization

Actions to improve:

  • Add missing variables

  • Test different transformations

  • Refine adstock parameters

  • Include interaction terms
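
For example, geometric adstock is a common transformation to test; a minimal sketch, with a hypothetical decay rate (MixModeler's own adstock settings may differ):

```python
import numpy as np

def geometric_adstock(spend, decay=0.5):
    """Carry over a share of each period's effect into the next
    (geometric adstock); decay is a hypothetical retention rate."""
    adstocked = np.zeros_like(spend, dtype=float)
    carryover = 0.0
    for t, x in enumerate(spend):
        carryover = x + decay * carryover
        adstocked[t] = carryover
    return adstocked

spend = np.array([100, 0, 0, 50, 0], dtype=float)
print(geometric_adstock(spend, decay=0.5))  # [100. 50. 25. 62.5 31.25]
```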

Poor Model Fit (❌)

Characteristics:

  • R² < 0.50

  • High RMSE

  • Scatter plot widely dispersed

  • Time series lines diverge

Implications:

  • Model misspecified or missing critical variables

  • Predictions unreliable

  • Attribution results questionable

  • Not suitable for business decisions

Required actions:

  • Add important omitted variables

  • Reconsider model structure

  • Check data quality

  • Review business understanding

Common Patterns and Issues

Systematic Under-Prediction

Pattern: Predicted line consistently below actual

Possible causes:

  • Missing positive driver variables

  • Saturation curves too aggressive

  • Baseline estimate too low

Solution: Add variables, adjust curves, increase intercept

Systematic Over-Prediction

Pattern: Predicted line consistently above actual

Possible causes:

  • Including spurious variables

  • Double-counting effects

  • Baseline estimate too high

Solution: Remove weak variables, check for overlap, adjust baseline
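
One quick check for either bias pattern is a rolling mean of the residuals; a sketch with hypothetical values:

```python
import pandas as pd

actual = pd.Series([52_000, 48_500, 61_200, 49_800, 55_400,
                    58_300, 60_100, 57_900])
predicted = pd.Series([50_900, 49_700, 59_800, 51_200, 54_100,
                       56_800, 58_600, 56_200])

residuals = actual - predicted
# A rolling mean that stays on one side of zero flags a stretch of
# systematic under-prediction (positive) or over-prediction (negative)
print(residuals.rolling(window=4).mean())
```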

Good Fit but Misses Peaks

Pattern: Model tracks trends but misses high/low extremes

Possible causes:

  • Missing promotional or event variables

  • Insufficient saturation flexibility

  • Outlier periods

Solution: Add event dummies, adjust curves, investigate extremes
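
Adding an event dummy is straightforward if you prepare the data in pandas; a hypothetical sketch (the promo date is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "week": pd.date_range("2024-01-07", periods=6, freq="W"),
    "kpi": [50_000, 51_500, 78_000, 52_300, 49_900, 51_100],
})
# Flag the promotional week the model keeps missing
df["promo_dummy"] = (df["week"] == "2024-01-21").astype(int)
print(df)
```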

Seasonal Patterns in Errors

Pattern: Residuals show cyclical patterns

Possible causes:

  • Missing seasonality variables

  • Autocorrelation issues

  • Quarterly or monthly effects not captured

Solution: Add seasonal dummies, check autocorrelation diagnostic
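
Grouping residuals by calendar period makes cyclical error patterns visible; a sketch with hypothetical monthly residuals:

```python
import pandas as pd

# Hypothetical monthly residuals (actual - predicted)
residuals = pd.Series(
    [1_200, 900, -400, -1_100, -800, 300,
     1_000, 1_400, 200, -900, -1_300, -500],
    index=pd.date_range("2024-01-01", periods=12, freq="MS"),
)
# Residuals that cluster by quarter suggest missing seasonality
print(residuals.groupby(residuals.index.quarter).mean())
```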

Practical Guidelines

Acceptable R² Benchmarks by Industry:

Retail/E-commerce: 0.65-0.85 (high variability, many factors)

CPG/Brand: 0.70-0.90 (stable markets, clear drivers)

B2B/Services: 0.60-0.80 (longer sales cycles, more noise)

These ranges are guidelines; context matters more than arbitrary thresholds.

When Lower R² is Acceptable:

  • Weekly data (more noise than monthly)

  • Many external factors beyond marketing control

  • New products or markets with limited history

  • Focus is on directional insights, not precise prediction

When Higher R² is Expected:

  • Monthly or quarterly aggregation (smooths noise)

  • Stable mature markets

  • Strong marketing influence on KPI

  • Long time series with clear patterns

Example Interpretation

Scenario 1 - Excellent Fit:

  • R²: 0.82

  • Adjusted R²: 0.79

  • RMSE: 2,500 (KPI average: 50,000 = 5% error)

  • MAE: 1,800

  • Scatter plot: Tight clustering around diagonal

Interpretation: Excellent model fit. The model explains 82% of KPI variation with low prediction error. Time series shows model captures both trends and fluctuations. Ready for business use and optimization.

Scenario 2 - Good Fit:

  • R²: 0.73

  • Adjusted R²: 0.70

  • RMSE: 4,200 (KPI average: 50,000 = 8.4% error)

  • MAE: 3,100

Interpretation: Good model fit suitable for most business applications. The model captures major drivers but some variation remains unexplained. Consider adding seasonal variables or testing different saturation curves to improve further.

Scenario 3 - Needs Improvement:

  • R²: 0.48

  • Adjusted R²: 0.42

  • RMSE: 9,500 (KPI average: 50,000 = 19% error)

  • MAE: 7,200

  • Scatter plot: Wide dispersion

Interpretation: Poor model fit. Less than half of variation explained. Review model specification, add important variables, and check data quality before using for business decisions.

Relationship to Business Decisions

For Budget Allocation:

  • Need R² > 0.70 for confident optimization

  • Lower R² means more uncertainty in ROI estimates

For Forecasting:

  • RMSE indicates expected forecast error

  • Use confidence intervals based on RMSE (see the sketch after this list)

For Attribution:

  • Good fit ensures decomposition sums to actual KPI

  • Poor fit means attribution residuals are large

For ROI Calculation:

  • Coefficient accuracy depends on model fit

  • Low R² suggests ROI estimates are imprecise
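
The RMSE-based interval mentioned under For Forecasting can be sketched as follows, assuming approximately normal errors (a rough approximation, not MixModeler's exact method):

```python
# Rough forecast band from RMSE, assuming approximately normal errors
forecast, rmse = 54_000, 2_500  # hypothetical values
z = 1.96  # ~95% interval
low, high = forecast - z * rmse, forecast + z * rmse
print(f"Forecast: {forecast:,} (95% interval: {low:,.0f} to {high:,.0f})")
```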

After reviewing Actual vs Predicted:

  • Check Residual Normality to see if errors are well-behaved

  • Review Autocorrelation if time series shows patterns

  • Examine Influential Points to see which periods fit poorly

  • Check R² in Model Builder for overall model statistics
