Actual vs Predicted
What Actual vs Predicted Analysis Shows
Actual vs Predicted analysis evaluates how well your model fits the data by comparing the observed values of your KPI with the values predicted by the model. This provides a comprehensive view of model performance and prediction accuracy.
Purpose: Evaluates overall model fit and prediction accuracy by comparing actual KPI values with model predictions.
Why Model Fit Matters
Good model fit indicates:
Reliable Predictions: The model accurately captures the relationship between marketing and outcomes
Complete Specification: Important drivers have been included
Correct Functional Form: Relationships are properly modeled (linear, saturation, adstock)
Trustworthy Attribution: Decomposition results will be credible
Poor fit suggests missing variables, wrong transformations, or fundamental model misspecification.
Key Performance Metrics
MixModeler calculates four primary metrics to assess fit:
R-Squared (R²)
Definition: Proportion of variance in the KPI explained by the model
Range: 0 to 1 (or 0% to 100%)
Interpretation:
| R² Range | Rating | Interpretation |
| --- | --- | --- |
| > 0.80 | Excellent | Model explains >80% of variation |
| 0.70 - 0.80 | Good | Acceptable for most business applications |
| 0.50 - 0.70 | Moderate | Room for improvement; use with caution |
| < 0.50 | Poor | Significant work needed |
Formula: R² = 1 - (SS_residual / SS_total)
Note: Higher R² is better, but very high R² (>0.95) may indicate overfitting
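Conceptually, R² compares the model's residual error against the total variance of the KPI. A minimal sketch of the formula above, computed from paired actual and predicted arrays (the values are hypothetical, not MixModeler output):

```python
import numpy as np

def r_squared(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Proportion of KPI variance explained: R² = 1 - SS_residual / SS_total."""
    ss_residual = np.sum((actual - predicted) ** 2)
    ss_total = np.sum((actual - actual.mean()) ** 2)
    return float(1 - ss_residual / ss_total)

# Hypothetical weekly KPI values and model predictions
actual = np.array([52_000, 48_500, 50_200, 55_100, 47_800])
predicted = np.array([51_400, 49_200, 50_900, 54_300, 48_600])
print(f"R²: {r_squared(actual, predicted):.3f}")
```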
Adjusted R-Squared
Definition: R² adjusted for the number of predictors in the model
Purpose: Penalizes models for including unnecessary variables
Advantage: Better for comparing models with different numbers of variables
Relationship: Always ≤ R²; the gap widens when many weak predictors are included
Use: Prefer adjusted R² when comparing alternative model specifications
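The standard adjustment is Adjusted R² = 1 - (1 - R²) × (n - 1) / (n - k - 1), where n is the number of observations and k the number of predictors. A short sketch showing how the penalty grows as predictors are added (the figures are illustrative):

```python
def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """Penalize R² for model size: 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Two hypothetical specifications with the same raw R² but different sizes
print(adjusted_r_squared(0.82, n_obs=104, n_predictors=8))   # ~0.805
print(adjusted_r_squared(0.82, n_obs=104, n_predictors=20))  # ~0.777
```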
Root Mean Squared Error (RMSE)
Definition: Square root of the average squared prediction error
Formula: RMSE = √(Σ(actual - predicted)² / n)
Units: Same units as your KPI (e.g., dollars, units sold)
Interpretation:
Lower values indicate better fit
Measures typical prediction error
Compare across models (lower is better)
Context-dependent (RMSE of 1000 is good if KPI averages 100,000)
Relative RMSE: RMSE / mean(KPI) × 100 gives percentage error
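A minimal sketch of RMSE and relative RMSE from paired arrays (values are hypothetical):

```python
import numpy as np

def rmse(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Root mean squared error: sqrt(mean((actual - predicted)²))."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

actual = np.array([52_000, 48_500, 50_200, 55_100, 47_800])
predicted = np.array([51_400, 49_200, 50_900, 54_300, 48_600])

error = rmse(actual, predicted)
relative = error / actual.mean() * 100  # percentage of the KPI average
print(f"RMSE: {error:,.0f} ({relative:.1f}% of the average KPI)")
```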
Mean Absolute Error (MAE)
Definition: Average absolute prediction error
Formula: MAE = Σ|actual - predicted| / n
Units: Same units as your KPI
Interpretation:
Lower values indicate better fit
More robust to outliers than RMSE
Easier to interpret (average error magnitude)
Comparison with RMSE:
RMSE penalizes large errors more heavily
MAE treats all errors equally
If RMSE >> MAE, the model has some large errors
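A short sketch computing MAE alongside RMSE; the ratio between them is a quick check for a few dominating errors (values are hypothetical):

```python
import numpy as np

def mae(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Mean absolute error: mean(|actual - predicted|)."""
    return float(np.mean(np.abs(actual - predicted)))

actual = np.array([52_000, 48_500, 50_200, 55_100, 47_800])
predicted = np.array([51_400, 49_200, 50_900, 54_300, 48_600])

mae_val = mae(actual, predicted)
rmse_val = float(np.sqrt(np.mean((actual - predicted) ** 2)))
# A ratio well above 1 flags a few large errors dominating the fit
print(f"RMSE/MAE ratio: {rmse_val / mae_val:.2f}")
```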
Visual Diagnostics
MixModeler provides two key visualizations:
Scatter Plot (Actual vs Predicted)
X-axis: Predicted values
Y-axis: Actual values
Reference line: 45-degree diagonal (perfect predictions)
Good fit:
Points cluster tightly around diagonal line
No systematic deviation from line
Even spread across the range
Poor fit:
Points scattered far from diagonal
Systematic pattern (curved, clusters)
Wider spread at certain ranges
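A minimal matplotlib sketch of this diagnostic, using hypothetical actual/predicted pairs rather than MixModeler's built-in chart:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical actual/predicted pairs exported from a fitted model
actual = np.array([52_000, 48_500, 50_200, 55_100, 47_800, 53_600])
predicted = np.array([51_400, 49_200, 50_900, 54_300, 48_600, 52_700])

fig, ax = plt.subplots()
ax.scatter(predicted, actual, alpha=0.7)

# 45-degree reference line: points on it are perfect predictions
lims = [min(predicted.min(), actual.min()), max(predicted.max(), actual.max())]
ax.plot(lims, lims, linestyle="--", color="gray", label="Perfect prediction")

ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
ax.legend()
plt.show()
```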
Time Series Plot
X-axis: Time (observation index or date)
Y-axis: KPI values
Two lines: Actual (solid) and Predicted (dashed)
Good fit:
Lines track closely throughout time period
Model captures peaks and troughs
No systematic over/under-prediction
Poor fit:
Large gaps between lines
Predicted line misses major movements
Consistent over- or under-prediction
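A matching sketch for the time series view, again with hypothetical data:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical weekly series; in practice use the model's fitted values
weeks = np.arange(1, 7)
actual = np.array([52_000, 48_500, 50_200, 55_100, 47_800, 53_600])
predicted = np.array([51_400, 49_200, 50_900, 54_300, 48_600, 52_700])

fig, ax = plt.subplots()
ax.plot(weeks, actual, label="Actual")                        # solid line
ax.plot(weeks, predicted, linestyle="--", label="Predicted")  # dashed line
ax.set_xlabel("Week")
ax.set_ylabel("KPI")
ax.legend()
plt.show()
```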
Interpreting Test Results
Strong Model Fit (✓)
Characteristics:
R² > 0.70
Low RMSE relative to KPI scale
Scatter plot points close to diagonal
Time series lines track well
Implications:
Model captures the key relationships
Predictions are reliable
Attribution is trustworthy
Can proceed with optimization
Action: Model is ready for business use
Moderate Model Fit (⚠)
Characteristics:
R² between 0.50 and 0.70
Moderate RMSE
Some scatter around diagonal
Time series generally tracks but with gaps
Implications:
Model captures main effects but misses some variation
Predictions are reasonable but not precise
Attribution gives directional insights
Consider improvements before optimization
Actions to improve:
Add missing variables
Test different transformations
Refine adstock parameters (see the adstock sketch after this list)
Include interaction terms
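For the adstock item above, a minimal sketch of a generic geometric adstock transform. The decay values are illustrative candidates to test against fit, and this is a standard formulation, not necessarily the exact one MixModeler uses internally:

```python
import numpy as np

def geometric_adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Carry a fraction `decay` of each period's effect into the next:
    adstock[t] = spend[t] + decay * adstock[t-1]."""
    adstocked = np.zeros_like(spend, dtype=float)
    carryover = 0.0
    for t, x in enumerate(spend):
        carryover = x + decay * carryover
        adstocked[t] = carryover
    return adstocked

spend = np.array([100.0, 0.0, 0.0, 50.0])
for decay in (0.3, 0.5, 0.7):  # candidate decay rates to compare
    print(decay, geometric_adstock(spend, decay))
```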
Poor Model Fit (❌)
Characteristics:
R² < 0.50
High RMSE
Scatter plot widely dispersed
Time series lines diverge
Implications:
Model misspecified or missing critical variables
Predictions unreliable
Attribution results questionable
Not suitable for business decisions
Required actions:
Add important omitted variables
Reconsider model structure
Check data quality
Review business understanding
Common Patterns and Issues
Systematic Under-Prediction
Pattern: Predicted line consistently below actual
Possible causes:
Missing positive driver variables
Saturation curves too aggressive
Baseline estimate too low
Solution: Add variables, adjust curves, increase intercept
Systematic Over-Prediction
Pattern: Predicted line consistently above actual
Possible causes:
Including spurious variables
Double-counting effects
Baseline estimate too high
Solution: Remove weak variables, check for overlap, adjust baseline
Good Fit but Misses Peaks
Pattern: Model tracks trends but misses high/low extremes
Possible causes:
Missing promotional or event variables
Insufficient saturation flexibility
Outlier periods
Solution: Add event dummies, adjust curves, investigate extremes
Seasonal Patterns in Errors
Pattern: Residuals show cyclical patterns
Possible causes:
Missing seasonality variables
Autocorrelation issues
Quarterly or monthly effects not captured
Solution: Add seasonal dummies (see the sketch below), check the autocorrelation diagnostic
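A short pandas sketch of building month-of-year dummies; the frame and column names are hypothetical:

```python
import pandas as pd

# Hypothetical weekly KPI frame; the date column name is an assumption
df = pd.DataFrame({"date": pd.date_range("2023-01-01", periods=104, freq="W")})

# Month-of-year dummies to absorb recurring seasonal swings;
# drop_first avoids perfect collinearity with the intercept
df["month"] = df["date"].dt.month
dummies = pd.get_dummies(df["month"], prefix="month", drop_first=True)
df = pd.concat([df, dummies], axis=1)

print(df.filter(like="month_").columns.tolist())
```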
Practical Guidelines
Acceptable R² Benchmarks by Industry:
Retail/E-commerce: 0.65-0.85 (high variability, many factors)
CPG/Brand: 0.70-0.90 (stable markets, clear drivers)
B2B/Services: 0.60-0.80 (longer sales cycles, more noise)
These are guidelines; context matters more than arbitrary thresholds
When Lower R² is Acceptable:
Weekly data (more noise than monthly)
Many external factors beyond marketing control
New products or markets with limited history
Focus is on directional insights, not precise prediction
When Higher R² is Expected:
Monthly or quarterly aggregation (smooths noise)
Stable mature markets
Strong marketing influence on KPI
Long time series with clear patterns
Example Interpretation
Scenario 1 - Excellent Fit:
R²: 0.82
Adjusted R²: 0.79
RMSE: 2,500 (5% of the 50,000 KPI average)
MAE: 1,800
Scatter plot: Tight clustering around diagonal
Interpretation: Excellent model fit. The model explains 82% of KPI variation with low prediction error. Time series shows model captures both trends and fluctuations. Ready for business use and optimization.
Scenario 2 - Good Fit:
R²: 0.73
Adjusted R²: 0.70
RMSE: 4,200 (8.4% of the 50,000 KPI average)
MAE: 3,100
Interpretation: Good model fit suitable for most business applications. The model captures major drivers but some variation remains unexplained. Consider adding seasonal variables or testing different saturation curves to improve further.
Scenario 3 - Needs Improvement:
R²: 0.48
Adjusted R²: 0.42
RMSE: 9,500 (19% of the 50,000 KPI average)
MAE: 7,200
Scatter plot: Wide dispersion
Interpretation: Poor model fit. Less than half of variation explained. Review model specification, add important variables, and check data quality before using for business decisions.
Relationship to Business Decisions
For Budget Allocation:
Need R² > 0.70 for confident optimization
Lower R² means more uncertainty in ROI estimates
For Forecasting:
RMSE indicates expected forecast error
Use confidence intervals based on RMSE (see the sketch after this list)
For Attribution:
Good fit ensures decomposition sums to actual KPI
Poor fit means attribution residuals are large
For ROI Calculation:
Coefficient accuracy depends on model fit
Low R² suggests ROI estimates are imprecise
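For the forecasting item above, a minimal sketch of an approximate interval built from RMSE, assuming roughly normal, constant-variance errors:

```python
def rmse_interval(predicted: float, rmse: float, z: float = 1.96):
    """Approximate 95% interval: predicted ± z * RMSE."""
    return predicted - z * rmse, predicted + z * rmse

low, high = rmse_interval(predicted=50_000, rmse=2_500)
print(f"Forecast: 50,000 (95% interval: {low:,.0f} to {high:,.0f})")
```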
Related Diagnostics
After reviewing Actual vs Predicted:
Check Residual Normality to see if errors are well-behaved
Review Autocorrelation if time series shows patterns
Examine Influential Points to see which periods fit poorly
Check R² in Model Builder for overall model statistics