Actual vs Predicted
What Actual vs Predicted Analysis Shows
Actual vs Predicted analysis evaluates how well your model fits the data by comparing the observed values of your KPI with the values the model predicts.
Purpose: Evaluates overall model fit and prediction accuracy by comparing actual KPI values with model predictions.
Why Model Fit Matters
Good model fit indicates:
Reliable Predictions: The model accurately captures the relationship between marketing and outcomes
Complete Specification: Important drivers have been included
Correct Functional Form: Relationships are properly modeled (linear, saturation, adstock)
Trustworthy Attribution: Decomposition results will be credible
Poor fit suggests missing variables, wrong transformations, or fundamental model misspecification.
Key Performance Metrics
MixModeler calculates four primary metrics to assess fit:
R-Squared (R²)
Definition: Proportion of variance in the KPI explained by the model
Range: 0 to 1 (or 0% to 100%)
Interpretation:
| R² Range | Rating | Interpretation |
| --- | --- | --- |
| > 0.80 | Excellent | Model explains >80% of variation |
| 0.70 - 0.80 | Good | Acceptable for most business applications |
| 0.50 - 0.70 | Moderate | Room for improvement; use with caution |
| < 0.50 | Poor | Significant work needed |
Formula: R² = 1 - (SS_residual / SS_total)
Note: Higher R² is better, but very high R² (>0.95) may indicate overfitting
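To make the formula concrete, here is a minimal Python sketch of the same calculation (the `actual` and `predicted` arrays are hypothetical inputs for illustration, not a MixModeler API):

```python
import numpy as np

def r_squared(actual: np.ndarray, predicted: np.ndarray) -> float:
    """R² = 1 - SS_residual / SS_total."""
    ss_residual = np.sum((actual - predicted) ** 2)   # unexplained variation
    ss_total = np.sum((actual - actual.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_residual / ss_total
```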
Adjusted R-Squared
Definition: R² adjusted for the number of predictors in the model
Purpose: Penalizes models for including unnecessary variables
Advantage: Better for comparing models with different numbers of variables
Relationship: Always ≤ R²; the gap widens when many weak predictors are included
Use: Prefer adjusted R² when comparing alternative model specifications
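The standard adjustment depends only on R², the number of observations, and the number of predictors. A minimal sketch (argument names are illustrative):

```python
def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

For example, with 104 weekly observations and 8 predictors, an R² of 0.75 adjusts down to about 0.73.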
Root Mean Squared Error (RMSE)
Definition: Square root of the average squared prediction error
Formula: RMSE = √(Σ(actual - predicted)² / n)
Units: Same units as your KPI (e.g., dollars, units sold)
Interpretation:
- Lower values indicate better fit 
- Measures typical prediction error 
- Compare across models (lower is better) 
- Context-dependent (RMSE of 1000 is good if KPI averages 100,000) 
Relative RMSE: RMSE / mean(KPI) × 100 gives percentage error
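A minimal sketch of both RMSE and its relative form, assuming `actual` and `predicted` are array-like (illustrative only, not MixModeler's internals):

```python
import numpy as np

def rmse(actual, predicted) -> float:
    """Root mean squared error, in the KPI's own units."""
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)))

def relative_rmse(actual, predicted) -> float:
    """RMSE as a percentage of the mean KPI level."""
    return rmse(actual, predicted) / float(np.mean(actual)) * 100
```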
Mean Absolute Error (MAE)
Definition: Average absolute prediction error
Formula: MAE = Σ|actual - predicted| / n
Units: Same units as your KPI
Interpretation:
- Lower values indicate better fit 
- More robust to outliers than RMSE 
- Easier to interpret (average error magnitude) 
Comparison with RMSE:
- RMSE penalizes large errors more heavily 
- MAE treats all errors equally 
- If RMSE >> MAE, the model has some large errors 
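A companion sketch for MAE. For roughly normal errors the RMSE/MAE ratio is about 1.25, so values well above that point to a few large misses:

```python
import numpy as np

def mae(actual, predicted) -> float:
    """Mean absolute error: the average error magnitude in KPI units."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))

# Using the rmse helper from the previous sketch:
# if rmse(actual, predicted) / mae(actual, predicted) is much larger than ~1.25,
# a handful of periods carry disproportionately large errors.
```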
Visual Diagnostics
MixModeler provides two key visualizations:
Scatter Plot (Actual vs Predicted)
X-axis: Predicted values
Y-axis: Actual values
Reference line: 45-degree diagonal (perfect predictions)
Good fit:
- Points cluster tightly around diagonal line 
- No systematic deviation from line 
- Even spread across the range 
Poor fit:
- Points scattered far from diagonal 
- Systematic pattern (curved, clusters) 
- Wider spread at certain ranges 
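A minimal matplotlib sketch of this diagnostic, assuming `actual` and `predicted` arrays (MixModeler renders this chart for you; the code only shows what the plot contains):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_actual_vs_predicted(actual, predicted):
    """Scatter of actual vs predicted with a 45-degree reference line."""
    fig, ax = plt.subplots()
    ax.scatter(predicted, actual, alpha=0.6)
    lo = min(np.min(actual), np.min(predicted))
    hi = max(np.max(actual), np.max(predicted))
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray")  # perfect predictions
    ax.set_xlabel("Predicted")
    ax.set_ylabel("Actual")
    plt.show()
```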
Time Series Plot
X-axis: Time (observation index or date)
Y-axis: KPI values
Two lines: Actual (solid) and Predicted (dashed)
Good fit:
- Lines track closely throughout time period 
- Model captures peaks and troughs 
- No systematic over/under-prediction 
Poor fit:
- Large gaps between lines 
- Predicted line misses major movements 
- Consistent over or under-prediction 
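And the equivalent sketch for the time series view (again illustrative; `dates` is any date-like sequence):

```python
import matplotlib.pyplot as plt

def plot_fit_over_time(dates, actual, predicted):
    """Actual (solid) and predicted (dashed) KPI over the modeling window."""
    fig, ax = plt.subplots()
    ax.plot(dates, actual, label="Actual")                      # solid line
    ax.plot(dates, predicted, linestyle="--", label="Predicted")
    ax.set_xlabel("Time")
    ax.set_ylabel("KPI")
    ax.legend()
    plt.show()
```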
Interpreting Test Results
Strong Model Fit (✓)
Characteristics:
- R² > 0.70 
- Low RMSE relative to KPI scale 
- Scatter plot points close to diagonal 
- Time series lines track well 
Implications:
- Model captures the key relationships 
- Predictions are reliable 
- Attribution is trustworthy 
- Can proceed with optimization 
Action: Model is ready for business use
Moderate Model Fit (⚠)
Characteristics:
- R² between 0.50 and 0.70 
- Moderate RMSE 
- Some scatter around diagonal 
- Time series generally tracks but with gaps 
Implications:
- Model captures main effects but misses some variation 
- Predictions are reasonable but not precise 
- Attribution gives directional insights 
- Consider improvements before optimization 
Actions to improve:
- Add missing variables 
- Test different transformations 
- Refine adstock parameters 
- Include interaction terms 
Poor Model Fit (❌)
Characteristics:
- R² < 0.50 
- High RMSE 
- Scatter plot widely dispersed 
- Time series lines diverge 
Implications:
- Model misspecified or missing critical variables 
- Predictions unreliable 
- Attribution results questionable 
- Not suitable for business decisions 
Required actions:
- Add important omitted variables 
- Reconsider model structure 
- Check data quality 
- Review business understanding 
Common Patterns and Issues
Systematic Under-Prediction
Pattern: Predicted line consistently below actual
Possible causes:
- Missing positive driver variables 
- Saturation curves too aggressive 
- Baseline estimate too low 
Solution: Add variables, adjust curves, increase intercept
Systematic Over-Prediction
Pattern: Predicted line consistently above actual
Possible causes:
- Including spurious variables 
- Double-counting effects 
- Baseline estimate too high 
Solution: Remove weak variables, check for overlap, adjust baseline
Good Fit but Misses Peaks
Pattern: Model tracks trends but misses high/low extremes
Possible causes:
- Missing promotional or event variables 
- Insufficient saturation flexibility 
- Outlier periods 
Solution: Add event dummies, adjust curves, investigate extremes
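As a concrete example of adding event dummies, here is a hypothetical pandas sketch that flags known promotion weeks (dates and column names are invented for illustration):

```python
import pandas as pd

# Weekly modeling dataset with an indicator for assumed promotion weeks
df = pd.DataFrame({"week": pd.date_range("2024-01-07", periods=8, freq="W")})
promo_weeks = pd.to_datetime(["2024-01-14", "2024-02-11"])    # assumed promo dates
df["promo_dummy"] = df["week"].isin(promo_weeks).astype(int)  # 1 in promo weeks, else 0
```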
Seasonal Patterns in Errors
Pattern: Residuals show cyclical patterns
Possible causes:
- Missing seasonality variables 
- Autocorrelation issues 
- Quarterly or monthly effects not captured 
Solution: Add seasonal dummies, check the autocorrelation diagnostic
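One quick way to confirm a seasonal pattern before adding dummies is to average the residuals by calendar month. A minimal sketch (function and argument names are illustrative):

```python
import pandas as pd

def monthly_residual_bias(dates, residuals) -> pd.Series:
    """Mean residual per calendar month; values far from zero flag months
    the model systematically over- or under-predicts."""
    resid = pd.Series(residuals, index=pd.to_datetime(dates))
    return resid.groupby(resid.index.month).mean()
```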
Practical Guidelines
Acceptable R² Benchmarks by Industry:
Retail/E-commerce: 0.65-0.85 (high variability, many factors)
CPG/Brand: 0.70-0.90 (stable markets, clear drivers)
B2B/Services: 0.60-0.80 (longer sales cycles, more noise)
These are guidelines; context matters more than arbitrary thresholds
When Lower R² is Acceptable:
- Weekly data (more noise than monthly) 
- Many external factors beyond marketing control 
- New products or markets with limited history 
- Focus is on directional insights rather than precise prediction 
When Higher R² is Expected:
- Monthly or quarterly aggregation (smooths noise) 
- Stable mature markets 
- Strong marketing influence on KPI 
- Long time series with clear patterns 
Example Interpretation
Scenario 1 - Excellent Fit:
- R²: 0.82 
- Adjusted R²: 0.79 
- RMSE: 2,500 (KPI average 50,000, i.e. 5% relative error) 
- MAE: 1,800 
- Scatter plot: Tight clustering around diagonal 
Interpretation: Excellent model fit. The model explains 82% of KPI variation with low prediction error. The time series plot shows the model captures both trends and fluctuations. Ready for business use and optimization.
Scenario 2 - Good Fit:
- R²: 0.73 
- Adjusted R²: 0.70 
- RMSE: 4,200 (KPI average 50,000, i.e. 8.4% relative error) 
- MAE: 3,100 
Interpretation: Good model fit suitable for most business applications. The model captures major drivers but some variation remains unexplained. Consider adding seasonal variables or testing different saturation curves to improve further.
Scenario 3 - Needs Improvement:
- R²: 0.48 
- Adjusted R²: 0.42 
- RMSE: 9,500 (KPI average 50,000, i.e. 19% relative error) 
- MAE: 7,200 
- Scatter plot: Wide dispersion 
Interpretation: Poor model fit. Less than half of the variation is explained. Review the model specification, add important variables, and check data quality before using results for business decisions.
Relationship to Business Decisions
For Budget Allocation:
- Need R² > 0.70 for confident optimization 
- Lower R² means more uncertainty in ROI estimates 
For Forecasting:
- RMSE indicates expected forecast error 
- Use confidence intervals based on RMSE (see the sketch below) 
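A rough sketch of that idea, assuming approximately normal, constant-variance errors (a simplification; proper prediction intervals also account for parameter uncertainty):

```python
def forecast_interval(point_forecast: float, rmse: float, z: float = 1.96):
    """Approximate 95% bounds around a point forecast using RMSE."""
    return point_forecast - z * rmse, point_forecast + z * rmse

low, high = forecast_interval(50_000, 4_200)  # Scenario 2's RMSE from above
# low ≈ 41,768 and high ≈ 58,232
```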
For Attribution:
- Good fit ensures decomposition sums to actual KPI 
- Poor fit means attribution residuals are large 
For ROI Calculation:
- Coefficient accuracy depends on model fit 
- Low R² suggests ROI estimates are imprecise 
Related Diagnostics
After reviewing Actual vs Predicted:
- Check Residual Normality to see if errors are well-behaved 
- Review Autocorrelation if time series shows patterns 
- Examine Influential Points to see which periods fit poorly 
- Check R² in Model Builder for overall model statistics 