Adding/Removing Variables
Master the Core Model Building Operations
Overview
Adding and removing variables is the heart of model building. Every marketing mix model (MMM) starts empty and grows through systematic variable selection. Knowing how to add and remove variables efficiently is essential for building high-quality models.
Key Principle: Add variables that improve explanatory power while maintaining statistical validity. Remove variables that are non-significant or cause issues.
Adding Variables to Your Model
Step-by-Step Process
Step 1: Select Variables to Add
In the Available Variables panel (right side):
Browse or search for variables
Check boxes next to variables you want to add
Multiple selection allowed
Selected count shown at bottom
Step 2: Configure Adstock (Optional)
For media variables with carryover effects:
Set adstock rate (0-100%) for each selected variable
Default is 0% (no carryover)
Common values:
TV: 60-80%
Radio: 50-70%
Digital: 20-40%
Search/Email: 0-10%
Step 3: Click "Add Variables" Button
Button located at bottom of Available Variables panel
Step 4: Review Preview Dialog
MixModeler shows a comparison table:
Old Model columns:
Variables currently in model
Their current coefficients
Their current t-statistics
New Model columns:
All variables (existing + new)
New coefficients after addition
New t-statistics
Change columns:
Coefficient change %
T-stat change
New rows:
Show statistics for newly added variables
Coefficient, t-stat, p-value
Step 5: Evaluate Preview
Look for:
✅ New variables are significant:
P-value < 0.05
Absolute t-stat > 2.0
Correct sign (usually positive for marketing spend)
✅ Existing coefficients stable:
Changes < 10%
Signs remain same
T-stats remain significant
✅ R² improvement:
Should increase noticeably with meaningful additions (plain R² never falls when a variable is added, so also check the adjusted R² change)
Even small improvement (0.01-0.02) can be valuable
⚠️ Warning signs:
Existing coefficients change > 30%
Signs flip unexpectedly
T-stats drop below 2.0 in absolute value
New variables not significant (p > 0.10)
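The checks in this step can be sketched as a small function. This is only an illustration, not MixModeler's own logic: the function name `evaluate_addition` and the dictionary layout are assumptions made for the example, and the thresholds are the warning values listed above.

```python
def evaluate_addition(old, new, added, warn_change_pct=30.0, t_min=2.0, p_max=0.10):
    """Return warning strings for a proposed addition.

    old / new: dict of variable name -> {"coef": ..., "t": ..., "p": ...}
    added:     names of the variables being added (hypothetical layout)
    """
    warnings = []
    for name, before in old.items():
        after = new[name]
        # Existing coefficient moved by more than the warning threshold
        if before["coef"] != 0:
            change = abs(after["coef"] - before["coef"]) / abs(before["coef"]) * 100
            if change > warn_change_pct:
                warnings.append(f"{name}: coefficient changed {change:.0f}%")
        # Sign flipped between the old and new model
        if before["coef"] * after["coef"] < 0:
            warnings.append(f"{name}: sign flipped")
        # Variable was significant before and is not any more
        if abs(before["t"]) >= t_min and abs(after["t"]) < t_min:
            warnings.append(f"{name}: t-stat dropped below {t_min}")
    for name in added:
        # Newly added variable is not significant
        if new[name]["p"] > p_max:
            warnings.append(f"{name}: not significant (p = {new[name]['p']:.2f})")
    return warnings


old_model = {"TV": {"coef": 2.0, "t": 4.1}}
new_model = {
    "TV": {"coef": 1.2, "t": 1.8},
    "Radio": {"coef": 0.5, "t": 1.1, "p": 0.27},
}
warnings = evaluate_addition(old_model, new_model, added=["Radio"])
for line in warnings:
    print(line)
```

An empty list means none of the warning signs were triggered; it does not replace checking that the signs make business sense.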
Step 6: Confirm or Cancel
If preview looks good:
Click "Add Variables" or "Confirm"
Variables permanently added
Model refits automatically
Results update in interface
If preview shows issues:
Click "Cancel"
No changes made
Try different variable combination
Investigate cause of issues
Adding Variables One at a Time
Recommended for:
First few variables in new model
Testing hypothesis about specific variable
When unsure about variable relevance
Avoiding multicollinearity
Process:
Select ONE variable
Add and review impact
If good, keep and move to next
If bad, remove and try different variable
Benefits:
Clear attribution of R² improvement
Easy to identify problematic variables
Build confidence in model structure
Adding Multiple Variables Together
Appropriate for:
Related variables (seasonal dummies)
Variables you're confident about
Later stages of model building
Variables from same category
Caution:
Harder to isolate individual effects
If issues arise, need to test variables separately
More likely to introduce multicollinearity
Example: Adding seasonal dummies
Select all 11 month dummies at once (December is the reference month)
Makes sense as a group
All should be added together
Interpret relative to reference month
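To make the reference-month idea concrete, here is a minimal sketch of how 11 month dummies could be built from a list of month numbers. The function name `month_dummies` and the column names (`month_01` and so on) are hypothetical and are not taken from MixModeler.

```python
def month_dummies(months, reference=12):
    """Build 0/1 dummy columns for every month except the reference month.

    months: list of month numbers (1-12), one per observation.
    Returns a dict of column name -> list of 0/1 values.
    """
    columns = {}
    for m in range(1, 13):
        if m == reference:
            continue  # the reference month gets no column
        columns[f"month_{m:02d}"] = [1 if obs == m else 0 for obs in months]
    return columns


observed_months = [11, 12, 1, 2]
dummies = month_dummies(observed_months)
print(len(dummies))          # 11 columns, December omitted
print(dummies["month_01"])   # [0, 0, 1, 0]
```

A December observation is all zeros across the 11 columns, which is why each dummy coefficient reads as the difference from December.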
Adstock Configuration During Addition
What adstock is: The carryover effect in which marketing impact persists beyond the exposure period
When to apply:
TV advertising (strong carryover)
Radio advertising (moderate carryover)
Brand campaigns (lasting awareness)
Any media with delayed/persistent effects
When NOT to apply:
Search (immediate effect)
Email (immediate effect)
Price variables (contemporaneous)
External factors (weather, events)
How to set rate:
0% = No carryover (immediate effect only)
50% = Half of effect carries to next period
70% = Strong carryover (typical TV)
90% = Very persistent (rare, use cautiously)
Variable naming with adstock:
Original: TV_Spend
With adstock: TV_Spend_adstock_70 (for 70% rate)
Both variables available:
Original variable still exists in data
Adstock version created as new variable
You choose which to include in model
Don't include both in the same model (the two versions are highly correlated, which causes severe multicollinearity)
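As a rough illustration of what an adstock rate does, the sketch below applies a geometric carryover, where each period keeps the given percentage of the previous period's adstocked value. The source does not state which formula MixModeler uses, so the geometric form is an assumption, and `apply_adstock` and `adstock_name` are hypothetical helper names. Only the naming pattern comes from the text above.

```python
def apply_adstock(series, rate_pct):
    """Geometric adstock: adstocked[t] = x[t] + rate * adstocked[t-1]."""
    if not 0 <= rate_pct <= 100:
        raise ValueError("rate_pct must be between 0 and 100")
    rate = rate_pct / 100.0
    adstocked = []
    carry = 0.0
    for x in series:
        carry = x + rate * carry  # this period's spend plus decayed carryover
        adstocked.append(carry)
    return adstocked


def adstock_name(variable, rate_pct):
    """Name the transformed variable following the pattern shown above."""
    return f"{variable}_adstock_{rate_pct}"


tv_spend = [100.0, 0.0, 0.0, 0.0]
print(adstock_name("TV_Spend", 50), apply_adstock(tv_spend, 50))
# TV_Spend_adstock_50 [100.0, 50.0, 25.0, 12.5]
```

With a 0% rate the output equals the input, which matches the "no carryover" default.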
Fixed Coefficients (OLS Only)
Purpose: Manually specify coefficient value instead of estimating
Use cases:
Sensitivity analysis (what if TV coefficient was X?)
Incorporating external constraints
Testing specific hypotheses
Advanced modeling techniques
How to set:
Select variable to add
Change "Coefficient Type" from Floating to Fixed
Enter desired coefficient value
Add variable
Effect:
Variable included in model
Coefficient NOT estimated by regression
Fixed at your specified value
Other coefficients estimated given fixed value
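One common way to estimate the remaining coefficients given a fixed one is to subtract the fixed term from the dependent variable and regress the result on the floating variables. Whether MixModeler does it this way is an assumption; the sketch below handles just one floating variable plus an intercept, and `fit_with_fixed` is a hypothetical name.

```python
def fit_with_fixed(y, x_float, x_fixed, fixed_coef):
    """Estimate intercept and one floating slope while x_fixed's coefficient is held constant."""
    # Move the fixed term to the left-hand side: y - fixed_coef * x_fixed
    y_adj = [yi - fixed_coef * xf for yi, xf in zip(y, x_fixed)]
    n = len(y_adj)
    mean_x = sum(x_float) / n
    mean_y = sum(y_adj) / n
    # Ordinary least squares for a single regressor with intercept
    sxx = sum((x - mean_x) ** 2 for x in x_float)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(x_float, y_adj))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data built as sales = 10 + 2 * radio + 0.5 * tv
radio = [1.0, 2.0, 3.0, 4.0]
tv = [10.0, 0.0, 5.0, 20.0]
sales = [17.0, 14.0, 18.5, 28.0]

intercept, radio_coef = fit_with_fixed(sales, radio, tv, fixed_coef=0.5)
print(intercept, radio_coef)  # 10.0 2.0
```

If the fixed value is far from what the data supports, the floating coefficients absorb the difference, which is one way a fixed coefficient can produce an unrealistic model.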
Caution:
Advanced feature
Can produce unrealistic models
Use only if you have strong rationale
Most users should leave as "Floating"
Removing Variables from Your Model
When to Remove Variables
Statistical reasons:
❌ Not significant: P-value > 0.10
❌ Wrong sign: Negative coefficient for marketing spend
❌ High VIF: > 10 indicates multicollinearity
❌ Unstable: Coefficient changes dramatically across specifications
Business reasons:
❌ No longer relevant: Campaign ended, variable obsolete
❌ Data quality issues: Unreliable measurement
❌ Redundant: Better variable available
❌ Over-complex: Simplifying model
Step-by-Step Removal Process
Step 1: Select Variables to Remove
In Current Model Variables panel (left side):
Check boxes next to variables you want to remove
Multiple selection allowed
Cannot remove constant term
Step 2: Click "Remove Variables" Button
Button located at bottom of Current Model Variables panel
Step 3: Review Preview Dialog
Shows model performance without removed variables:
Columns:
Variable names
Old coefficients (with removed variables)
New coefficients (without removed variables)
Coefficient changes %
T-stat changes
Rows for removed variables:
Grayed out or marked
Will no longer be in model
Rows for remaining variables:
Show updated statistics
May change when variables removed
Step 4: Evaluate Preview
Good removal if:
✅ R² decreases by less than 2 points (0.02)
✅ Remaining coefficients stable
✅ Remaining t-stats remain significant
✅ Model simpler, easier to interpret
Problematic removal if:
⚠️ R² drops by more than 5 points (0.05)
⚠️ Remaining coefficients change dramatically
⚠️ Signs flip for remaining variables
⚠️ Model fit unacceptable
Step 5: Confirm or Cancel
If preview acceptable:
Click "Remove" or "Confirm"
Variables permanently removed
Model refits without them
Results update
If too much impact:
Click "Cancel"
Variables remain in model
Reconsider removal strategy
Removing Multiple Variables
Safe scenarios:
All non-significant variables together
All variables from same category (if all problematic)
Variables you're confident are harmful
Risky scenarios:
Removing many variables at once
Uncertain which ones are truly problematic
Complex interdependencies
Better approach: Remove one at a time, especially if:
Some variables marginally significant
Uncertain about interaction effects
Want to understand individual impact
Strategic Removal for Model Improvement
Scenario 1: Simplifying overfitted model
Problem: Model has 30 variables, many non-significant
Solution:
Sort by p-value (highest first)
Remove variables with p > 0.10, one at a time
After each removal, check if others become significant
Stop when all remaining variables significant
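The loop in Scenario 1 can be sketched as follows. The refit itself is left to a caller-supplied function, because the source does not describe a programmatic API. `backward_eliminate`, `fit_pvalues`, and the lookup table of p-values are all hypothetical, and the p-values are made up for the example.

```python
def backward_eliminate(variables, fit_pvalues, p_max=0.10):
    """Drop the least significant variable one at a time, refitting after each removal.

    fit_pvalues: callable taking a list of variable names and returning
                 a dict of name -> p-value for a model with those variables.
    """
    kept = list(variables)
    removed = []
    while kept:
        pvalues = fit_pvalues(kept)                 # refit on the current set
        worst = max(kept, key=lambda v: pvalues[v])  # highest p-value first
        if pvalues[worst] <= p_max:
            break                                    # everything left is significant
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


def lookup_fit(vars_in_model):
    """Stand-in for a real refit: returns made-up p-values per variable set."""
    table = {
        frozenset({"TV", "Radio", "Print"}): {"TV": 0.01, "Radio": 0.12, "Print": 0.40},
        frozenset({"TV", "Radio"}): {"TV": 0.01, "Radio": 0.04},
    }
    return table[frozenset(vars_in_model)]


kept, removed = backward_eliminate(["TV", "Radio", "Print"], lookup_fit)
print(kept, removed)  # ['TV', 'Radio'] ['Print']
```

In this example Radio starts above 0.10 but becomes significant once Print is removed, which is why the procedure refits after every single removal instead of dropping everything above the threshold at once.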
Scenario 2: Resolving multicollinearity
Problem: High VIF (>10) for several variables
Solution:
Identify correlated variables (VIF testing)
Choose one to keep (most theoretically important)
Remove others
Check if VIF drops below 10
If not, remove additional correlated variables
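For intuition about the VIF threshold: the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other predictors. With only two predictors, that R² is the squared correlation between them, which the sketch below uses. It is a simplified two-variable case, not a replacement for the VIF Analysis in the tool, and the function names and data are illustrative.

```python
def pearson_r(a, b):
    """Pearson correlation between two equally long numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_predictors(a, b):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r2 = pearson_r(a, b) ** 2
    if r2 >= 1.0:
        return float("inf")  # perfectly collinear
    return 1.0 / (1.0 - r2)


tv = [1.0, 2.0, 3.0, 4.0, 5.0]
online_video = [1.1, 2.0, 2.9, 4.2, 5.0]   # moves almost in lockstep with tv
promo = [2.0, 1.0, 3.0, 1.0, 2.0]          # unrelated to tv

print(vif_two_predictors(tv, online_video) > 10)  # True
print(round(vif_two_predictors(tv, promo), 2))    # 1.0
```

A VIF of 10 corresponds to an R² of 0.90 against the other predictors, so the threshold flags variables whose variation is at least 90% shared.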
Scenario 3: Wrong coefficient signs
Problem: Variable has unexpected sign (e.g., negative for marketing spend)
Solution:
Remove variable with wrong sign
Re-test it in a model without the other correlated variables and check whether the sign corrects
If sign remains wrong, variable genuinely doesn't fit
Investigate data quality or specification issues
Preview System Details
Why Preview Matters
Prevents mistakes:
See impact before committing
Avoid accidentally breaking model
Understand changes before they're permanent
Enables informed decisions:
Compare options
Weigh tradeoffs (significance vs R²)
Make data-driven choices
Saves time:
Don't need to undo changes
Test hypotheses risk-free
Learn from previews without commitment
What Preview Shows
For Additions:
New variables section:
Coefficient estimate
Standard error
T-statistic
P-value
95% confidence interval
Existing variables section:
Old coefficient → New coefficient
Change percentage
Old t-stat → New t-stat
Impact of addition
Model statistics:
Old R² → New R²
R² improvement
Adjusted R² change
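Because plain R² cannot fall when a variable is added to an OLS model, the adjusted R² change is often the more telling figure. The standard formula is adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1), with n observations and k predictors excluding the intercept. The numbers below are invented for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R² for a model with an intercept and n_predictors variables."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Two years of weekly data; a sixth variable lifts R² from 0.800 to 0.802
old_adj = adjusted_r2(0.800, n_obs=104, n_predictors=5)
new_adj = adjusted_r2(0.802, n_obs=104, n_predictors=6)
print(new_adj < old_adj)  # True: R² rose, yet adjusted R² fell
```

When adjusted R² falls like this, the extra variable did not explain enough to justify the added parameter.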
For Removals:
Removed variables section:
Final statistics before removal
Marked as "Will be removed"
Remaining variables section:
Old coefficients → New coefficients
Changes from removal
New significance levels
Model statistics:
R² before → R² after
Impact of removal
Interpreting Preview Results
Coefficient changes:
< 5%: Very stable, good
5-10%: Minor change, acceptable
10-30%: Moderate change, investigate
> 30%: Large change, concerning
T-stat changes:
Increase: Variable became more reliable (good)
Decrease slightly: Minor impact (acceptable)
Drop below 2.0: Lost significance (concerning)
Sign flip: Serious structural issue (investigate)
R² changes:
Increase 0.01-0.05: Meaningful improvement
Increase > 0.05: Substantial improvement
Decrease < 0.02: Acceptable loss
Decrease > 0.05: Significant information loss
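The coefficient-change bands above translate directly into a lookup. This is a sketch with a hypothetical function name; how values exactly on a boundary (5%, 10%, 30%) are classified is a choice made here, since the source does not specify it.

```python
def coefficient_change_band(old_coef, new_coef):
    """Classify the percentage change of a coefficient using the bands above."""
    if old_coef == 0:
        raise ValueError("percentage change is undefined for a zero coefficient")
    change_pct = abs(new_coef - old_coef) / abs(old_coef) * 100
    if change_pct < 5:
        return "very stable"
    if change_pct <= 10:
        return "minor change, acceptable"
    if change_pct <= 30:
        return "moderate change, investigate"
    return "large change, concerning"


print(coefficient_change_band(2.0, 2.04))  # very stable
print(coefficient_change_band(2.0, 2.5))   # moderate change, investigate
print(coefficient_change_band(2.0, 1.0))   # large change, concerning
```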
Best Practices
Building New Models
Phase 1: Foundation (R² 40-60%)
Add trend/seasonality if needed
Add strongest 1-2 marketing channels
Verify signs and significance
Establish baseline
Phase 2: Core Marketing (R² 60-75%)
Add remaining marketing channels
One at a time initially
Configure adstock for each
Remove any non-significant
Phase 3: Controls (R² 70-85%)
Add price, promotions
Add external factors
Add competitive variables
Refine and remove weak variables
Phase 4: Polish (R² 75-90%)
Apply transformations (curves)
Test interactions if relevant
Final removal of marginally significant
Validate diagnostics
Variable Selection Principles
Parsimony: Prefer simpler models
Simpler models:
Easier to interpret
More stable
Better for optimization
Lower risk of overfitting
Occam's Razor: Don't add complexity without improvement
Only add variables if:
Significantly improve R²
Statistically significant
Make business sense
Theoretical grounding: Include variables for reasons
Good reasons:
Marketing theory (media affects sales)
Business knowledge (price affects demand)
Domain expertise (seasonality matters)
Bad reasons:
"It increased R² by 0.001"
"Let's try everything"
"More variables = better"
Common Mistakes to Avoid
❌ Adding all variables at once
Can't isolate effects
Hard to debug issues
Likely multicollinearity
✅ Instead: Add systematically, test impact
❌ Keeping non-significant variables
Reduces statistical power
Inflates standard errors
Overcomplicates model
✅ Instead: Remove p > 0.10 unless strong business reason
❌ Ignoring coefficient signs
Marketing spend shouldn't reduce sales
Price increase shouldn't increase sales
Violates logic
✅ Instead: Remove variables with wrong signs, investigate cause
❌ Removing variables to maximize R²
R² isn't everything
Can remove important controls
Leads to biased estimates
✅ Instead: Balance R², significance, and interpretation
Troubleshooting
Preview shows dramatic coefficient changes
Cause: Multicollinearity between new and existing variables
Solution:
Check correlation between new and existing variables
Use Variable Testing → VIF Analysis
Choose one variable to keep, remove others
Consider creating weighted/combined variable
Can't add variable - already exists
Cause: Variable name already in model
Solution:
Check if adstock version exists (TV_Spend vs TV_Spend_adstock_70)
Different adstock rates create different variables
Remove existing version before adding new one
Model fit fails after addition
Cause: Perfect multicollinearity or singular matrix
Solution:
Remove last added variable
Check if new variable is exact linear combination of existing
Ensure you're not including both original and transformed versions
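To check outside the tool whether a variable is an exact linear combination of the others, one option is to compare the rank of the data matrix with the number of columns. The sketch below does this with plain Gaussian elimination. The function names, the tolerance, and the sample data are assumptions for illustration.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) via Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in rows]
    if not m:
        return 0
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the row with the largest entry in this column as pivot
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # column adds no new direction
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def has_perfect_collinearity(columns):
    """columns: dict of name -> values, including the constant column."""
    rows = list(zip(*columns.values()))
    return matrix_rank(rows) < len(columns)


columns = {
    "const": [1, 1, 1, 1],
    "TV": [1, 2, 3, 4],
    "Digital": [2, 1, 0, 3],
    "Total_Media": [3, 3, 3, 7],   # exactly TV + Digital
}
print(has_perfect_collinearity(columns))  # True
```

Dropping `Total_Media` (or one of its components) restores full rank, which mirrors the "remove last added variable" step above.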
R² doesn't change after addition
Cause: Variable is redundant (nearly all of its variation is already explained by existing variables)
Solution:
Variable adds no information
Remove it
Find different variable to test
Key Takeaways
Add variables systematically to understand individual impact
Preview changes before committing to avoid mistakes
Remove non-significant variables (p > 0.10) to maintain model quality
Configure adstock for media variables during addition
Watch for coefficient stability - changes > 30% indicate issues
Build iteratively: core → marketing → controls → refinement
Simpler models are better - only add variables that meaningfully improve model
Use preview system to test hypotheses risk-free before committing