Adding/Removing Variables

Master the Core Model Building Operations

Overview

Adding and removing variables is the heart of model building. Every marketing mix model (MMM) starts empty and grows through systematic variable selection. Knowing how to add and remove variables efficiently is essential for building high-quality models.

Key Principle: Add variables that improve explanatory power while maintaining statistical validity. Remove variables that are non-significant or cause issues.

Adding Variables to Your Model

Step-by-Step Process

Step 1: Select Variables to Add

In the Available Variables panel (right side):

Browse or search for variables
Check boxes next to variables you want to add
Multiple selection allowed
Selected count shown at bottom

Step 2: Configure Adstock (Optional)

For media variables with carryover effects:

Set adstock rate (0-100%) for each selected variable
Default is 0% (no carryover)
Common values:
- TV: 60-80%
- Radio: 50-70%
- Digital: 20-40%
- Search/Email: 0-10%

Step 3: Click "Add Variables" Button

Button located at bottom of Available Variables panel

Step 4: Review Preview Dialog

MixModeler shows a comparison table:

Old Model columns:

Variables currently in model
Their current coefficients
Their current t-statistics

New Model columns:

All variables (existing + new)
New coefficients after addition
New t-statistics

Change columns:

Coefficient change %
T-stat change

New rows:

Show statistics for newly added variables
Coefficient, t-stat, p-value

Step 5: Evaluate Preview

Look for:

✅ New variables are significant:

P-value < 0.05
Absolute t-stat > 2.0
Expected sign (usually positive for marketing spend)

✅ Existing coefficients stable:

Changes < 10%
Signs remain same
T-stats remain significant

✅ R² improvement:

Should increase noticeably with meaningful additions (plain R² never falls when a variable is added, so check adjusted R² as well)
Even a small improvement (0.01-0.02) can be valuable

⚠️ Warning signs:

Existing coefficients change > 30%
Signs flip unexpectedly
T-stats of existing variables drop below 2.0 in absolute value
New variables not significant (p > 0.10)
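
The comparison behind the preview can be approximated outside the tool. The sketch below is not MixModeler's implementation: it assumes plain OLS with an intercept, uses NumPy, and the helper names `fit_ols` and `preview_addition`, the variable names, and the synthetic data are all invented for illustration.

```python
import numpy as np


def fit_ols(X, y, names):
    """Fit OLS with an intercept; return coefficients, t-stats, R² and adjusted R²."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, k = A.shape
    sigma2 = resid @ resid / (n - k)                     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    labels = ["const"] + list(names)
    return {
        "coef": dict(zip(labels, beta)),
        "t": dict(zip(labels, beta / se)),
        "r2": r2,
        "adj_r2": adj_r2,
    }


def preview_addition(old, new):
    """Build rows of (name, old coef, new coef, change %, new t-stat)."""
    rows = []
    for name, new_coef in new["coef"].items():
        if name in old["coef"]:
            old_coef = old["coef"][name]
            change_pct = 100 * (new_coef - old_coef) / abs(old_coef)
            rows.append((name, old_coef, new_coef, change_pct, new["t"][name]))
        else:
            # Newly added variable: no old coefficient to compare against.
            rows.append((name, None, new_coef, None, new["t"][name]))
    return rows


# Synthetic weekly data (illustrative only).
rng = np.random.default_rng(0)
n = 104
tv = rng.uniform(0, 100, n)
radio = rng.uniform(0, 50, n)
sales = 100 + 2.0 * tv + 1.5 * radio + rng.normal(0, 5, n)

old_fit = fit_ols(np.column_stack([tv]), sales, ["TV_Spend"])
new_fit = fit_ols(np.column_stack([tv, radio]), sales, ["TV_Spend", "Radio_Spend"])

for name, old_coef, new_coef, change_pct, t_stat in preview_addition(old_fit, new_fit):
    old_txt = "new" if old_coef is None else f"{old_coef:.3f}"
    chg_txt = "" if change_pct is None else f"{change_pct:+.1f}%"
    print(f"{name:12s} {old_txt:>10s} {new_coef:10.3f} {chg_txt:>8s} t={t_stat:.1f}")
print(f"R2: {old_fit['r2']:.3f} -> {new_fit['r2']:.3f}")
print(f"Adjusted R2: {old_fit['adj_r2']:.3f} -> {new_fit['adj_r2']:.3f}")
```

Reading the output against the checklist above: the new row should have an absolute t-stat above 2.0, and the existing rows should show small percentage changes.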

Step 6: Confirm or Cancel

If preview looks good:

Click "Add Variables" or "Confirm"
Variables permanently added
Model refits automatically
Results update in interface

If preview shows issues:

Click "Cancel"
No changes made
Try different variable combination
Investigate cause of issues

Adding Variables One at a Time

Recommended for:

First few variables in new model
Testing hypothesis about specific variable
When unsure about variable relevance
Catching multicollinearity as soon as it appears

Process:

Select ONE variable
Add and review impact
If good, keep and move to next
If bad, remove and try different variable

Benefits:

Clear attribution of R² improvement
Easy to identify problematic variables
Build confidence in model structure

Adding Multiple Variables Together

Appropriate for:

Related variables (seasonal dummies)
Variables you're confident about
Later stages of model building
Variables from same category

Caution:

Harder to isolate individual effects
If issues arise, need to test variables separately
More likely to introduce multicollinearity

Example: Adding seasonal dummies

Select all 11 month dummies at once (December is the reference month)

Makes sense as a group
All should be added together
Interpret relative to reference month
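
For readers who prepare dummies before upload, here is a minimal sketch of the 11-dummy setup with December as the reference. The column names (`Month_Jan`, ...) and the weekly date range are assumptions made for the example, not names MixModeler requires.

```python
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_dummies(dates, reference="Dec"):
    """Return {column_name: [0/1, ...]} for every month except the reference."""
    columns = {}
    for idx, label in enumerate(MONTHS, start=1):
        if label == reference:
            continue  # the reference month is absorbed by the constant term
        columns[f"Month_{label}"] = [1 if d.month == idx else 0 for d in dates]
    return columns


# 52 weekly observations starting on Monday 2 January 2023 (illustrative).
weeks = [date(2023, 1, 2) + timedelta(weeks=i) for i in range(52)]
dummies = month_dummies(weeks)

print(len(dummies))            # number of dummy columns
print(sorted(dummies)[:3])     # a few of the generated names
```

Every December week has zeros in all 11 columns, so each dummy coefficient reads as the difference from December.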

Adstock Configuration During Addition

What is adstock: Carryover effect where marketing impact persists beyond exposure period

When to apply:

TV advertising (strong carryover)
Radio advertising (moderate carryover)
Brand campaigns (lasting awareness)
Any media with delayed/persistent effects

When NOT to apply:

Search (immediate effect)
Email (immediate effect)
Price variables (contemporaneous)
External factors (weather, events)

How to set rate:

0% = No carryover (immediate effect only)
50% = Half of effect carries to next period
70% = Strong carryover (typical TV)
90% = Very persistent (rare, use cautiously)
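
To make the rates concrete, the sketch below applies a simple geometric adstock, where each period keeps a fixed share of the previous period's adstocked value. This is a common textbook formulation and an assumption here; the exact transformation MixModeler applies (for example, whether it normalizes the result) is not stated on this page. The function name `geometric_adstock` is hypothetical.

```python
def geometric_adstock(spend, rate):
    """Geometric adstock: adstocked[t] = spend[t] + rate * adstocked[t-1].

    `rate` is a fraction, so a 70% setting corresponds to rate=0.70.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    adstocked = []
    carry = 0.0
    for x in spend:
        carry = x + rate * carry   # this period's spend plus the carried-over share
        adstocked.append(carry)
    return adstocked


# A single burst of spend followed by three dark weeks.
tv_spend = [100.0, 0.0, 0.0, 0.0]
print(geometric_adstock(tv_spend, 0.0))   # [100.0, 0.0, 0.0, 0.0]
print(geometric_adstock(tv_spend, 0.5))   # [100.0, 50.0, 25.0, 12.5]
```

At 0% the series is unchanged; at 50% half of each period's effect carries into the next, which matches the descriptions above.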

Variable naming with adstock:

Original: TV_Spend
With adstock: TV_Spend_adstock_70 (for 70% rate)

Both variables available:

Original variable still exists in data
Adstock version created as new variable
You choose which to include in model
Don't include both (the two versions are highly correlated, which causes severe multicollinearity)
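
If you track variable lists in your own scripts, a small check like the one below can flag a model that contains more than one version of the same variable. It relies only on the `_adstock_<rate>` naming pattern shown above; the helper names are hypothetical and nothing here calls MixModeler itself.

```python
import re

# Matches names such as "TV_Spend_adstock_70" and captures the base name.
ADSTOCK_SUFFIX = re.compile(r"^(?P<base>.+)_adstock_(?P<rate>\d{1,3})$")


def adstock_name(base, rate_pct):
    """Build the adstock variable name for a base variable and a rate in percent."""
    return f"{base}_adstock_{rate_pct}"


def base_variable(name):
    """Strip the adstock suffix, if any, to recover the original variable name."""
    match = ADSTOCK_SUFFIX.match(name)
    return match.group("base") if match else name


def conflicting_versions(model_variables):
    """Return base variables that appear in the model in more than one version."""
    groups = {}
    for name in model_variables:
        groups.setdefault(base_variable(name), []).append(name)
    return {base: names for base, names in groups.items() if len(names) > 1}


model = ["TV_Spend", adstock_name("TV_Spend", 70), "Price"]
print(conflicting_versions(model))
# {'TV_Spend': ['TV_Spend', 'TV_Spend_adstock_70']}
```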

Fixed Coefficients (OLS Only)

Purpose: Manually specify coefficient value instead of estimating

Use cases:

Sensitivity analysis (what if TV coefficient was X?)
Incorporating external constraints
Testing specific hypotheses
Advanced modeling techniques

How to set:

Select variable to add
Change "Coefficient Type" from Floating to Fixed
Enter desired coefficient value
Add variable

Effect:

Variable included in model
Coefficient NOT estimated by regression
Fixed at your specified value
Other coefficients estimated given fixed value
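
One standard way to estimate the remaining coefficients with one coefficient held fixed is to subtract the fixed contribution from the dependent variable and run OLS on the rest. The sketch below illustrates that idea with NumPy; it is an assumption about the general technique, not a description of how MixModeler implements the feature, and `fit_with_fixed` and the data are invented.

```python
import numpy as np


def fit_with_fixed(X_free, x_fixed, fixed_coef, y):
    """Estimate the intercept and free coefficients with one coefficient held fixed."""
    y_adjusted = y - fixed_coef * x_fixed              # remove the fixed contribution
    A = np.column_stack([np.ones(len(y)), X_free])     # intercept + free variables
    beta, *_ = np.linalg.lstsq(A, y_adjusted, rcond=None)
    return beta                                        # [intercept, free coefficients...]


# Synthetic data: sales driven by TV (true coefficient 2.0) and price (true -3.0).
rng = np.random.default_rng(1)
n = 100
tv = rng.uniform(0, 100, n)
price = rng.uniform(5, 15, n)
sales = 50 + 2.0 * tv - 3.0 * price + rng.normal(0, 2, n)

# Sensitivity check: hold the TV coefficient at 2.0 and estimate the rest.
beta = fit_with_fixed(np.column_stack([price]), tv, 2.0, sales)
print(f"intercept={beta[0]:.2f}, price coefficient={beta[1]:.2f}")
```

Rerunning with a different fixed value shows how far the other coefficients move, which is the sensitivity analysis described above.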

Caution:

Advanced feature
Can produce unrealistic models
Use only if you have strong rationale
Most users should leave as "Floating"

Removing Variables from Your Model

When to Remove Variables

Statistical reasons:

❌ Not significant: P-value > 0.10
❌ Wrong sign: Negative coefficient for marketing spend
❌ High VIF: > 10 indicates multicollinearity
❌ Unstable: Coefficient changes dramatically across specifications

Business reasons:

❌ No longer relevant: Campaign ended, variable obsolete
❌ Data quality issues: Unreliable measurement
❌ Redundant: Better variable available
❌ Over-complex: Simplifying model

Step-by-Step Removal Process

Step 1: Select Variables to Remove

In Current Model Variables panel (left side):

Check boxes next to variables you want to remove
Multiple selection allowed
Cannot remove constant term

Step 2: Click "Remove Variables" Button

Button located at bottom of Current Model Variables panel

Step 3: Review Preview Dialog

Shows model performance without removed variables:

Columns:

Variable names
Old coefficients (with removed variables)
New coefficients (without removed variables)
Coefficient changes %
T-stat changes

Rows for removed variables:

Grayed out or marked
Will no longer be in model

Rows for remaining variables:

Show updated statistics
May change when variables removed

Step 4: Evaluate Preview

Good removal if:

✅ R² decreases by less than 0.02 (2 points)
✅ Remaining coefficients stable
✅ Remaining t-stats remain significant
✅ Model simpler, easier to interpret

Problematic removal if:

⚠️ R² drops by more than 0.05 (5 points)
⚠️ Remaining coefficients change dramatically
⚠️ Signs flip for remaining variables
⚠️ Model fit unacceptable

Step 5: Confirm or Cancel

If preview acceptable:

Click "Remove" or "Confirm"
Variables permanently removed
Model refits without them
Results update

If too much impact:

Click "Cancel"
Variables remain in model
Reconsider removal strategy

Removing Multiple Variables

Safe scenarios:

All non-significant variables together
All variables from same category (if all problematic)
Variables you're confident are harmful

Risky scenarios:

Removing many variables at once
Uncertain which ones are truly problematic
Complex interdependencies

Better approach: Remove one at a time, especially if:

Some variables marginally significant
Uncertain about interaction effects
Want to understand individual impact

Strategic Removal for Model Improvement

Scenario 1: Simplifying overfitted model

Problem: Model has 30 variables, many non-significant

Solution:

Sort by p-value (highest first)
Remove variables with p > 0.10, one at a time
After each removal, check if others become significant
Stop when all remaining variables significant
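
The loop in Scenario 1 is a form of backward elimination. The sketch below reproduces it with NumPy under two stated assumptions: p-values use a large-sample normal approximation via `math.erfc` rather than the t distribution a statistics package would use, and all names and data are invented for the example.

```python
import math

import numpy as np


def ols_p_values(X, y):
    """Approximate two-sided p-values for each non-constant coefficient."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    t_stats = beta / se
    # Normal approximation to the two-sided p-value; skip the constant term.
    return [math.erfc(abs(t) / math.sqrt(2)) for t in t_stats[1:]]


def backward_eliminate(columns, y, threshold=0.10):
    """Drop the highest-p variable, refit, and repeat until all p <= threshold."""
    kept = list(columns)
    removed = []
    while kept:
        X = np.column_stack([columns[name] for name in kept])
        p_values = ols_p_values(X, y)
        worst = int(np.argmax(p_values))
        if p_values[worst] <= threshold:
            break                      # every remaining variable is significant
        removed.append(kept.pop(worst))
    return kept, removed


rng = np.random.default_rng(2)
n = 120
columns = {
    "TV_Spend": rng.uniform(0, 100, n),
    "Radio_Spend": rng.uniform(0, 50, n),
    "Unrelated_A": rng.normal(0, 1, n),   # has no true effect on sales
    "Unrelated_B": rng.normal(0, 1, n),   # has no true effect on sales
}
sales = 20 + 1.5 * columns["TV_Spend"] + 0.8 * columns["Radio_Spend"] + rng.normal(0, 5, n)

kept, removed = backward_eliminate(columns, sales)
print("kept:", kept)
print("removed:", removed)
```

Refitting after every removal matters because a variable that looked weak can become significant once a correlated neighbour is gone.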

Scenario 2: Resolving multicollinearity

Problem: High VIF (>10) for several variables

Solution:

Identify correlated variables (VIF testing)
Choose one to keep (most theoretically important)
Remove others
Check if VIF drops below 10
If not, remove additional correlated variables
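
VIF for a variable is 1 / (1 - R²), where R² comes from regressing that variable on all the other explanatory variables. The sketch below computes it with NumPy on synthetic data and then repeats the check after dropping one of two correlated variables; the function name `vif` and the variable names are illustrative.

```python
import numpy as np


def vif(X):
    """Return the VIF of each column of X: 1 / (1 - R²) against the other columns."""
    n, k = X.shape
    values = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        values.append(1.0 / (1.0 - r2))
    return values


rng = np.random.default_rng(3)
n = 150
tv = rng.uniform(0, 100, n)
tv_online = tv + rng.normal(0, 5, n)     # nearly a copy of tv, so highly correlated
price = rng.uniform(5, 15, n)

names = ["TV_Spend", "TV_Online", "Price"]
before = vif(np.column_stack([tv, tv_online, price]))
for name, value in zip(names, before):
    print(f"{name}: VIF={value:.1f}")

# Keep the more important of the correlated pair and re-check.
after = vif(np.column_stack([tv, price]))
print("after removing TV_Online:", [round(v, 2) for v in after])
```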

Scenario 3: Wrong coefficient signs

Problem: Variable has unexpected sign (e.g., negative for marketing spend)

Solution:

Remove variable with wrong sign
Re-test it without the variables it is correlated with and check whether the sign corrects
If sign remains wrong, variable genuinely doesn't fit
Investigate data quality or specification issues

Preview System Details

Why Preview Matters

Prevents mistakes:

See impact before committing
Avoid accidentally breaking model
Understand changes before they're permanent

Enables informed decisions:

Compare options
Weigh tradeoffs (significance vs R²)
Make data-driven choices

Saves time:

Don't need to undo changes
Test hypotheses risk-free
Learn from previews without commitment

What Preview Shows

For Additions:

New variables section:

Coefficient estimate
Standard error
T-statistic
P-value
95% confidence interval

Existing variables section:

Old coefficient → New coefficient
Change percentage
Old t-stat → New t-stat
Impact of addition

Model statistics:

Old R² → New R²
R² improvement
Adjusted R² change

For Removals:

Removed variables section:

Final statistics before removal
Marked as "Will be removed"

Remaining variables section:

Old coefficients → New coefficients
Changes from removal
New significance levels

Model statistics:

R² before → R² after
Impact of removal

Interpreting Preview Results

Coefficient changes:

< 5%: Very stable, good
5-10%: Minor change, acceptable
10-30%: Moderate change, investigate
> 30%: Large change, concerning

T-stat changes:

Increase: Variable became more reliable (good)
Decrease slightly: Minor impact (acceptable)
Drop below 2.0 in absolute value: Lost significance (concerning)
Sign flip: Serious structural issue (investigate)

R² changes:

Increase 0.01-0.05: Meaningful improvement
Increase > 0.05: Substantial improvement
Decrease < 0.02: Acceptable loss
Decrease > 0.05: Significant information loss
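
The bands above translate directly into a few comparison rules. The sketch below encodes them as plain functions; the function names are hypothetical, and the labels for ranges the guide does not cover (an R² gain under 0.01, or a loss between 0.02 and 0.05) are placeholders marked as such.

```python
def classify_coefficient_change(change_pct):
    """Map a coefficient change in percent to the guide's stability bands."""
    magnitude = abs(change_pct)
    if magnitude < 5:
        return "very stable"
    if magnitude <= 10:
        return "minor change, acceptable"
    if magnitude <= 30:
        return "moderate change, investigate"
    return "large change, concerning"


def classify_r2_change(delta):
    """Map an absolute R² change (new minus old) to the guide's bands."""
    if delta > 0.05:
        return "substantial improvement"
    if delta >= 0.01:
        return "meaningful improvement"
    if delta >= 0:
        return "below documented bands"       # placeholder: not defined in the guide
    loss = -delta
    if loss < 0.02:
        return "acceptable loss"
    if loss > 0.05:
        return "significant information loss"
    return "between documented bands"         # placeholder: not defined in the guide


print(classify_coefficient_change(-35))   # large change, concerning
print(classify_coefficient_change(3))     # very stable
print(classify_r2_change(0.03))           # meaningful improvement
print(classify_r2_change(-0.06))          # significant information loss
```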

Best Practices

Building New Models

Phase 1: Foundation (R² 40-60%)

Add trend/seasonality if needed
Add strongest 1-2 marketing channels
Verify signs and significance
Establish baseline

Phase 2: Core Marketing (R² 60-75%)

Add remaining marketing channels
One at a time initially
Configure adstock for each
Remove any non-significant

Phase 3: Controls (R² 70-85%)

Add price, promotions
Add external factors
Add competitive variables
Refine and remove weak variables

Phase 4: Polish (R² 75-90%)

Apply transformations (curves)
Test interactions if relevant
Final removal of marginally significant
Validate diagnostics

Variable Selection Principles

Parsimony: Prefer simpler models

Simpler models:

Easier to interpret
More stable
Better for optimization
Lower risk of overfitting

Occam's Razor: Don't add complexity without improvement

Only add variables if:

Significantly improve R²
Statistically significant
Make business sense

Theoretical grounding: Include each variable for a reason you can state

Good reasons:

Marketing theory (media affects sales)
Business knowledge (price affects demand)
Domain expertise (seasonality matters)

Bad reasons:

"It increased R² by 0.001"
"Let's try everything"
"More variables = better"

Common Mistakes to Avoid

❌ Adding all variables at once

Can't isolate effects
Hard to debug issues
Likely multicollinearity

✅ Instead: Add systematically, test impact

❌ Keeping non-significant variables

Reduces statistical power
Inflates standard errors
Overcomplicates model

✅ Instead: Remove p > 0.10 unless strong business reason

❌ Ignoring coefficient signs

Marketing spend shouldn't reduce sales
Price increase shouldn't increase sales
Violates logic

✅ Instead: Remove variables with wrong signs, investigate cause

❌ Adding or removing variables just to maximize R²

R² isn't everything
Can remove important controls
Leads to biased estimates

✅ Instead: Balance R², significance, and interpretation

Troubleshooting

Preview shows dramatic coefficient changes

Cause: Multicollinearity between new and existing variables

Solution:

Check correlation between new and existing variables
Use Variable Testing → VIF Analysis
Choose one variable to keep, remove others
Consider creating weighted/combined variable

Can't add variable - already exists

Cause: Variable name already in model

Solution:

Check if adstock version exists (TV_Spend vs TV_Spend_adstock_70)
Different adstock rates create different variables
Remove existing version before adding new one

Model fit fails after addition

Cause: Perfect multicollinearity or singular matrix

Solution:

Remove last added variable
Check if new variable is exact linear combination of existing
Ensure you're not including both original and transformed versions
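
Perfect multicollinearity can be detected before fitting by checking the rank of the design matrix. The sketch below uses NumPy's `matrix_rank` and the month-dummy case as an example: 12 dummies plus a constant are linearly dependent, while 11 are not. The helper name `is_rank_deficient` is an illustration, not a MixModeler function.

```python
import numpy as np


def is_rank_deficient(X):
    """True if the columns of X plus a constant are linearly dependent."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    return bool(np.linalg.matrix_rank(A) < A.shape[1])


# Two years of monthly observations, one dummy column per month.
months = np.arange(24) % 12
all_twelve = np.eye(12)[months]          # 24 x 12 matrix of 0/1 dummies

print(is_rank_deficient(all_twelve))           # True: the 12 dummies sum to the constant
print(is_rank_deficient(all_twelve[:, :11]))   # False: December dropped as reference
```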

R² doesn't change after addition

Cause: Variable is redundant (almost fully explained by variables already in the model) or unrelated to the dependent variable

Solution:

Variable adds no information
Remove it
Find different variable to test

Key Takeaways

Add variables systematically to understand individual impact
Preview changes before committing to avoid mistakes
Remove non-significant variables (p > 0.10) to maintain model quality
Configure adstock for media variables during addition
Watch for coefficient stability - changes > 30% indicate issues
Build iteratively: core → marketing → controls → refinement
Simpler models are better - only add variables that meaningfully improve model
Use preview system to test hypotheses risk-free before committing


Last updated 27 days ago