Adding/Removing Variables

Master the Core Model Building Operations

Overview

Adding and removing variables is the heart of model building. Every MMM (marketing mix model) starts with an empty model and grows through systematic variable selection. Knowing how to add and remove variables efficiently is essential for building high-quality models.

Key Principle: Add variables that improve explanatory power while maintaining statistical validity. Remove variables that are non-significant or cause issues.

Adding Variables to Your Model

Step-by-Step Process

Step 1: Select Variables to Add

In the Available Variables panel (right side):

  1. Browse or search for variables

  2. Check boxes next to variables you want to add

  3. Multiple selection allowed

  4. Selected count shown at bottom

Step 2: Configure Adstock (Optional)

For media variables with carryover effects:

  1. Set adstock rate (0-100%) for each selected variable

  2. Default is 0% (no carryover)

  3. Common values:

    • TV: 60-80%

    • Radio: 50-70%

    • Digital: 20-40%

    • Search/Email: 0-10%

Step 3: Click "Add Variables" Button

Button located at bottom of Available Variables panel

Step 4: Review Preview Dialog

MixModeler shows a comparison table:

Old Model columns:

  • Variables currently in model

  • Their current coefficients

  • Their current t-statistics

New Model columns:

  • All variables (existing + new)

  • New coefficients after addition

  • New t-statistics

Change columns:

  • Coefficient change %

  • T-stat change

New rows:

  • Show statistics for newly added variables

  • Coefficient, t-stat, p-value
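
To make the columns of this table concrete, here is a minimal sketch of how a comparable old-versus-new comparison could be computed outside the tool with ordinary least squares. It is not MixModeler's implementation: the helper names `fit_ols` and `preview_addition`, the variable names, and the synthetic data are all assumptions made for illustration, and p-values are left out to keep the example dependent on NumPy only.

```python
import numpy as np

def fit_ols(X, y):
    """Return (coefficients, t-statistics) for y ~ X; X must include a constant column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)           # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # coefficient covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))

def preview_addition(X_old, x_new, y, names):
    """Compare the current model with the model after adding one column."""
    b_old, t_old = fit_ols(X_old, y)
    b_new, t_new = fit_ols(np.column_stack([X_old, x_new]), y)
    rows = []
    for i, name in enumerate(names):
        change_pct = 100.0 * (b_new[i] - b_old[i]) / abs(b_old[i])
        rows.append((name, b_old[i], b_new[i], change_pct, t_new[i] - t_old[i]))
    return rows, (b_new[-1], t_new[-1])

# Synthetic weekly data, for illustration only
rng = np.random.default_rng(0)
n = 104
tv = rng.uniform(0, 100, n)
radio = rng.uniform(0, 50, n)
sales = 200 + 1.5 * tv + 0.8 * radio + rng.normal(0, 10, n)

X_old = np.column_stack([np.ones(n), tv])      # current model: constant + TV
rows, new_var = preview_addition(X_old, radio, sales, ["const", "TV_Spend"])
for name, old, new, pct, dt in rows:
    print(f"{name:10s} old={old:8.3f} new={new:8.3f} change={pct:6.1f}% t-change={dt:6.2f}")
print(f"Radio_Spend coef={new_var[0]:.3f} t={new_var[1]:.2f}")
```

Each printed row corresponds to one existing variable (old coefficient, new coefficient, change %, t-stat change), and the final line plays the role of the "new rows" entry for the added variable.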

Step 5: Evaluate Preview

Look for:

New variables are significant:

  • P-value < 0.05

  • Absolute t-stat > 2.0 (a significant negative coefficient, such as price, has a t-stat below -2.0)

  • Correct sign (positive for marketing spend usually)

Existing coefficients stable:

  • Changes < 10%

  • Signs remain same

  • T-stats remain significant

R² improvement:

  • Should increase noticeably with meaningful additions (under OLS, R² never falls when a variable is added, so also check adjusted R²)

  • Even small improvement (0.01-0.02) can be valuable

⚠️ Warning signs:

  • Existing coefficients change > 30%

  • Signs flip unexpectedly

  • Existing t-stats drop below 2.0 in absolute value

  • New variables not significant (p > 0.10)

Step 6: Confirm or Cancel

If preview looks good:

  • Click "Add Variables" or "Confirm"

  • Variables permanently added

  • Model refits automatically

  • Results update in interface

If preview shows issues:

  • Click "Cancel"

  • No changes made

  • Try different variable combination

  • Investigate cause of issues

Adding Variables One at a Time

Recommended for:

  • First few variables in new model

  • Testing hypothesis about specific variable

  • When unsure about variable relevance

  • Avoiding multicollinearity

Process:

  1. Select ONE variable

  2. Add and review impact

  3. If good, keep and move to next

  4. If bad, remove and try different variable

Benefits:

  • Clear attribution of R² improvement

  • Easy to identify problematic variables

  • Build confidence in model structure

Adding Multiple Variables Together

Appropriate for:

  • Related variables (seasonal dummies)

  • Variables you're confident about

  • Later stages of model building

  • Variables from same category

Caution:

  • Harder to isolate individual effects

  • If issues arise, need to test variables separately

  • More likely to introduce multicollinearity

Example: Adding seasonal dummies

Select all 11 month dummies at once (December is the reference month)

  • Makes sense as a group

  • All should be added together

  • Interpret relative to reference month
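
As an illustration of what the 11 month dummies look like, the sketch below builds them from a list of month numbers with December left out as the reference. The `Month_Jan` style of column name and the `month_dummies` helper are hypothetical; the names in your own dataset may differ.

```python
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_dummies(months, reference=12):
    """Build one 0/1 column per month, skipping the reference month."""
    columns = {}
    for m in range(1, 13):
        if m == reference:
            continue                              # reference month gets no column
        columns[f"Month_{MONTH_NAMES[m - 1]}"] = [1 if x == m else 0 for x in months]
    return columns

observed_months = [11, 12, 1, 2]                  # Nov, Dec, Jan, Feb
dummies = month_dummies(observed_months)
print(len(dummies))                               # number of dummy columns
print(dummies["Month_Jan"])                       # 1 only in the January row
```

A December observation has zeros in every column, so each dummy coefficient reads as the difference from December.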

Adstock Configuration During Addition

What is adstock: Carryover effect where marketing impact persists beyond exposure period

When to apply:

  • TV advertising (strong carryover)

  • Radio advertising (moderate carryover)

  • Brand campaigns (lasting awareness)

  • Any media with delayed/persistent effects

When NOT to apply:

  • Search (immediate effect)

  • Email (immediate effect)

  • Price variables (contemporaneous)

  • External factors (weather, events)

How to set rate:

  • 0% = No carryover (immediate effect only)

  • 50% = Half of effect carries to next period

  • 70% = Strong carryover (typical TV)

  • 90% = Very persistent (rare, use cautiously)

Variable naming with adstock:

  • Original: TV_Spend

  • With adstock: TV_Spend_adstock_70 (for 70% rate)

Both variables available:

  • Original variable still exists in data

  • Adstock version created as new variable

  • You choose which to include in model

  • Don't include both (the two versions are highly correlated, which causes severe multicollinearity)
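
The carryover idea can be sketched as a geometric adstock, where each period keeps the stated percentage of the previous period's adstocked value. This recursion is an assumption about how the rate is applied; the exact formula MixModeler uses is not stated here. The helpers `apply_adstock` and `adstock_name` are hypothetical, with the name pattern taken from the TV_Spend_adstock_70 example.

```python
def apply_adstock(spend, rate_pct):
    """Geometric adstock: adstocked[t] = spend[t] + rate * adstocked[t-1]."""
    rate = rate_pct / 100.0
    carried = 0.0
    adstocked = []
    for x in spend:
        carried = x + rate * carried              # current spend plus decayed carryover
        adstocked.append(carried)
    return adstocked

def adstock_name(variable, rate_pct):
    """Name the derived variable after the original and its rate."""
    return f"{variable}_adstock_{rate_pct}"

tv_spend = [100.0, 0.0, 0.0, 0.0]                 # one burst of spend, then nothing
print(adstock_name("TV_Spend", 70), apply_adstock(tv_spend, 70))
print(adstock_name("TV_Spend", 0), apply_adstock(tv_spend, 0))
```

With a 70% rate, a single burst of 100 decays to roughly 70, 49, and 34.3 over the following periods; with 0% the series is unchanged.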

Fixed Coefficients (OLS Only)

Purpose: Manually specify coefficient value instead of estimating

Use cases:

  • Sensitivity analysis (what if TV coefficient was X?)

  • Incorporating external constraints

  • Testing specific hypotheses

  • Advanced modeling techniques

How to set:

  1. Select variable to add

  2. Change "Coefficient Type" from Floating to Fixed

  3. Enter desired coefficient value

  4. Add variable

Effect:

  • Variable included in model

  • Coefficient NOT estimated by regression

  • Fixed at your specified value

  • Other coefficients estimated given fixed value
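
One common way to estimate the other coefficients given a fixed value is to subtract the fixed variable's contribution from the dependent variable and regress the remainder on the free variables. The sketch below assumes that approach; whether MixModeler does exactly this internally is not confirmed here, and `fit_with_fixed` and the synthetic data are illustrative only.

```python
import numpy as np

def fit_with_fixed(X_free, x_fixed, fixed_coef, y):
    """Estimate the free coefficients with one coefficient held at a chosen value."""
    y_adjusted = y - fixed_coef * x_fixed         # remove the fixed variable's contribution
    beta_free, *_ = np.linalg.lstsq(X_free, y_adjusted, rcond=None)
    return beta_free

# Synthetic weekly data, for illustration only
rng = np.random.default_rng(1)
n = 104
tv = rng.uniform(0, 100, n)
radio = rng.uniform(0, 50, n)
sales = 200 + 1.5 * tv + 0.8 * radio + rng.normal(0, 10, n)

X_free = np.column_stack([np.ones(n), radio])     # constant and radio stay floating
beta = fit_with_fixed(X_free, tv, 1.5, sales)     # TV coefficient held at 1.5
print(beta)                                       # [constant, radio] estimates
```

Rerunning with a different fixed value (for example 1.0 instead of 1.5) shows how the floating coefficients shift, which is the basis of the sensitivity analysis use case.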

Caution:

  • Advanced feature

  • Can produce unrealistic models

  • Use only if you have strong rationale

  • Most users should leave as "Floating"

Removing Variables from Your Model

When to Remove Variables

Statistical reasons:

❌ Not significant: P-value > 0.10

❌ Wrong sign: Negative coefficient for marketing spend

❌ High VIF: > 10 indicates multicollinearity

❌ Unstable: Coefficient changes dramatically across specifications

Business reasons:

❌ No longer relevant: Campaign ended, variable obsolete

❌ Data quality issues: Unreliable measurement

❌ Redundant: Better variable available

❌ Over-complex: Simplifying model

Step-by-Step Removal Process

Step 1: Select Variables to Remove

In Current Model Variables panel (left side):

  1. Check boxes next to variables you want to remove

  2. Multiple selection allowed

  3. Cannot remove constant term

Step 2: Click "Remove Variables" Button

Button located at bottom of Current Model Variables panel

Step 3: Review Preview Dialog

Shows model performance without removed variables:

Columns:

  • Variable names

  • Old coefficients (with removed variables)

  • New coefficients (without removed variables)

  • Coefficient changes %

  • T-stat changes

Rows for removed variables:

  • Grayed out or marked

  • Will no longer be in model

Rows for remaining variables:

  • Show updated statistics

  • May change when variables removed

Step 4: Evaluate Preview

Good removal if:

✅ R² decreases by less than 2 percentage points (0.02)

✅ Remaining coefficients stable

✅ Remaining t-stats remain significant

✅ Model simpler, easier to interpret

Problematic removal if:

⚠️ R² drops by more than 5 percentage points (0.05)

⚠️ Remaining coefficients change dramatically

⚠️ Signs flip for remaining variables

⚠️ Model fit unacceptable

Step 5: Confirm or Cancel

If preview acceptable:

  • Click "Remove" or "Confirm"

  • Variables permanently removed

  • Model refits without them

  • Results update

If too much impact:

  • Click "Cancel"

  • Variables remain in model

  • Reconsider removal strategy

Removing Multiple Variables

Safe scenarios:

  • All non-significant variables together

  • All variables from same category (if all problematic)

  • Variables you're confident are harmful

Risky scenarios:

  • Removing many variables at once

  • Uncertain which ones are truly problematic

  • Complex interdependencies

Better approach: Remove one at a time, especially if:

  • Some variables marginally significant

  • Uncertain about interaction effects

  • Want to understand individual impact

Strategic Removal for Model Improvement

Scenario 1: Simplifying overfitted model

Problem: Model has 30 variables, many non-significant

Solution:

  1. Sort by p-value (highest first)

  2. Remove variables with p > 0.10, one at a time

  3. After each removal, check if others become significant

  4. Stop when all remaining variables significant
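
The loop above can be sketched as a simple backward elimination. To stay NumPy-only, this version uses the absolute t-stat instead of the p-value: within one fit, the variable with the highest p-value is the one with the smallest absolute t-stat, and 1.645 is used as a large-sample approximation of p = 0.10. The helper names, variable names, and synthetic data are assumptions for illustration.

```python
import numpy as np

def t_stats(X, y):
    """t-statistics for an OLS fit of y on X (X includes the constant column)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    cov = (resid @ resid / (n - k)) * np.linalg.inv(X.T @ X)
    return beta / np.sqrt(np.diag(cov))

def backward_eliminate(X, y, names, t_threshold=1.645):
    """Drop the weakest variable one at a time; column 0 (constant) is never removed."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        t = np.abs(t_stats(X[:, keep], y))        # refit after every removal
        weakest = 1 + int(np.argmin(t[1:]))       # skip the constant at position 0
        if t[weakest] >= t_threshold:
            break                                 # all remaining variables pass
        del keep[weakest]
    return [names[i] for i in keep]

# Synthetic data: two real drivers plus two pure-noise columns
rng = np.random.default_rng(2)
n = 104
tv = rng.uniform(0, 100, n)
radio = rng.uniform(0, 50, n)
noise_a = rng.normal(0, 1, n)
noise_b = rng.normal(0, 1, n)
sales = 200 + 1.5 * tv + 0.8 * radio + rng.normal(0, 10, n)

X = np.column_stack([np.ones(n), tv, radio, noise_a, noise_b])
names = ["const", "TV_Spend", "Radio_Spend", "Noise_A", "Noise_B"]
kept = backward_eliminate(X, sales, names)
print(kept)
```

Because the model is refitted after each removal, a variable that looked weak at first can pass the threshold later, which matches step 3.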

Scenario 2: Resolving multicollinearity

Problem: High VIF (>10) for several variables

Solution:

  1. Identify correlated variables (VIF testing)

  2. Choose one to keep (most theoretically important)

  3. Remove others

  4. Check if VIF drops below 10

  5. If not, remove additional correlated variables
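
For reference, VIF for a variable is 1 / (1 - R²), where R² comes from regressing that variable on all the other predictors. The sketch below computes it directly with NumPy on synthetic data; the `vif` helper and the deliberately near-duplicate column `tv_copy` are illustrative assumptions, not part of the tool.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (predictors only, without the constant column)."""
    n, k = X.shape
    values = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        values.append(1.0 / (1.0 - r2))
    return values

# Synthetic data: tv_copy nearly duplicates tv, radio is unrelated
rng = np.random.default_rng(3)
n = 104
tv = rng.uniform(0, 100, n)
tv_copy = tv + rng.normal(0, 2, n)
radio = rng.uniform(0, 50, n)

values = vif(np.column_stack([tv, tv_copy, radio]))
for name, v in zip(["TV_Spend", "TV_Copy", "Radio_Spend"], values):
    print(f"{name:12s} VIF={v:8.1f}")
```

In this setup the two TV columns show VIF far above 10 while radio stays near 1; dropping either TV column and recomputing brings the remaining values back down.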

Scenario 3: Wrong coefficient signs

Problem: Variable has unexpected sign (e.g., negative for marketing spend)

Solution:

  1. Remove variable with wrong sign

  2. Re-test it without the variables it is correlated with and check whether the sign corrects

  3. If sign remains wrong, variable genuinely doesn't fit

  4. Investigate data quality or specification issues

Preview System Details

Why Preview Matters

Prevents mistakes:

  • See impact before committing

  • Avoid accidentally breaking model

  • Understand changes before they're permanent

Enables informed decisions:

  • Compare options

  • Weigh tradeoffs (significance vs R²)

  • Make data-driven choices

Saves time:

  • Don't need to undo changes

  • Test hypotheses risk-free

  • Learn from previews without commitment

What Preview Shows

For Additions:

New variables section:

  • Coefficient estimate

  • Standard error

  • T-statistic

  • P-value

  • 95% confidence interval

Existing variables section:

  • Old coefficient → New coefficient

  • Change percentage

  • Old t-stat → New t-stat

  • Impact of addition

Model statistics:

  • Old R² → New R²

  • R² improvement

  • Adjusted R² change

For Removals:

Removed variables section:

  • Final statistics before removal

  • Marked as "Will be removed"

Remaining variables section:

  • Old coefficients → New coefficients

  • Changes from removal

  • New significance levels

Model statistics:

  • R² before → R² after

  • Impact of removal

Interpreting Preview Results

Coefficient changes:

  • < 5%: Very stable, good

  • 5-10%: Minor change, acceptable

  • 10-30%: Moderate change, investigate

  • > 30%: Large change, concerning

T-stat changes:

  • Increase: Variable became more reliable (good)

  • Decrease slightly: Minor impact (acceptable)

  • Drop below 2.0 in absolute value: Lost significance (concerning)

  • Sign flip: Serious structural issue (investigate)

R² changes:

  • Increase 0.01-0.05: Meaningful improvement

  • Increase > 0.05: Substantial improvement

  • Decrease < 0.02: Acceptable loss

  • Decrease > 0.05: Significant information loss
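
The coefficient and t-stat bands above can be written as a small rule set, which is handy when reviewing many variables at once. This is a sketch: the function names are hypothetical, and how to treat values that fall exactly on a boundary (for example exactly 10%) is an assumption, since the bands do not say.

```python
def classify_coefficient_change(change_pct):
    """Map a coefficient change (in percent) to the stability bands."""
    size = abs(change_pct)
    if size < 5:
        return "very stable"
    if size <= 10:
        return "minor, acceptable"
    if size <= 30:
        return "moderate, investigate"
    return "large, concerning"

def classify_t_stat(old_t, new_t, threshold=2.0):
    """Check sign flips and lost significance before smaller movements."""
    if old_t * new_t < 0:
        return "sign flip, investigate"
    if abs(old_t) >= threshold and abs(new_t) < threshold:
        return "lost significance, concerning"
    if abs(new_t) >= abs(old_t):
        return "more reliable"
    return "slight decrease, acceptable"

print(classify_coefficient_change(-12.5))   # moderate, investigate
print(classify_t_stat(3.1, 1.4))            # lost significance, concerning
```

Checking the sign flip first reflects that it is the most serious of the four t-stat outcomes.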

Best Practices

Building New Models

Phase 1: Foundation (R² 40-60%)

  • Add trend/seasonality if needed

  • Add strongest 1-2 marketing channels

  • Verify signs and significance

  • Establish baseline

Phase 2: Core Marketing (R² 60-75%)

  • Add remaining marketing channels

  • One at a time initially

  • Configure adstock for each

  • Remove any non-significant variables

Phase 3: Controls (R² 70-85%)

  • Add price, promotions

  • Add external factors

  • Add competitive variables

  • Refine and remove weak variables

Phase 4: Polish (R² 75-90%)

  • Apply transformations (curves)

  • Test interactions if relevant

  • Final removal of marginally significant variables

  • Validate diagnostics

Variable Selection Principles

Parsimony: Prefer simpler models

Simpler models:

  • Easier to interpret

  • More stable

  • Better for optimization

  • Lower risk of overfitting

Occam's Razor: Don't add complexity without improvement

Only add variables if:

  • Significantly improve R²

  • Statistically significant

  • Make business sense

Theoretical grounding: Include each variable for a defensible reason

Good reasons:

  • Marketing theory (media affects sales)

  • Business knowledge (price affects demand)

  • Domain expertise (seasonality matters)

Bad reasons:

  • "It increased R² by 0.001"

  • "Let's try everything"

  • "More variables = better"

Common Mistakes to Avoid

Adding all variables at once

  • Can't isolate effects

  • Hard to debug issues

  • Likely multicollinearity

Instead: Add systematically, test impact

Keeping non-significant variables

  • Reduces statistical power

  • Inflates standard errors

  • Overcomplicates model

Instead: Remove p > 0.10 unless strong business reason

Ignoring coefficient signs

  • Marketing spend shouldn't reduce sales

  • Price increase shouldn't increase sales

  • Violates logic

Instead: Remove variables with wrong signs, investigate cause

Removing variables to maximize R²

  • R² isn't everything

  • Can remove important controls

  • Leads to biased estimates

Instead: Balance R², significance, and interpretation

Troubleshooting

Preview shows dramatic coefficient changes

Cause: Multicollinearity between new and existing variables

Solution:

  1. Check correlation between new and existing variables

  2. Use Variable Testing → VIF Analysis

  3. Choose one variable to keep, remove others

  4. Consider creating weighted/combined variable

Can't add variable - already exists

Cause: Variable name already in model

Solution:

  • Check if adstock version exists (TV_Spend vs TV_Spend_adstock_70)

  • Different adstock rates create different variables

  • Remove existing version before adding new one

Model fit fails after addition

Cause: Perfect multicollinearity or singular matrix

Solution:

  • Remove last added variable

  • Check if new variable is exact linear combination of existing

  • Ensure you're not including both original and transformed versions
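
A quick way to check for an exact linear combination outside the tool is to compare the rank of the design matrix with its number of columns. The sketch below uses NumPy's `matrix_rank` on a classic case: a constant plus all 12 month dummies, where the dummies sum to the constant. The helper name and the example data are assumptions for illustration.

```python
import numpy as np

def has_perfect_collinearity(X):
    """True if at least one column is an exact linear combination of the others."""
    return bool(np.linalg.matrix_rank(X) < X.shape[1])

months = np.arange(24) % 12                                  # two years of monthly data
dummies = (months[:, None] == np.arange(12)).astype(float)   # one column per month
const = np.ones(len(months))

X_all_months = np.column_stack([const, dummies])             # constant + all 12 dummies
X_with_reference = np.column_stack([const, dummies[:, :11]]) # December left out

print(has_perfect_collinearity(X_all_months))
print(has_perfect_collinearity(X_with_reference))
```

The first matrix is rank-deficient because the 12 dummies add up to the constant column; leaving one month out as the reference restores full rank.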

R² doesn't change after addition

Cause: Variable is redundant or perfectly explained by existing variables

Solution:

  • Variable adds no information

  • Remove it

  • Find different variable to test

Key Takeaways

  • Add variables systematically to understand individual impact

  • Preview changes before committing to avoid mistakes

  • Remove non-significant variables (p > 0.10) to maintain model quality

  • Configure adstock for media variables during addition

  • Watch for coefficient stability - changes > 30% indicate issues

  • Build iteratively: core → marketing → controls → refinement

  • Simpler models are better - only add variables that meaningfully improve model

  • Use preview system to test hypotheses risk-free before committing
