Date Filtering
Control Which Time Periods Are Included in Your Model
Overview
Date filtering allows you to include or exclude specific observations (weeks, months) from your model without deleting data. This is essential for seasonal analysis, campaign evaluation, outlier removal, and testing model stability across different time periods.
Purpose: Selectively include/exclude time periods from modeling while preserving complete dataset
Access: Model Library → Select one model → Click "Filter Model Period"
What is Date Filtering?
The Concept
Your dataset: 104 weeks of data (2 years)
Without filtering:
Model uses all 104 weeks
Estimates coefficients based on full period
R² calculated across all observations
With filtering:
Select which weeks to include (e.g., only Q4: 26 weeks)
Model estimates using only included weeks
R² calculated on filtered subset
Excluded data preserved, not deleted
Why Filter?
Seasonal analysis:
Model Q4 separately (holiday effects different)
Compare summer vs. winter coefficients
Understand seasonal variation in channel effectiveness
Campaign evaluation:
Include only campaign period
Exclude pre-launch baseline
Isolate campaign-specific effects
Outlier removal:
Exclude periods with unusual events
Remove weeks with data quality issues
Improve model stability
Model validation:
Train on first 80% of data
Test on last 20%
Assess out-of-sample performance
Data quality:
Exclude weeks with missing values
Remove periods before channels launched
Clean modeling dataset
How Date Filtering Works
Observation Mask
Mechanism: Binary mask (1 or 0) for each observation
1 = Include:
Observation used in model
Contributes to coefficient estimation
Included in R² calculation
0 = Exclude:
Observation NOT used in model
Skipped in estimation
Not in R² calculation
Data still exists in dataset
Example mask:
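A 104-week dataset where Week 12 is excluded as an outlier and every other week is included (illustrative):
Week:     1  2  3  ... 11 12 13 ... 104
Included: 1  1  1  ...  1  0  1  ...  1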
Data Preservation
Critical: Original data never deleted
What gets filtered:
model_data (used for regression fitting)
What stays intact:
original_data (complete dataset)
data (full dataset for charts, decomposition)
Benefit:
Can un-filter anytime
Full data available for visualization
Decomposition can use all periods
No data loss
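A minimal sketch of this mechanism using pandas and statsmodels; the file, column, and variable names here are illustrative assumptions, not MixModeler's internal API:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Illustrative names throughout -- not MixModeler's internal API.
data = pd.read_csv("weekly_data.csv")        # full dataset, never modified
mask = np.ones(len(data), dtype=int)         # 1 = include, 0 = exclude
mask[11] = 0                                 # exclude Week 12 (0-indexed row 11)

model_data = data[mask == 1]                 # filtered copy used only for fitting
X = sm.add_constant(model_data[["tv_spend", "digital_spend"]])
fit = sm.OLS(model_data["sales"], X).fit()
print(fit.rsquared)                          # R-squared from included observations only
```

Because the filter is just a row selection, dropping it restores the original estimation dataset with nothing lost.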
Using the Filter Interface
Step 1: Open Filter Dialog
In Model Library:
Select ONE model (checkbox)
Click "Filter Model Period" button
Filter panel opens below
Shows:
Model name
List of all observations
Current inclusion status (1 or 0)
Step 2: View Observations
Table displays:
Date/Observation: Time period identifier
KPI Value: Dependent variable value for that period
Included: Current status (1 or 0)
Checkbox: Select observation for batch changes
Information:
Total observations in dataset
Currently included count
Currently excluded count
Step 3: Modify Inclusion
Individual observation:
Click on "Included" value for specific observation
Toggles between 1 and 0
Change is reflected immediately in the interface
Multiple observations:
Check boxes for observations to change
Click "Include Selected" or "Exclude Selected" button
All selected observations change together
Select All:
Check "Select All" header checkbox
All observations selected
Apply Include or Exclude to all
Step 4: Apply Filter
Click "Apply Filter" button
What happens:
Observation mask saved to model
Model refits using only included observations
New R² calculated
Coefficients re-estimated
All statistics updated
Confirmation:
Success message appears
"X of Y observations included" displayed
Model statistics reflect filtered data
Step 5: View Results
Updated model shows:
New R² based on filtered data
New coefficients
Statistics from included observations only
Compare to original:
Clone model before filtering
Run both filtered and unfiltered versions
Compare coefficients and R²
Common Filtering Scenarios
Scenario 1: Seasonal Model (Q4 Only)
Goal: Understand holiday-period marketing effectiveness
Process:
Open filter for model
Identify Q4 weeks (Oct-Dec)
Exclude all other weeks (set to 0)
Include only Q4 weeks (set to 1)
Apply filter
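A quick way to identify such a mask outside the app, assuming a pandas DataFrame with a parsed "date" column (file and column names are illustrative):

```python
import pandas as pd

data = pd.read_csv("weekly_data.csv", parse_dates=["date"])  # illustrative file and column names
q4_mask = (data["date"].dt.quarter == 4).astype(int)         # 1 for Oct-Dec weeks, 0 otherwise
print(f"{q4_mask.sum()} of {len(q4_mask)} weeks included")   # roughly 26 of 104 over two years
```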
Result:
Model estimates Q4-specific coefficients
TV may have higher coefficient in Q4 (gift buying)
Digital may be more effective (online shopping surge)
Business use:
Optimize Q4 budget allocation
Understand seasonal channel mix
Plan holiday campaigns
Scenario 2: Campaign Evaluation
Goal: Measure incremental impact of 8-week campaign
Process:
Identify campaign weeks (e.g., Weeks 40-47)
Exclude pre-campaign baseline (Weeks 1-39)
Exclude post-campaign (Weeks 48-52)
Include only campaign period (Weeks 40-47)
Apply filter
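The same selection expressed as a mask (sketch; assumes an illustrative "week" column numbered 1-52):

```python
import pandas as pd

data = pd.read_csv("weekly_data.csv")                     # illustrative; assumes a "week" column
campaign_mask = data["week"].between(40, 47).astype(int)  # 1 only for Weeks 40-47
```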
Result:
Model shows campaign-period effects
Isolates campaign contribution
Removes seasonality and trends from other periods
Business use:
Calculate campaign ROI
Evaluate channel effectiveness during campaign
Justify campaign investment
Scenario 3: Outlier Removal
Goal: Exclude unusual weeks that skew results
Problem weeks:
Week 12: Major snowstorm, sales abnormally low
Week 35: System outage, data incomplete
Week 52: Holiday closure, partial week
Process:
Identify problem weeks
Set those weeks to 0 (exclude)
Keep all other weeks at 1 (include)
Apply filter
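As a mask, this is a scattered (non-continuous) exclusion (sketch; the "week" column and file name are illustrative assumptions):

```python
import pandas as pd

data = pd.read_csv("weekly_data.csv")                   # illustrative; assumes a "week" column
outlier_weeks = [12, 35, 52]                            # snowstorm, outage, holiday closure
mask = (~data["week"].isin(outlier_weeks)).astype(int)  # 0 for the problem weeks, 1 elsewhere
```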
Result:
More stable coefficient estimates
Higher R² (outliers removed)
Model represents "normal" periods
Business use:
Cleaner model for forecasting
Avoid outlier-driven insights
Robust optimization recommendations
Scenario 4: Train/Test Split
Goal: Validate model on hold-out data
Process:
Keep first 80% as training (Weeks 1-83)
Exclude last 20% as test set (Weeks 84-104)
Fit model on training data only
Later use model to predict test set
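A sketch of the full train/fit/evaluate cycle outside the app (column names are illustrative assumptions):

```python
import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("weekly_data.csv")              # 104 weekly rows; illustrative column names
train, test = data.iloc[:83], data.iloc[83:]       # Weeks 1-83 train, Weeks 84-104 test

X_train = sm.add_constant(train[["tv_spend", "digital_spend"]])
fit = sm.OLS(train["sales"], X_train).fit()        # the model never sees the test set

X_test = sm.add_constant(test[["tv_spend", "digital_spend"]])
pred = fit.predict(X_test)
mape = ((test["sales"] - pred).abs() / test["sales"]).mean()
print(f"Out-of-sample MAPE: {mape:.1%}")
```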
Result:
Model built without seeing test data
Can assess out-of-sample performance
Realistic validation of model quality
Business use:
Confidence in model predictions
Avoid overfitting
Demonstrate forecast accuracy
Best Practices
Before Filtering
Document baseline:
Note original R² and coefficients
Export model before filtering
Keep record of unfiltered version
Validate filtering rationale:
Clear business reason for filtering
Documented outlier identification
Statistical or domain justification
During Filtering
Minimum observations:
Keep at least 26 observations (rule of thumb)
More observations = more reliable estimates
Need ~10 observations per variable minimum
Be conservative:
Don't exclude too much data
Every excluded observation reduces power
Balance data quality vs. sample size
Visual check:
Review which observations are excluded
Ensure the exclusions make sense
Look for unintended patterns
After Filtering
Compare results:
Filtered vs. unfiltered coefficients
How much did they change?
Do changes make sense?
Check R²:
R² increased: Good (removed noise/outliers)
R² similar: Filter had little impact
R² decreased: May have removed signal
Document changes:
Which observations were excluded
Why they were excluded
Impact on coefficients and R²
Re-including Data
To undo filter:
Open filter interface
Set all observations to 1 (include)
Apply filter
Model returns to using full dataset
Effect:
Coefficients revert to original estimates (approximately)
R² returns to full-data value
Model back to baseline state
Advanced Filtering Techniques
Rolling Window Analysis
Goal: Test model stability over time
Process:
Filter to first 52 weeks, run model
Filter to weeks 13-64, run model
Filter to weeks 25-76, run model
Compare coefficients across windows
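This loop sketches the idea (illustrative column names; each window is 52 weeks):

```python
import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("weekly_data.csv")              # illustrative column names
windows = [(0, 52), (12, 64), (24, 76)]            # Weeks 1-52, 13-64, 25-76

for start, end in windows:
    sub = data.iloc[start:end]
    X = sm.add_constant(sub[["tv_spend", "digital_spend"]])
    fit = sm.OLS(sub["sales"], X).fit()
    print(f"Weeks {start + 1}-{end}: TV coefficient = {fit.params['tv_spend']:.3f}")
```

Stable coefficients across windows suggest a robust model; large swings suggest time-varying effects.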
Insight:
Are coefficients stable over time?
Do channel effects change?
Is model robust or time-dependent?
Seasonal Comparison
Create separate models:
Model A: Q1 only
Model B: Q2 only
Model C: Q3 only
Model D: Q4 only
Compare:
Which channels perform best in each quarter?
How do coefficients vary seasonally?
Should you model seasons separately?
Pre/Post Analysis
Filter for event study:
Before change (e.g., new pricing strategy)
After change
Compare coefficients
Measures:
Did marketing effectiveness change?
Did the optimal channel mix change?
What's the impact of structural change?
Troubleshooting
"Need at least 26 observations"
Cause: Excluded too many observations
Solution:
Include more observations
Minimum 26 needed for reliable estimation
Ideally 52+ (one year of weekly data)
Model fit deteriorated after filtering
Cause: Removed important signal or created bias
Possible reasons:
Removed periods with high variance (needed for estimation)
Created artificial patterns
Reduced sample diversity
Solution:
Review which observations were excluded
Consider more inclusive filtering
Compare to unfiltered model
May need to keep problematic observations
Coefficients changed dramatically
Cause: Filtered periods had different relationships
Examples:
Excluded holiday periods where TV is more effective
Removed summer when Digital dominates
Filtered out campaign periods with different dynamics
Solution:
This may be legitimate (showing time-varying effects)
Or may indicate filtering was too aggressive
Consider if changes make business sense
Document and explain to stakeholders
Can't un-filter (all observations still excluded)
Cause: Filter dialog not updating
Solution:
Refresh page
Reopen filter interface
Select all observations
Set to 1 (include)
Apply filter again
Filtering vs. Date Range Selection
Two Different Approaches
Date Range (simple):
Include: 2024-01-01 to 2024-06-30
Exclude: Everything outside this range
Continuous period
Faster to configure
Observation Mask (flexible):
Include: Weeks 1, 3, 5, 7 (custom selection)
Exclude: Weeks 2, 4, 6, 8
Non-continuous periods allowed
Fine-grained control
When to Use Each
Use Date Range when:
Simple continuous period needed
Seasonal split (Q1, Q2, etc.)
Train/test split
Campaign period analysis
Use Observation Mask when:
Excluding specific outlier weeks
Non-continuous periods
Complex patterns (every other week)
Removing scattered bad-data observations
Note: MixModeler primarily uses the observation mask for flexibility. A date range filter is a special case that can be converted to a mask.
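For example, converting a date range to a mask (sketch; file and column names are illustrative):

```python
import pandas as pd

data = pd.read_csv("weekly_data.csv", parse_dates=["date"])                # illustrative names
range_mask = data["date"].between("2024-01-01", "2024-06-30").astype(int)  # 1 inside the range
```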
Integration with Other Features
With Model Comparison
Compare filtered vs. unfiltered:
Clone model
Filter clone to subset
Use Model Comparison to see differences
Understand impact of filtering
Business insight: Does the relationship hold across all periods or only a subset?
With Decomposition
Decomposition behavior:
Uses FULL dataset for decomposition
Even if model filtered for estimation
Allows visualizing contribution over all periods
Why:
Stakeholders want to see complete picture
Filtered periods still get predictions
Can assess model performance on excluded data
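A sketch of why this works: coefficients estimated on the filtered subset can still score every period (illustrative names, not MixModeler's internals):

```python
import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("weekly_data.csv")              # illustrative column names
mask = data["week"] != 12                          # fit the model without Week 12

X_fit = sm.add_constant(data.loc[mask, ["tv_spend", "digital_spend"]])
fit = sm.OLS(data.loc[mask, "sales"], X_fit).fit()

# Contributions and predictions are computed over ALL weeks, including excluded ones.
X_full = sm.add_constant(data[["tv_spend", "digital_spend"]])
full_pred = fit.predict(X_full)                              # prediction for every week
tv_contribution = fit.params["tv_spend"] * data["tv_spend"]  # per-week TV contribution
```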
With Diagnostics
Diagnostics run on:
Only INCLUDED observations
Tests based on filtered dataset
R² and statistics reflect filtered model
Consideration:
Fewer observations → less powerful tests
May affect diagnostic conclusions
Keep sufficient sample for reliable diagnostics
Export and Reimport
Filtering in Excel Export
What's exported:
Observation Mask sheet with inclusion status
Complete dataset (all observations)
Model fitted on filtered subset
Excel contains:
Column "Included_In_Model" (1 or 0)
All 104 observations listed
Shows which were used for estimation
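To inspect the mask programmatically (sketch; the exact sheet name and file path are assumptions based on the description above):

```python
import pandas as pd

# Sheet and column names follow the description above; the file path is illustrative.
mask_sheet = pd.read_excel("exported_model.xlsx", sheet_name="Observation Mask")
included = mask_sheet["Included_In_Model"] == 1
print(f"{included.sum()} of {len(included)} observations were used for estimation")
```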
Reimporting Filtered Models
On reimport:
MixModeler reads Observation Mask sheet
Recreates filter exactly
Fits model using same filtered subset
Perfect replication of filtered model
Benefits:
Share filtered models with colleagues
Archive specific analyses
Document filtering decisions permanently
Key Takeaways
Date filtering includes/excludes observations without deleting data
Each observation has 1 (include) or 0 (exclude) status
Minimum 26 observations recommended for reliable estimation
Model refits automatically when filter applied
Original data always preserved, can un-filter anytime
Use for seasonal analysis, campaign evaluation, outlier removal
Compare filtered vs. unfiltered models to understand impact
Decomposition uses full dataset even if model filtered for estimation
Export includes filter settings for perfect reimport
Document filtering rationale and impact for transparency