Creating Binary Indicators Based on Value Thresholds
AVO (Above Value Operator) creates binary (0/1) dummy variables based on whether values fall above or below a threshold. This transformation is useful for creating "activity" or "availability" indicators that capture when a channel is active at meaningful levels.
What is AVO?
Threshold-Based Binary Variable
AVO converts continuous values into binary indicators:
Original Variable (Continuous):
TV_Spend: $5K, $12K, $0, $8K, $25K, $0, $18K...
AVO 90 (Binary - top 10% of value range):
TV_AVO_90: 0, 0, 0, 0, 1, 0, 0...
Logic:
Calculate threshold based on value range
Values ≥ threshold → 1 (active/available)
Values < threshold → 0 (inactive)
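The logic above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the helper name `avo` is hypothetical, and the threshold uses the range-based rule MIN + (Percentage × Range) stated in the summary.

```python
def avo(values, pct):
    """Flag values at or above MIN + pct% of (MAX - MIN) with 1, others with 0."""
    lo, hi = min(values), max(values)
    threshold = lo + (hi - lo) * pct / 100
    return [1 if v >= threshold else 0 for v in values]

# Sample TV_Spend series from above, in $K
tv_spend = [5, 12, 0, 8, 25, 0, 18]
tv_avo_90 = avo(tv_spend, 90)
print(tv_avo_90)  # [0, 0, 0, 0, 1, 0, 0]
```

Only the $25K period clears the $22.5K threshold, which matches the TV_AVO_90 series shown above.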
How AVO Threshold Works
Pure Range-Based Calculation
Formula:
Threshold = MIN + (Percentage × Range), where Range = MAX - MIN
Example with AVO 90:
Result: Only periods with TV spend ≥ $22.5K get marked as "1"
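The $22.5K figure can be reproduced from the range-based rule, MIN + (Percentage × Range). The worked calculation below assumes the minimum and maximum of the sample TV_Spend series shown earlier ($0 and $25K); those two values are an assumption, since the series is truncated in the text.

```latex
\begin{aligned}
\text{Range} &= \text{MAX} - \text{MIN} = 25\text{K} - 0 = 25\text{K} \\
\text{Threshold} &= \text{MIN} + 0.90 \times \text{Range} = 0 + 0.90 \times 25\text{K} = 22.5\text{K}
\end{aligned}
```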
AVO Percentage Interpretation
Understanding the Threshold
AVO 90 = Top 10% of value range
90% of range excluded (below threshold)
10% of range included (above threshold)
Only highest spend periods marked as "1"
AVO 50 = Top 50% of value range
50% of range excluded
50% of range included
Spend above the midpoint of the range marked as "1" (this is not the same as above-median spend)
AVO 10 = Top 90% of value range
10% of range excluded
90% of range included
Most non-zero periods marked as "1"
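To make the three settings concrete, the sketch below computes the threshold and the number of flagged periods for AVO 90, 50, and 10. It reuses the sample TV_Spend series from earlier (in $K) and is an illustration only; the variable names are hypothetical.

```python
tv_spend = [5, 12, 0, 8, 25, 0, 18]  # $K, sample series from above
lo, hi = min(tv_spend), max(tv_spend)

summary = {}
for pct in (90, 50, 10):
    threshold = lo + (hi - lo) * pct / 100          # MIN + pct% of range
    ones = sum(1 for v in tv_spend if v >= threshold)
    summary[pct] = (threshold, ones)
    print(f"AVO {pct}: threshold={threshold}K, flagged={ones} of {len(tv_spend)}")
```

On this series AVO 50 flags only 2 of 7 periods, because the range midpoint ($12.5K) sits above the median ($8K). The percentage describes the value range, not the share of observations.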
Visual Example
Original TV_Spend:
Result - TV_AVO_90:
When to Use AVO
Use Case 1: Campaign Flight Detection
Problem: Identify when campaigns are running at meaningful levels
Solution:
Interpretation: Captures weeks where TV spend falls in the top 20% of its value range (heavy campaign periods)
Benefit: Model can separately estimate effect of "being on-air at high levels"
Use Case 2: Availability Indicators
Problem: Model "presence" vs. "absence" of marketing
Solution:
Interpretation: Binary indicator for "Radio is active" (spend above the midpoint of its range)
Use in Model: Interaction with other variables to test amplification effects
Use Case 3: Flighting Analysis
Problem: Understand impact of sustained high-spending periods
Solution:
Interpretation: Identifies periods of concentrated digital investment
Use Case 4: Threshold Effects
Problem: Test if channel only works above certain spend level
Solution:
Interpretation:
OOH_Spend coefficient: Continuous effect
OOH_AVO_60 coefficient: Additional boost when spending is in top 40% of range
Creating AVO Variables
In Variable Workshop
Step 1: Select base variable (e.g., TV_Spend)
Step 2: Choose "Create AVO"
Step 3: Set threshold percentage (0-100):
90: Top 10% of value range → 1
80: Top 20% of value range → 1
70: Top 30% of value range → 1
50: Top 50% of value range → 1
Step 4: Optionally enter an identifier (defaults to the threshold number)
Step 5: Preview shows:
Original values
Threshold value calculated
Binary result (0/1)
Count of 1s vs. 0s
Step 6: Create variable
Naming Convention
Format:
Choosing the Right Threshold
High Thresholds (80-95)
When to Use:
Identify only the most intense campaign periods
Test impact of "heavy investment weeks"
Rare events you want to flag
Result: Typically very few 1s (roughly 5-20% of observations, depending on how values are distributed within the range)
Example:
Medium Thresholds (50-75)
When to Use:
General "active vs. inactive" indicator
Moderate campaign intensity
Balanced binary split
Result: Typically a moderate number of 1s (roughly 25-50% of observations, depending on how values are distributed within the range)
Example:
Low Thresholds (10-40)
When to Use:
Identify "any meaningful activity"
Most non-zero spending periods
Broad availability indicator
Result: Typically many 1s (roughly 60-90% of observations, depending on how values are distributed within the range)
Example:
Interpreting AVO in Models
As Main Effect
Model:
β₁ Interpretation: Average sales lift during periods when TV spending is in the top 10% of its range
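A small sketch of the main-effect case, assuming an ordinary least squares fit with an intercept and the AVO flag as the only regressor. The weekly figures are hypothetical and exist only to illustrate the interpretation; they are not taken from any real model.

```python
import numpy as np

# Hypothetical weekly data for illustration only
tv_avo_90 = np.array([0, 0, 0, 0, 1, 0, 1, 0])
sales = np.array([100.0, 104.0, 98.0, 102.0, 130.0, 96.0, 126.0, 100.0])

# Sales = b0 + b1 * TV_AVO_90, fitted by least squares
X = np.column_stack([np.ones_like(sales), tv_avo_90])
beta0, beta1 = np.linalg.lstsq(X, sales, rcond=None)[0]

# With a single 0/1 regressor, b1 equals the difference in group means
lift = sales[tv_avo_90 == 1].mean() - sales[tv_avo_90 == 0].mean()
print(round(beta1, 6), lift)
```

Here the AVO coefficient is simply the average sales in flagged weeks minus the average in unflagged weeks, which is what "average sales lift" means in this setup.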
With Continuous Variable
Model:
β₁: Effect of each dollar of TV spend
β₂: Additional fixed effect when TV spending is very high
Interpretation: TV has a continuous effect (β₁), PLUS an extra boost (β₂) during heavy campaign weeks
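The sketch below shows how the continuous and AVO terms separate in a least squares fit. It assumes a model of the form Sales = b0 + b1 × TV_Spend + b2 × TV_AVO_90, and the data are synthetic and noise-free, built from known coefficients so that the fit recovers them. None of the numbers come from a real model.

```python
import numpy as np

# Hypothetical TV spend in $K
tv_spend = np.array([5, 12, 0, 8, 25, 0, 18, 22, 3, 24], dtype=float)
lo, hi = tv_spend.min(), tv_spend.max()
tv_avo_90 = (tv_spend >= lo + (hi - lo) * 90 / 100).astype(float)

# Synthetic sales: slope of 2.0 per $K plus a fixed 30 boost in AVO weeks
sales = 200 + 2.0 * tv_spend + 30 * tv_avo_90

X = np.column_stack([np.ones_like(tv_spend), tv_spend, tv_avo_90])
b0, b1, b2 = np.linalg.lstsq(X, sales, rcond=None)[0]
print(round(b1, 4), round(b2, 4))
```

The fit attributes the per-unit effect to the spend term and the step at the threshold to the AVO term. On real, noisy data the two are correlated, so the split will be less clean.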
As Interaction Term
Model:
β₃ (interaction coefficient) Interpretation: Does digital spending work better during heavy TV weeks?
If β₃ > 0: Yes, synergy effect when TV is at high levels
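An interaction of this kind can be sketched as follows, assuming a model with digital spend, the TV AVO flag, and their product as regressors. The coefficient names and all data are hypothetical, and the sales series is synthetic and noise-free so the known interaction effect is recovered exactly.

```python
import numpy as np

# Hypothetical weekly spend in $K
tv_spend = np.array([5, 12, 0, 8, 25, 0, 18, 24, 23, 3], dtype=float)
digital = np.array([10, 20, 15, 5, 10, 25, 30, 20, 30, 12], dtype=float)

lo, hi = tv_spend.min(), tv_spend.max()
tv_avo_90 = (tv_spend >= lo + (hi - lo) * 90 / 100).astype(float)

# Synthetic sales: digital is worth 1.0 normally, 1.5 during heavy TV weeks
sales = 100 + 1.0 * digital + 20 * tv_avo_90 + 0.5 * digital * tv_avo_90

X = np.column_stack([np.ones_like(digital), digital, tv_avo_90, digital * tv_avo_90])
b0, b1, b2, b3 = np.linalg.lstsq(X, sales, rcond=None)[0]
print(round(b3, 4))  # interaction coefficient
```

A positive interaction coefficient means the digital slope is steeper in weeks flagged by the TV AVO variable (b1 + b3 instead of b1).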
Common Patterns
Testing Threshold Effects
Create multiple AVO thresholds:
Test each in model: Which threshold best explains KPI variance?
Example Result:
Use: AVO 80 (most significant)
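One way to compare thresholds is sketched below. It uses R² as a simple stand-in for "best explains KPI variance"; the source speaks of significance, so treat the scoring rule as an assumption. The helpers `avo` and `r_squared` are hypothetical, and the data are synthetic with a step deliberately placed at AVO 80.

```python
import numpy as np

def avo(values, pct):
    """0/1 flags for values at or above MIN + pct% of (MAX - MIN)."""
    lo, hi = values.min(), values.max()
    return (values >= lo + (hi - lo) * pct / 100).astype(float)

def r_squared(X, y):
    """R-squared of a least squares fit of y on the columns of X."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    centered = y - y.mean()
    return 1 - (resid @ resid) / (centered @ centered)

# Synthetic data: linear effect plus a step that starts at AVO 80
spend = np.arange(0, 101, 5, dtype=float)
sales = 10 + 0.5 * spend + 20 * avo(spend, 80)

scores = {}
for pct in (70, 80, 90):
    X = np.column_stack([np.ones_like(spend), spend, avo(spend, pct)])
    scores[pct] = r_squared(X, sales)

best = max(scores, key=scores.get)
print(best)
```

Because the step was built in at AVO 80, that threshold fits best here. With real data the differences between thresholds are usually smaller, and picking the best of several candidates can overfit, so the choice is worth validating.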
Best Practices
Do's
Use for Campaign Flights: AVO is well suited to identifying burst spending periods.
Test Multiple Thresholds: Create 2-3 different thresholds and see which performs best statistically.
Combine with Continuous: Model both TV_Spend (continuous) and TV_AVO_90 (binary) for the full picture.
Use Descriptive Identifiers: TV|AVO High instead of just TV|AVO 90, for clarity.
Check Distribution: The preview shows how many 1s vs. 0s there are; make sure the split is reasonable.
Document Threshold Choice: Record why 90 vs. 80 vs. 70 was chosen.
Don'ts
Don't Use Too High a Threshold: AVO 99 might give only 1-2 observations with "1", which is insufficient data.
Don't Use Too Low a Threshold: AVO 5 means almost everything is "1", which is not informative.
Don't Confuse with Percentile: AVO 90 ≠ 90th percentile of the data. AVO 90 = 90% of the value RANGE (min to max).
Don't Ignore Zeros: If a variable has many zeros, the AVO threshold is still based on the range from min (often 0) to max.
Don't Use as Only Variable: AVO is usually best combined with the continuous spend variable.
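The difference between a range-based threshold and a percentile is easiest to see on skewed data with many zeros. The sketch below compares the two on a hypothetical spend series; `numpy.percentile` is used with its default linear interpolation.

```python
import numpy as np

# Hypothetical spend in $K: many zero weeks and one very large week
spend = np.array([0] * 10 + [10, 12, 14, 16, 18, 20, 22, 24, 26, 100], dtype=float)

# AVO 90: 90% of the way from min to max
range_threshold = spend.min() + (spend.max() - spend.min()) * 90 / 100
# 90th percentile: the value below which about 90% of observations fall
pct_threshold = np.percentile(spend, 90)

range_ones = int((spend >= range_threshold).sum())
pct_ones = int((spend >= pct_threshold).sum())
print(range_threshold, range_ones)
print(pct_threshold, pct_ones)
```

The single large week stretches the range, so the AVO 90 threshold lands far above the 90th percentile and flags fewer periods.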
AVO vs. Other Transformations
AVO vs. Simple Threshold
Simple Threshold:
Fixed arbitrary threshold
AVO:
Adaptive based on data range
Benefit: AVO adapts to the observed range (min to max) of each variable
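The contrast can be illustrated with two channels on very different scales. In this sketch the fixed cutoff of $100K, the `avo` helper, and both spend series are hypothetical.

```python
def avo(values, pct):
    """Flag values at or above MIN + pct% of (MAX - MIN) with 1, others with 0."""
    lo, hi = min(values), max(values)
    threshold = lo + (hi - lo) * pct / 100
    return [1 if v >= threshold else 0 for v in values]

tv = [0, 50, 120, 300, 80]   # $K, hypothetical
radio = [0, 2, 5, 12, 3]     # $K, hypothetical

# Fixed cutoff: the same absolute value for every channel
fixed_cutoff = 100
tv_fixed = [1 if v >= fixed_cutoff else 0 for v in tv]
radio_fixed = [1 if v >= fixed_cutoff else 0 for v in radio]

# AVO 80: cutoff derived from each channel's own min and max
tv_avo_80 = avo(tv, 80)
radio_avo_80 = avo(radio, 80)

print(tv_fixed, radio_fixed)
print(tv_avo_80, radio_avo_80)
```

The fixed cutoff never fires for the smaller channel, while AVO 80 flags the peak period of each channel relative to its own range.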
AVO vs. Standardization
Standardization: Converts to z-scores (mean=0, std=1) - continuous
AVO: Converts to binary (0 or 1) - categorical
Use AVO when: You want to test presence/absence effects, not magnitude
Example Use Case
Identifying High-Spend Campaign Periods
Data:
Create:
Calculation:
Result:
Model:
Interpretation:
TV_Spend coefficient: Continuous linear effect
TV_AVO_85 coefficient: Extra lift during heavy campaign weeks
Example Coefficients:
Business Insight: Heavy TV weeks not only generate continuous returns (0.5 per dollar) but also create an additional $15K boost (possibly from amplified awareness, buzz, etc.)
Summary
Key Takeaways:
AVO = Above Value Operator - binary indicator based on a threshold
Threshold = Range-based - MIN + (Percentage × Range)
AVO 90 = Top 10% of value range (not 90th percentile!)
Perfect for campaign flights - identify high-intensity periods
Combine with continuous - model both amount (continuous) and presence/intensity (AVO)
Test multiple thresholds - find the statistically optimal cutoff
Adaptive to your data - threshold is calculated from the actual min/max
Common use: Flighting - when is a channel "really active" vs. baseline?
AVO transforms continuous spending into actionable binary indicators, perfect for testing threshold effects and identifying high-impact periods!