Scatter Plots
What Are Scatter Plots?
Scatter plots display the relationship between two variables, with each point representing one time period. They help you see if two variables move together, move in opposite directions, or are unrelated.
Purpose: Explore correlations between two variables, validate expected relationships, and identify outliers.
When to Use
Best For:
- Checking if two variables are related 
- Validating expected correlations (e.g., Spend vs. Sales) 
- Detecting non-linear relationships 
- Identifying outliers and unusual observations 
- Exploring diminishing returns or saturation effects 
Examples:
- TV Spend vs. KPI 
- Price vs. Sales Volume 
- Digital Spend vs. Conversion Rate 
- Competitor Activity vs. Market Share 
How to Create
Requirements:
- Select exactly 2 variables 
- Click the "Scatter Plot" button (📍) 
- Click "Generate Chart" 
Chart Layout:
- X-axis: First selected variable 
- Y-axis: Second selected variable 
- Each dot: One time period 
- Can zoom in both X and Y directions 
Reading Scatter Plots
Positive Correlation:
- Points trend upward from left to right 
- As X increases, Y increases 
- The more tightly the points cluster around the trend, the stronger the relationship 
- Example: Higher spend → Higher sales 
Negative Correlation:
- Points trend downward 
- As X increases, Y decreases 
- Example: Higher price → Lower volume 
No Correlation:
- Random scatter, no clear pattern 
- Variables show no linear association (this alone does not prove they are independent) 
- No predictable relationship 
Non-Linear Pattern:
- Curved relationship 
- Example: Diminishing returns (curve flattens at high spend) 
- May need saturation curves in model 
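The visual patterns above can be backed with a number. The sketch below computes the Pearson correlation coefficient for two variables and labels its direction. The sample data, the function names, and the 0.3 cutoff for "no clear relationship" are assumptions made for illustration; they are not values used by the tool.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


def describe_direction(r, weak_cutoff=0.3):
    """Label the direction of r; weak_cutoff is an arbitrary illustrative threshold."""
    if abs(r) < weak_cutoff:
        return "no clear linear relationship"
    return "positive" if r > 0 else "negative"


# Hypothetical data, one value per time period (illustration only).
spend = [10, 20, 30, 40, 50]
sales = [12, 19, 33, 41, 48]
price = [5, 6, 7, 8, 9]
volume = [100, 90, 85, 70, 60]

r_spend_sales = pearson_r(spend, sales)
r_price_volume = pearson_r(price, volume)

print("Spend vs. Sales:", describe_direction(r_spend_sales))
print("Price vs. Volume:", describe_direction(r_price_volume))
```

Pearson's r only measures linear association, so a curved pattern can produce a modest r even when the relationship is strong. The scatter plot remains the better check for non-linear patterns.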
What Variables to Visualize
Marketing Variables:
- Channel Spend vs. KPI 
- Impressions vs. Conversions 
- Reach vs. Sales 
Validation Checks:
- Expected positive: Spend vs. Sales 
- Expected negative: Price vs. Volume 
- No expected relationship: Variables with no plausible link to each other 
Before Modeling:
- Test assumptions about relationships 
- Verify correlations exist 
- Identify transformation needs 
Common Patterns
Linear Relationship:
- Straight line pattern 
- Constant rate of change 
- Good for linear regression 
Diminishing Returns:
- Curve that flattens 
- Each additional dollar has less impact 
- Needs saturation curve transformation 
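One way to check whether a saturation transformation helps is to compare how linear the relationship looks before and after transforming spend. The sketch below uses a simple `x / (x + k)` transform; the function names, the sample data, and the half-saturation value `k = 200` are assumptions chosen for illustration, not the transformation the tool applies.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def saturate(spend, half_saturation):
    """Simple saturation transform: rises quickly, then flattens toward 1.

    half_saturation is the spend level at which the output reaches 0.5.
    """
    if spend < 0 or half_saturation <= 0:
        raise ValueError("spend must be >= 0 and half_saturation must be > 0")
    return spend / (spend + half_saturation)


# Hypothetical data showing a curve that flattens at high spend.
spend = [50, 100, 200, 300, 400, 500, 600]
sales = [205, 330, 495, 605, 660, 720, 745]

# Assumed half-saturation point; in practice this would be estimated or tuned.
k = 200
spend_saturated = [saturate(x, k) for x in spend]

r_raw = pearson_r(spend, sales)
r_saturated = pearson_r(spend_saturated, sales)

print(f"Correlation with raw spend:       {r_raw:.3f}")
print(f"Correlation with saturated spend: {r_saturated:.3f}")
```

If the correlation is noticeably higher after the transform, the scatter plot of transformed spend against the KPI should look closer to a straight line, which suggests the saturation curve is capturing the diminishing returns.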
Threshold Effect:
- Flat then steep increase 
- Minimum spend needed for impact 
- Consider S-curve transformation 
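For intuition on what an S-curve transformation does, the sketch below uses a Hill-type function, one common choice for S-shaped responses. The function name and the parameter values (`half_point = 100`, `steepness = 4`) are assumptions for illustration only; the tool's own S-curve may be defined differently.

```python
def s_curve(spend, half_point, steepness):
    """Hill-type S-curve: near 0 at low spend, 0.5 at half_point, near 1 at high spend.

    steepness > 1 gives the flat-then-steep shape of a threshold effect;
    steepness = 1 reduces to a plain diminishing-returns curve.
    """
    if spend < 0 or half_point <= 0 or steepness <= 0:
        raise ValueError("spend must be >= 0; half_point and steepness must be > 0")
    scaled = spend ** steepness
    return scaled / (scaled + half_point ** steepness)


# Assumed parameters, chosen only to make the threshold visible.
half_point = 100
steepness = 4

for spend in [0, 25, 50, 100, 200, 400]:
    response = s_curve(spend, half_point, steepness)
    print(f"spend={spend:>3}  response={response:.3f}")
```

With these assumed parameters, the response stays close to zero until spend approaches the half point and then rises steeply, matching the flat-then-steep pattern described above.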
Clustered with Outliers:
- Most points together 
- Few points far away 
- Investigate outlier time periods 
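To turn "a few points far away" into a list of time periods to investigate, one option is a robust z-score based on the median and the median absolute deviation (MAD). The sketch below flags any period where either variable is unusually far from its median. The data, the dates, the function names, and the 3.5 threshold are assumptions for illustration; this is not how the tool itself detects outliers.

```python
import statistics


def robust_z_scores(values):
    """Modified z-scores using the median and median absolute deviation (MAD)."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined for this data")
    # 0.6745 rescales MAD so scores are comparable to standard z-scores
    # when the data is roughly normal.
    return [0.6745 * (v - median) / mad for v in values]


def flag_outliers(periods, xs, ys, threshold=3.5):
    """Return the periods where either variable exceeds the threshold."""
    z_x = robust_z_scores(xs)
    z_y = robust_z_scores(ys)
    return [
        period
        for period, zx, zy in zip(periods, z_x, z_y)
        if abs(zx) > threshold or abs(zy) > threshold
    ]


# Hypothetical weekly data (illustration only).
periods = [
    "2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22",
    "2024-01-29", "2024-02-05", "2024-02-12", "2024-02-19",
]
digital_spend = [20, 22, 21, 23, 19, 24, 22, 60]
conversions = [200, 215, 210, 225, 195, 230, 40, 580]

flagged = flag_outliers(periods, digital_spend, conversions)
print("Periods to investigate:", flagged)
```

Flagged periods are only candidates for review. A point can be extreme because of a data issue, a promotion, or a genuine spike, so check the underlying records before removing anything.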
Use Cases
Relationship Validation:
Question: Does TV spend drive sales?
Variables: TV_Spend (X) vs. Sales (Y)
Pattern: Points trend upward
Conclusion: Positive relationship confirmed
Saturation Detection:
Variables: Total_Marketing_Spend vs. KPI
Pattern: Curve flattens at high spend
Insight: Diminishing returns at $500K+ spend
Outlier Investigation:
Variables: Digital_Spend vs. Conversions
Pattern: Most clustered, 2 points far away
Action: Check those time periods for data issues
Interactive Features
Zoom:
- Mouse wheel or selection box 
- Zoom both X and Y axes 
- Focus on specific value ranges 
Tooltip:
- Hover over any point 
- Shows X value, Y value, and date 
- Identify specific time periods 
Pan:
- Click and drag when zoomed 
- Navigate to different regions 
Tips
Variable Selection:
- Choose variables you expect to be related 
- Test one relationship at a time 
- Use multiple scatter plots for multiple pairs 
Interpretation:
- Tight clustering = strong relationship 
- Wide scatter = weak relationship 
- Look for overall trend, not individual points 
Next Steps:
- If strong correlation → Treat as a candidate for the model 
- If no correlation → Reconsider the variable or check for lagged or non-linear effects 
- If non-linear → Apply transformation 
When to Use Other Charts:
- Time trends → Use Line Charts 
- Multiple variables → Use Correlation Matrix 
- Multicollinearity → Use Correlation Heatmap 
Summary
Scatter Plots Show:
- Relationship between 2 variables 
- Correlation strength and direction 
- Non-linear patterns 
- Outliers 
Use scatter plots to validate assumptions before modeling and to shortlist variables that appear related to your KPI. A correlation on its own does not prove that a variable drives the KPI.