Data Upload & Format Requirements
Getting Your Data Ready for MixModeler
MixModeler accepts Excel files (.xlsx, .xls) and CSV files containing your historical marketing data. This page explains the exact format requirements, column structure, file preparation, and common data issues to avoid.
Supported File Formats
Excel Files (Recommended)
Supported Formats:
.xlsx(Excel 2007 and later).xls(Excel 97-2003)
Advantages:
Multiple sheets supported (model uses first sheet by default)
Preserves data types (dates, numbers, text)
Easy to prepare and validate
Familiar format for most users
File Size Limits:
Free: Up to 5 MB
Professional: Up to 20 MB
Business: Up to 50 MB
CSV Files
Supported Formats:
.csv(Comma-separated values)UTF-8 encoding recommended
Advantages:
Universal compatibility
Lightweight file size
Easy export from databases/analytics tools
Considerations:
Must use comma (
,) as delimiterDates should be formatted as text (YYYY-MM-DD)
Use quotes for text fields containing commas
Required Data Structure
The Fundamental Rule: Wide Format
Your data must be in wide format where:
Each row = one time period (week, month, etc.)
Each column = one variable (KPI, marketing channel, external factor)
✅ Correct Wide Format:
❌ Incorrect Long Format:
Column Structure Requirements
First Column: "Observation" (Required)
Purpose: Time period identifier (date, week number, period label)
Name: Must be exactly Observation (case-sensitive)
Accepted Formats:
Dates:
2023-01-01,01/01/2023,2023-W01Period labels:
Week 1,Month 1,Period 1Custom: Any unique identifier per time period
Critical Requirements: ✅ Column must be named exactly Observation ✅ Values must be unique (no duplicate periods) ✅ Should be in chronological order (earliest to latest)
Example:
Second Column Onward: Variables
Purpose: Your KPI and predictor variables (marketing spend, external factors, etc.)
Naming Rules:
Descriptive names:
TV_Spend,Digital_Display,Email_MarketingNo special characters: Avoid
@,#,$,%,&,*Underscores OK:
TV_Spend✅Spaces discouraged: Use underscores instead (
TV Spend→TV_Spend)No leading/trailing spaces:
TV_Spend✅ |TV_Spend❌
Data Types:
Numeric: All values should be numbers (marketing spend, sales, impressions)
No text in data rows: Headers only (row 1)
No currency symbols:
10000✅ |$10,000❌No percentage symbols:
0.15✅ |15%❌
Complete Example File
Here's a properly formatted dataset:
Key Features:
✅ First column named
Observation✅ Dates in chronological order
✅ All data values are numbers
✅ Clear, descriptive column names
✅ No special characters or symbols
✅ Binary variable (Holiday: 0 or 1)
Data Requirements
Minimum Data Requirements
Time Periods:
Absolute Minimum: 26 observations (6 months of weekly data)
Recommended Minimum: 52 observations (1 year of weekly data)
Ideal: 104+ observations (2+ years of weekly data)
Rationale: More data = more statistical power to detect marketing effects
Why 52+ weeks is ideal:
Captures full annual seasonality (holidays, quarters)
Sufficient degrees of freedom for multiple variables
Better model stability and convergence
Maximum Data Capacity
Subscription-Based Limits:
Practical Limits: Most models perform well with:
10-30 marketing variables
52-156 time periods (1-3 years of weekly data)
Variable Count Guidelines
How Many Variables Should You Include?
Rule of Thumb: Number of observations should be at least 5× the number of variables
Examples:
Reasoning: Too many variables relative to observations leads to overfitting and unstable estimates.
What Variables to Include?
Essential Variables:
KPI (Dependent Variable): Sales, Revenue, Conversions, Transactions
Marketing Channels: All major paid media (TV, Digital, Radio, Print, etc.)
Baseline/Organic: If available (organic traffic, brand search, etc.)
Recommended Additional Variables: 4. Seasonality: Month dummies, quarter indicators, holiday flags 5. Pricing: Average product price, discount percentage, promotion indicators 6. External Factors: Competitor activity, macroeconomic indicators, weather 7. Distribution: Store count, geographic expansion, availability
Lower Priority Variables:
Niche marketing channels with minimal spend
Experimental channels with limited history
Variables with no expected relationship to KPI
Data Quality Checklist
Before uploading your file, verify:
✅ Structure Checks
✅ Data Checks
✅ Completeness Checks
Common Data Issues and Fixes
Issue 1: First Column Not Named "Observation"
Symptom: Upload fails or model doesn't recognize time periods
Fix: Rename first column to exactly Observation (case-sensitive)
Issue 2: Numbers Formatted as Text
Symptom: Model treats numbers as categories or upload fails
Indicators:
Numbers aligned left in Excel (text alignment)
Green corner triangles in Excel cells
Numbers have leading/trailing spaces
Fix:
Select all data columns
Excel: Data → Text to Columns → Finish
Verify numbers are right-aligned
Issue 3: Currency Symbols in Data
Symptom: Values not recognized as numbers
Example: $10,000 instead of 10000
Fix:
Find & Replace: Remove
$,€,£symbolsFind & Replace: Remove commas (
,) from numbersEnsure cells are formatted as "Number" not "Currency"
Issue 4: Missing Data (Blank Cells)
Symptom: Model encounters errors or treats missing as zero
Fix:
Option 1 (Preferred): Fill with zero if truly no activity/spend
Option 2: Interpolate based on adjacent values
Option 3: Exclude time periods with missing data
In MixModeler: Missing values are typically treated as zero, so explicitly enter 0 for clarity.
Issue 5: Inconsistent Time Periods
Symptom: Gaps in weekly/monthly data
Example:
Fix: Add row for missing period with appropriate data (or zeros if no activity)
Why it matters: Time series models assume consistent intervals between observations.
Issue 6: Wrong Data Format (Long Instead of Wide)
Symptom: Channels listed in rows instead of columns
Fix: Use Excel's Pivot Table to convert:
Create Pivot Table
Rows: Date/Observation
Columns: Channel
Values: Spend/Impressions
Copy pivot results to new sheet
Time Period Best Practices
Weekly Data (Recommended for Most Cases)
Advantages:
Captures short-term marketing effects
Sufficient granularity for most channels
52 data points per year (good statistical power)
When to Use:
Digital marketing with frequent campaign changes
Retail with weekly promotions
Fast-moving consumer goods (FMG)
Monthly Data
Advantages:
Easier to aggregate from accounting systems
Smoother trends, less noise
Works well for longer purchase cycles
When to Use:
B2B with long sales cycles
Big-ticket purchases (cars, luxury goods)
Limited data available (only monthly reports)
Trade-offs:
Only 12-36 data points per 1-3 years (lower statistical power)
May miss short-term effects
Daily Data (Advanced)
Advantages:
Maximum granularity
Captures immediate response (search, social)
Large sample size (365+ observations per year)
Challenges:
High day-to-day volatility
Day-of-week effects require additional variables
Large datasets (computational considerations)
Harder to interpret patterns
When to Use:
E-commerce with daily transactions
Digital-only businesses
When you have 2+ years of daily data
Recommendation: Weekly data is the sweet spot for most MMM applications.
File Preparation Workflow
Step 1: Gather Raw Data
From Marketing Platforms:
Google Ads: Campaign spend by week/month
Facebook Ads: Ad spend by week/month
TV/Radio: Nielsen or equivalent media spend data
Print: Direct from invoices or media agency
From Analytics:
Google Analytics: Website traffic, conversions
CRM: Sales, revenue, transaction data
Internal databases: Pricing, promotions, inventory
Step 2: Standardize Time Periods
Ensure Consistency:
All data on same frequency (all weekly OR all monthly)
Align dates (e.g., all weeks start on Monday)
Fill any gaps with zeros or interpolated values
Example Weekly Alignment:
Week 1: Jan 1 - Jan 7 (Sunday to Saturday)
Week 2: Jan 8 - Jan 14
Week 3: Jan 15 - Jan 21
Step 3: Create Master Spreadsheet
Excel Setup:
Open new Excel workbook
Name first column
ObservationAdd date/period values in column A
Add each variable as subsequent column
Ensure no merged cells or formatting
Column Order:
Step 4: Clean and Validate
Remove:
Currency symbols
Percentage symbols
Thousands separators
Extra spaces in column names
Special characters
Verify:
All data is numeric
No blank cells (use 0 if no activity)
Dates are chronological
No duplicate rows
Step 5: Save and Upload
Save as
.xlsxformatNavigate to MixModeler
Click "Upload Data" or "Load Data"
Select your file
Wait for validation confirmation
Post-Upload Validation
After uploading, MixModeler will:
Automatic Checks
✅ File Format Validation
Checks file type (.xlsx, .xls, .csv)
Verifies file size within limits
✅ Structure Validation
Confirms "Observation" column exists
Checks for numeric data types
Identifies column names
✅ Data Quality Checks
Counts observations (warns if < 26)
Counts variables (warns if exceeds subscription limit)
Identifies missing values
Data Preview
After upload, you'll see:
Summary Statistics:
Number of observations
Number of variables
Date range (first to last observation)
Preview Table:
First 10 rows of your data
All columns visible
Ability to scroll and inspect
Variable List:
All detected variable names
Data types assigned
Summary stats (min, max, mean)
Example Files for Download
MixModeler provides sample templates:
Basic Template
Includes:
Observation column (52 weeks)
Sales (KPI example)
TV_Spend
Digital_Spend
Radio_Spend
Holiday indicator
Use Case: Quick start for simple models
Advanced Template
Includes:
Observation column (104 weeks)
Multiple KPIs (Sales, Revenue, Units)
10+ marketing channels
Seasonality variables (Month dummies)
External factors (Competitor index, Economic indicators)
Use Case: Comprehensive MMM with many variables
Troubleshooting Upload Issues
Error: "First column must be named 'Observation'"
Cause: Column A is not named Observation
Fix: Rename first column to exactly Observation (case-sensitive)
Error: "File size exceeds limit"
Cause: File too large for your subscription tier
Solutions:
Remove unnecessary columns
Reduce number of observations (if >200 weeks)
Upgrade subscription plan
Save as .csv (smaller file size)
Error: "Invalid data format detected"
Cause: Non-numeric values in data columns
Fix:
Check for text in numeric columns
Remove currency/percentage symbols
Ensure dates are in first column only
Warning: "Less than 52 observations detected"
Cause: Fewer than recommended data points
Impact: Lower statistical power, less reliable estimates
Options:
Gather more historical data if possible
Proceed with caution (use Bayesian with informative priors)
Aggregate to monthly data (if currently daily/weekly)
Warning: "Variable count exceeds subscription limit"
Cause: Too many columns for your plan
Fix:
Remove low-priority variables
Combine similar channels (e.g., combine Facebook + Instagram)
Upgrade to Professional or Business plan
Data Privacy and Security
On-Device Processing
MixModeler Guarantee:
All data stays on your device - never uploaded to cloud servers
Processing happens locally in your browser
No server-side storage of your data
No data transmission to external servers
Technical Implementation:
JavaScript-based processing
Browser-based computation
WebAssembly for performance
IndexedDB for temporary local storage only
What Happens to Your Data?
During Session:
Data loaded into browser memory
Processed locally for modeling
Stored temporarily in browser cache
After Session:
Data cleared when you close browser
Optional: Export models to Excel (your choice)
No persistent storage on MixModeler servers
Benefits: ✅ Full GDPR compliance ✅ No data breach risk ✅ Complete control over sensitive data ✅ Ideal for enterprises with strict data policies
Best Practices Summary
✅ Do's
Format Your Data Correctly Wide format, "Observation" as first column, numeric values only
Include Sufficient History 52+ weeks for weekly data, 24+ months for monthly data
Use Descriptive Names TV_National, Digital_Display, Email_Marketing
Clean Before Upload Remove symbols, verify numeric formats, check for blanks
Start Simple Include essential variables first, add complexity iteratively
Validate After Upload Review data preview, check summary statistics
❌ Don'ts
Don't Use Long Format Channels should be columns, not rows
Don't Include Non-Numeric Data Text, symbols, and formatting break the upload
Don't Leave Blanks Fill missing values with 0 or appropriate values
Don't Use Special Characters Stick to letters, numbers, and underscores in names
Don't Upload Without Reviewing Always preview in Excel before upload
Don't Mix Time Frequencies All data should be weekly OR monthly, not mixed
Summary
Key Takeaways:
📋 Wide format required - rows = time periods, columns = variables
📅 "Observation" column mandatory - exact name, first column, unique values
🔢 Numeric data only - remove symbols, currency, percentages
📊 52+ observations ideal - more data = better models
🔒 Privacy-first - your data never leaves your device
✅ Clean before upload - use checklist to validate
📁 Excel or CSV - .xlsx recommended for ease of use
🎯 Include key variables - KPI + marketing channels + seasonality
Once your data is properly formatted and uploaded, you're ready to start building powerful MMM models in MixModeler!
Last updated