Data Upload & Format Requirements

Getting Your Data Ready for MixModeler

MixModeler accepts Excel files (.xlsx, .xls) and CSV files containing your historical marketing data. This page explains the exact format requirements, column structure, file preparation, and common data issues to avoid.


Supported File Formats

Supported Formats:

  • .xlsx (Excel 2007 and later)

  • .xls (Excel 97-2003)

Advantages:

  • Multiple sheets supported (model uses first sheet by default)

  • Preserves data types (dates, numbers, text)

  • Easy to prepare and validate

  • Familiar format for most users

File Size Limits:

  • Free: Up to 5 MB

  • Professional: Up to 20 MB

  • Business: Up to 50 MB


CSV Files

Supported Formats:

  • .csv (Comma-separated values)

  • UTF-8 encoding recommended

Advantages:

  • Universal compatibility

  • Lightweight file size

  • Easy export from databases/analytics tools

Considerations:

  • Must use comma (,) as delimiter

  • Dates should be formatted as text (YYYY-MM-DD)

  • Use quotes for text fields containing commas


Required Data Structure

The Fundamental Rule: Wide Format

Your data must be in wide format where:

  • Each row = one time period (week, month, etc.)

  • Each column = one variable (KPI, marketing channel, external factor)

✅ Correct Wide Format:

❌ Incorrect Long Format:


Column Structure Requirements

First Column: "Observation" (Required)

Purpose: Time period identifier (date, week number, period label)

Name: Must be exactly Observation (case-sensitive)

Accepted Formats:

  • Dates: 2023-01-01, 01/01/2023, 2023-W01

  • Period labels: Week 1, Month 1, Period 1

  • Custom: Any unique identifier per time period

Critical Requirements: ✅ Column must be named exactly Observation ✅ Values must be unique (no duplicate periods) ✅ Should be in chronological order (earliest to latest)

Example:


Second Column Onward: Variables

Purpose: Your KPI and predictor variables (marketing spend, external factors, etc.)

Naming Rules:

  • Descriptive names: TV_Spend, Digital_Display, Email_Marketing

  • No special characters: Avoid @, #, $, %, &, *

  • Underscores OK: TV_Spend

  • Spaces discouraged: Use underscores instead (TV SpendTV_Spend)

  • No leading/trailing spaces: TV_Spend ✅ | TV_Spend

Data Types:

  • Numeric: All values should be numbers (marketing spend, sales, impressions)

  • No text in data rows: Headers only (row 1)

  • No currency symbols: 10000 ✅ | $10,000

  • No percentage symbols: 0.15 ✅ | 15%


Complete Example File

Here's a properly formatted dataset:

Key Features:

  • ✅ First column named Observation

  • ✅ Dates in chronological order

  • ✅ All data values are numbers

  • ✅ Clear, descriptive column names

  • ✅ No special characters or symbols

  • ✅ Binary variable (Holiday: 0 or 1)


Data Requirements

Minimum Data Requirements

Time Periods:

  • Absolute Minimum: 26 observations (6 months of weekly data)

  • Recommended Minimum: 52 observations (1 year of weekly data)

  • Ideal: 104+ observations (2+ years of weekly data)

Rationale: More data = more statistical power to detect marketing effects

Why 52+ weeks is ideal:

  • Captures full annual seasonality (holidays, quarters)

  • Sufficient degrees of freedom for multiple variables

  • Better model stability and convergence


Maximum Data Capacity

Subscription-Based Limits:

Practical Limits: Most models perform well with:

  • 10-30 marketing variables

  • 52-156 time periods (1-3 years of weekly data)


Variable Count Guidelines

How Many Variables Should You Include?

Rule of Thumb: Number of observations should be at least 5× the number of variables

Examples:

Reasoning: Too many variables relative to observations leads to overfitting and unstable estimates.


What Variables to Include?

Essential Variables:

  1. KPI (Dependent Variable): Sales, Revenue, Conversions, Transactions

  2. Marketing Channels: All major paid media (TV, Digital, Radio, Print, etc.)

  3. Baseline/Organic: If available (organic traffic, brand search, etc.)

Recommended Additional Variables: 4. Seasonality: Month dummies, quarter indicators, holiday flags 5. Pricing: Average product price, discount percentage, promotion indicators 6. External Factors: Competitor activity, macroeconomic indicators, weather 7. Distribution: Store count, geographic expansion, availability

Lower Priority Variables:

  • Niche marketing channels with minimal spend

  • Experimental channels with limited history

  • Variables with no expected relationship to KPI


Data Quality Checklist

Before uploading your file, verify:

✅ Structure Checks


✅ Data Checks


✅ Completeness Checks


Common Data Issues and Fixes

Issue 1: First Column Not Named "Observation"

Symptom: Upload fails or model doesn't recognize time periods

Fix: Rename first column to exactly Observation (case-sensitive)


Issue 2: Numbers Formatted as Text

Symptom: Model treats numbers as categories or upload fails

Indicators:

  • Numbers aligned left in Excel (text alignment)

  • Green corner triangles in Excel cells

  • Numbers have leading/trailing spaces

Fix:

  1. Select all data columns

  2. Excel: Data → Text to Columns → Finish

  3. Verify numbers are right-aligned


Issue 3: Currency Symbols in Data

Symptom: Values not recognized as numbers

Example: $10,000 instead of 10000

Fix:

  1. Find & Replace: Remove $, , £ symbols

  2. Find & Replace: Remove commas (,) from numbers

  3. Ensure cells are formatted as "Number" not "Currency"


Issue 4: Missing Data (Blank Cells)

Symptom: Model encounters errors or treats missing as zero

Fix:

  • Option 1 (Preferred): Fill with zero if truly no activity/spend

  • Option 2: Interpolate based on adjacent values

  • Option 3: Exclude time periods with missing data

In MixModeler: Missing values are typically treated as zero, so explicitly enter 0 for clarity.


Issue 5: Inconsistent Time Periods

Symptom: Gaps in weekly/monthly data

Example:

2023-01-01
2023-01-08
2023-01-22  ← Missing 2023-01-15

Fix: Add row for missing period with appropriate data (or zeros if no activity)

Why it matters: Time series models assume consistent intervals between observations.


Issue 6: Wrong Data Format (Long Instead of Wide)

Symptom: Channels listed in rows instead of columns

Fix: Use Excel's Pivot Table to convert:

  1. Create Pivot Table

  2. Rows: Date/Observation

  3. Columns: Channel

  4. Values: Spend/Impressions

  5. Copy pivot results to new sheet


Time Period Best Practices

Advantages:

  • Captures short-term marketing effects

  • Sufficient granularity for most channels

  • 52 data points per year (good statistical power)

When to Use:

  • Digital marketing with frequent campaign changes

  • Retail with weekly promotions

  • Fast-moving consumer goods (FMG)


Monthly Data

Advantages:

  • Easier to aggregate from accounting systems

  • Smoother trends, less noise

  • Works well for longer purchase cycles

When to Use:

  • B2B with long sales cycles

  • Big-ticket purchases (cars, luxury goods)

  • Limited data available (only monthly reports)

Trade-offs:

  • Only 12-36 data points per 1-3 years (lower statistical power)

  • May miss short-term effects


Daily Data (Advanced)

Advantages:

  • Maximum granularity

  • Captures immediate response (search, social)

  • Large sample size (365+ observations per year)

Challenges:

  • High day-to-day volatility

  • Day-of-week effects require additional variables

  • Large datasets (computational considerations)

  • Harder to interpret patterns

When to Use:

  • E-commerce with daily transactions

  • Digital-only businesses

  • When you have 2+ years of daily data

Recommendation: Weekly data is the sweet spot for most MMM applications.


File Preparation Workflow

Step 1: Gather Raw Data

From Marketing Platforms:

  • Google Ads: Campaign spend by week/month

  • Facebook Ads: Ad spend by week/month

  • TV/Radio: Nielsen or equivalent media spend data

  • Print: Direct from invoices or media agency

From Analytics:

  • Google Analytics: Website traffic, conversions

  • CRM: Sales, revenue, transaction data

  • Internal databases: Pricing, promotions, inventory


Step 2: Standardize Time Periods

Ensure Consistency:

  • All data on same frequency (all weekly OR all monthly)

  • Align dates (e.g., all weeks start on Monday)

  • Fill any gaps with zeros or interpolated values

Example Weekly Alignment:

  • Week 1: Jan 1 - Jan 7 (Sunday to Saturday)

  • Week 2: Jan 8 - Jan 14

  • Week 3: Jan 15 - Jan 21


Step 3: Create Master Spreadsheet

Excel Setup:

  1. Open new Excel workbook

  2. Name first column Observation

  3. Add date/period values in column A

  4. Add each variable as subsequent column

  5. Ensure no merged cells or formatting

Column Order:

A: Observation (dates)
B: KPI (Sales, Revenue, Conversions)
C-Z: Marketing Variables (TV, Digital, Radio, etc.)
AA+: Control Variables (Seasonality, Price, etc.)

Step 4: Clean and Validate

Remove:

  • Currency symbols

  • Percentage symbols

  • Thousands separators

  • Extra spaces in column names

  • Special characters

Verify:

  • All data is numeric

  • No blank cells (use 0 if no activity)

  • Dates are chronological

  • No duplicate rows


Step 5: Save and Upload

  1. Save as .xlsx format

  2. Navigate to MixModeler

  3. Click "Upload Data" or "Load Data"

  4. Select your file

  5. Wait for validation confirmation


Post-Upload Validation

After uploading, MixModeler will:

Automatic Checks

✅ File Format Validation

  • Checks file type (.xlsx, .xls, .csv)

  • Verifies file size within limits

✅ Structure Validation

  • Confirms "Observation" column exists

  • Checks for numeric data types

  • Identifies column names

✅ Data Quality Checks

  • Counts observations (warns if < 26)

  • Counts variables (warns if exceeds subscription limit)

  • Identifies missing values


Data Preview

After upload, you'll see:

Summary Statistics:

  • Number of observations

  • Number of variables

  • Date range (first to last observation)

Preview Table:

  • First 10 rows of your data

  • All columns visible

  • Ability to scroll and inspect

Variable List:

  • All detected variable names

  • Data types assigned

  • Summary stats (min, max, mean)


Example Files for Download

MixModeler provides sample templates:

Basic Template

Includes:

  • Observation column (52 weeks)

  • Sales (KPI example)

  • TV_Spend

  • Digital_Spend

  • Radio_Spend

  • Holiday indicator

Use Case: Quick start for simple models


Advanced Template

Includes:

  • Observation column (104 weeks)

  • Multiple KPIs (Sales, Revenue, Units)

  • 10+ marketing channels

  • Seasonality variables (Month dummies)

  • External factors (Competitor index, Economic indicators)

Use Case: Comprehensive MMM with many variables


Troubleshooting Upload Issues

Error: "First column must be named 'Observation'"

Cause: Column A is not named Observation

Fix: Rename first column to exactly Observation (case-sensitive)


Error: "File size exceeds limit"

Cause: File too large for your subscription tier

Solutions:

  1. Remove unnecessary columns

  2. Reduce number of observations (if >200 weeks)

  3. Upgrade subscription plan

  4. Save as .csv (smaller file size)


Error: "Invalid data format detected"

Cause: Non-numeric values in data columns

Fix:

  1. Check for text in numeric columns

  2. Remove currency/percentage symbols

  3. Ensure dates are in first column only


Warning: "Less than 52 observations detected"

Cause: Fewer than recommended data points

Impact: Lower statistical power, less reliable estimates

Options:

  1. Gather more historical data if possible

  2. Proceed with caution (use Bayesian with informative priors)

  3. Aggregate to monthly data (if currently daily/weekly)


Warning: "Variable count exceeds subscription limit"

Cause: Too many columns for your plan

Fix:

  1. Remove low-priority variables

  2. Combine similar channels (e.g., combine Facebook + Instagram)

  3. Upgrade to Professional or Business plan


Data Privacy and Security

On-Device Processing

MixModeler Guarantee:

  • All data stays on your device - never uploaded to cloud servers

  • Processing happens locally in your browser

  • No server-side storage of your data

  • No data transmission to external servers

Technical Implementation:

  • JavaScript-based processing

  • Browser-based computation

  • WebAssembly for performance

  • IndexedDB for temporary local storage only


What Happens to Your Data?

During Session:

  • Data loaded into browser memory

  • Processed locally for modeling

  • Stored temporarily in browser cache

After Session:

  • Data cleared when you close browser

  • Optional: Export models to Excel (your choice)

  • No persistent storage on MixModeler servers

Benefits: ✅ Full GDPR compliance ✅ No data breach risk ✅ Complete control over sensitive data ✅ Ideal for enterprises with strict data policies


Best Practices Summary

✅ Do's

Format Your Data Correctly Wide format, "Observation" as first column, numeric values only

Include Sufficient History 52+ weeks for weekly data, 24+ months for monthly data

Use Descriptive Names TV_National, Digital_Display, Email_Marketing

Clean Before Upload Remove symbols, verify numeric formats, check for blanks

Start Simple Include essential variables first, add complexity iteratively

Validate After Upload Review data preview, check summary statistics


❌ Don'ts

Don't Use Long Format Channels should be columns, not rows

Don't Include Non-Numeric Data Text, symbols, and formatting break the upload

Don't Leave Blanks Fill missing values with 0 or appropriate values

Don't Use Special Characters Stick to letters, numbers, and underscores in names

Don't Upload Without Reviewing Always preview in Excel before upload

Don't Mix Time Frequencies All data should be weekly OR monthly, not mixed


Summary

Key Takeaways:

📋 Wide format required - rows = time periods, columns = variables

📅 "Observation" column mandatory - exact name, first column, unique values

🔢 Numeric data only - remove symbols, currency, percentages

📊 52+ observations ideal - more data = better models

🔒 Privacy-first - your data never leaves your device

Clean before upload - use checklist to validate

📁 Excel or CSV - .xlsx recommended for ease of use

🎯 Include key variables - KPI + marketing channels + seasonality

Once your data is properly formatted and uploaded, you're ready to start building powerful MMM models in MixModeler!

Last updated