GeoLift-SDID: Optimal Test Market Selection Framework¶
Core Objective¶
Identify treatment markets that maximize:
Statistical sensitivity (lowest minimum detectable effect)
Causal inference validity (parallel trends assumption)
ROI measurement precision (narrowest confidence intervals)
Statistical Market Selection Process¶
GeoLift-SDID employs matrix-based synthetic difference-in-differences (SDID) estimation, which requires specific market characteristics for optimal performance. The following process systematically identifies markets that satisfy them.
Implementation Protocol¶
1. Data Engineering Requirements¶
python recipes/data_validator.py --data historical_sales.csv --min-periods 90 --check-balance --check-stationarity
Critical constraints:
Pre-period data: 90+ days (minimum for reliable similarity assessment)
Panel structure: Balanced matrix with no missing values
Stationarity: ADF test p-value < 0.05 for all time series
Variance threshold: Coefficient of variation < 0.5 per market
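The balance, length, missing-value, and variance constraints above can be checked with a few lines of pandas. This is a minimal sketch, not the actual logic of data_validator.py; the function name check_panel and the long-format column names market, date, and sales are assumptions for illustration (the stationarity/ADF check is omitted since it needs statsmodels):

```python
import pandas as pd

def check_panel(df, min_periods=90, max_cv=0.5):
    """Validate a long-format sales panel: balance, length, and per-market variance.
    Returns a list of human-readable issues; an empty list means all checks pass."""
    issues = []
    counts = df.groupby("market")["date"].nunique()
    # Balanced panel: every market must be observed for the same set of periods
    if counts.nunique() > 1:
        issues.append("unbalanced panel: markets have differing period counts")
    if counts.min() < min_periods:
        issues.append(f"pre-period too short: {counts.min()} < {min_periods} days")
    if df["sales"].isna().any():
        issues.append("missing values present")
    # Coefficient of variation (std / mean) per market, compared to the 0.5 threshold
    cv = df.groupby("market")["sales"].agg(lambda s: s.std() / s.mean())
    noisy = cv[cv >= max_cv]
    if not noisy.empty:
        issues.append(f"high-variance markets: {list(noisy.index)}")
    return issues
```

A panel that passes returns an empty issue list; any gap, missing value, or noisy market produces a named failure, which mirrors the fail-fast behavior a validator CLI would want.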
2. Statistical Similarity Calculation¶
python recipes/donor_evaluator.py --config configs/donor_eval_config_generic.yaml --output-dir outputs/similarity_analysis --evaluate-all-pairs
Technical outputs:
similarity_matrix.csv: N×N correlation coefficients between all market pairs
rmse_matrix.csv: Root Mean Squared Error for level matching assessment
mape_matrix.csv: Mean Absolute Percentage Error for relative deviation
dtw_matrix.csv: Dynamic Time Warping distances for pattern matching
Critical metrics for validation:
Correlation coefficient > 0.7 (strong trend similarity)
MAPE < 0.15 (less than 15% average deviation)
Minimum of 10 high-quality donors per potential treatment unit
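The correlation and MAPE gates above can be reproduced directly from a wide matrix of market time series (one column per market). A minimal sketch with the thresholds from the list; the function name qualifying_donors is a hypothetical helper, not part of donor_evaluator.py:

```python
import numpy as np
import pandas as pd

def qualifying_donors(wide, treatment, min_corr=0.7, max_mape=0.15):
    """Return donor markets whose series correlate strongly with the treatment
    market (trend similarity) and stay within the MAPE threshold (level similarity)."""
    t = wide[treatment]
    donors = []
    for m in wide.columns.drop(treatment):
        corr = t.corr(wide[m])                      # Pearson correlation of the two series
        mape = np.mean(np.abs((t - wide[m]) / t))   # mean absolute percentage deviation
        if corr > min_corr and mape < max_mape:
            donors.append(m)
    return donors
```

A treatment candidate whose qualifying donor list comes back shorter than 10 would fail the "minimum of 10 high-quality donors" requirement and should be dropped from consideration.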
3. Statistical Power Optimization¶
python recipes/power_calculator.py --mode market_selection --min-power 0.8 --effect-size 0.1 --duration 30 --alpha 0.05 --bootstrap 1000 --output-dir outputs/market_power
Optimization algorithm outputs:
MDE (Minimum Detectable Effect) for each potential treatment market
Power curves showing sensitivity across effect sizes (0.05-0.20)
Required sample sizes for statistical significance
Donor pool quality scores per treatment candidate
Sensitivity validation thresholds:
MDE threshold: < 0.10 (can detect 10% lift)
Power at expected effect: > 0.8 (80% chance of detecting true effect)
Type I error rate: 0.05 (5% false positive probability)
Type II error rate: < 0.2 (< 20% false negative probability)
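Under a normal approximation, the relationship between these thresholds and the MDE is a standard power calculation: the MDE is the smallest lift that a two-sided test at level α detects with the target power, given the noise level of the estimated lift. This is a textbook simplification of the bootstrap procedure the CLI runs, and sigma_gap (the standard deviation of the treatment-minus-synthetic-control lift estimate) is an assumed input:

```python
from statistics import NormalDist

def mde_normal_approx(sigma_gap, alpha=0.05, power=0.8):
    """Minimum detectable effect for a two-sided z-test:
    MDE = (z_{1 - alpha/2} + z_{power}) * sigma_gap."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # ~0.84 for power = 0.8
    return (z_alpha + z_beta) * sigma_gap
```

For example, a lift estimate with a standard deviation of 3 percentage points gives an MDE of roughly 8.4%, which clears the < 0.10 threshold above; a noisier market with sigma_gap of 0.04 would not.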
4. Market Selection Decision Framework¶
Primary selection algorithm (statistical validity):
import pandas as pd

def rank_treatment_candidates(power_analysis_results, donor_quality_matrix, weights=None):
    """
    Rank markets by statistical detection power weighted by donor quality.

    Parameters:
    -----------
    power_analysis_results : DataFrame
        Contains MDE and power metrics for each potential treatment unit
    donor_quality_matrix : DataFrame
        Contains donor pool quality metrics for each potential treatment unit,
        indexed the same way as power_analysis_results
    weights : dict, optional
        Weighting parameters for the three factors; defaults to
        {'mde': 0.5, 'donor_quality': 0.3, 'stability': 0.2}

    Returns:
    --------
    DataFrame
        Markets with composite scores, ranked best-first
    """
    if weights is None:
        weights = {
            'mde': 0.5,            # Lower is better
            'donor_quality': 0.3,  # Higher is better
            'stability': 0.2       # Lower coefficient of variation is better
        }

    # Normalize metrics to [0, 1] scale
    normalized = {}

    # MDE - lower is better, so invert after min-max scaling
    mde = power_analysis_results['mde']
    normalized['mde'] = 1 - (mde - mde.min()) / (mde.max() - mde.min())

    # Donor quality - higher is better
    dq = donor_quality_matrix['composite_score']
    normalized['donor_quality'] = (dq - dq.min()) / (dq.max() - dq.min())

    # Stability - lower coefficient of variation is better
    cv = power_analysis_results['coefficient_of_variation']
    normalized['stability'] = 1 - (cv - cv.min()) / (cv.max() - cv.min())

    # Calculate composite score as the weighted sum of normalized metrics
    composite_scores = (
        weights['mde'] * normalized['mde']
        + weights['donor_quality'] * normalized['donor_quality']
        + weights['stability'] * normalized['stability']
    )

    # Create ranked dataframe, best candidates first
    ranked_markets = pd.DataFrame({
        'market': power_analysis_results.index,
        'composite_score': composite_scores,
        'mde': power_analysis_results['mde'],
        'donor_quality': donor_quality_matrix['composite_score'],
        'stability': power_analysis_results['coefficient_of_variation']
    }).sort_values('composite_score', ascending=False)

    return ranked_markets
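A self-contained toy run of the same min-max scoring scheme makes the ranking concrete. The market names and metric values below are purely illustrative:

```python
import pandas as pd

# Hypothetical per-market inputs, indexed by market
power = pd.DataFrame(
    {"mde": [0.08, 0.12, 0.10], "coefficient_of_variation": [0.20, 0.40, 0.30]},
    index=["NYC", "CHI", "LAX"],
)
donors = pd.DataFrame({"composite_score": [0.9, 0.6, 0.8]}, index=["NYC", "CHI", "LAX"])

def minmax(s):
    """Scale a Series to [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

score = (
    0.5 * (1 - minmax(power["mde"]))                         # lower MDE is better
    + 0.3 * minmax(donors["composite_score"])                # higher quality is better
    + 0.2 * (1 - minmax(power["coefficient_of_variation"]))  # steadier series is better
)
ranking = score.sort_values(ascending=False)  # NYC first, CHI last
```

NYC is best on all three metrics, so it scores the maximum composite of 1.0; CHI is worst on all three and scores 0. Note that min-max scaling makes scores relative to the candidate set, so a market's score changes when candidates are added or removed.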
Secondary validation criteria (business relevance):
Apply non-statistical constraints after statistical ranking:
Market materiality threshold:
Revenue contribution > 2% of total business
Customer base > 1% of total customers
Sufficient media delivery capacity
Confounding factor exclusion:
No planned promotions during test period
No recent major competitive disruptions
No anticipated supply chain/distribution changes
No seasonal anomalies specific to market
Media efficiency analysis:
CPM/CPC within 15% of network average
Delivery forecasts meeting minimum thresholds
Creative relevance verification
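The thresholds in these business gates are simple enough to encode as a predicate applied after statistical ranking. A sketch under assumed field names (revenue, customers, cpm, has_planned_promo are illustrative, not a real schema):

```python
def passes_business_gates(market, totals):
    """Apply the materiality, media-efficiency, and promotion gates to one market.
    `market` is a dict of per-market figures; `totals` holds business-wide totals."""
    return (
        market["revenue"] / totals["revenue"] > 0.02          # > 2% of total revenue
        and market["customers"] / totals["customers"] > 0.01  # > 1% of customer base
        and abs(market["cpm"] - totals["avg_cpm"]) / totals["avg_cpm"] <= 0.15
        and not market["has_planned_promo"]                   # no confounding promotion
    )
```

Applying this predicate only to the statistically ranked shortlist, rather than up front, preserves the separation between statistical validity and business relevance that the framework prescribes.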
5. Implementation Execution¶
python recipes/market_analyzer.py --input power_analysis_results.csv --donor-matrix donor_quality_matrix.csv --revenue-data market_revenue.csv --output selected_markets.yaml --treatment-count 3 --weights "mde:0.5,donor_quality:0.3,stability:0.2"
Automated execution steps:
Applies ranking algorithm with specified weights
Filters markets by minimal statistical requirements
Applies business constraint validation
Generates final market selection with justification metrics
Creates configuration files for experiment design
Implementation Validation Process¶
After market selection, validate the selection quality:
Statistical verification:
Re-run power analysis with selected markets only
Perform synthetic simulations with known effect sizes
Validate error rates through Monte Carlo simulation
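The Monte Carlo error-rate check above can be sketched in pure Python: simulate many null (zero-lift) gap series, apply the significance test, and confirm the false positive rate lands near α. A simple z-test on daily gaps stands in here for the actual SDID estimator:

```python
import random
from statistics import NormalDist, mean, stdev

def false_positive_rate(n_sims=2000, n_days=30, alpha=0.05, seed=7):
    """Share of null simulations that falsely reject; should sit close to alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # Daily treatment-minus-control gaps under the null (true lift = 0)
        gaps = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
        z = mean(gaps) / (stdev(gaps) / n_days ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims
```

A rate far above 0.05 signals an anti-conservative test (and here the normal critical value on only 30 days is indeed slightly anti-conservative); the same harness run with a nonzero injected lift estimates power and hence the Type II error rate.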
Sensitivity analysis:
Test alternative market selections
Quantify potential bias from selection criteria
Calculate confidence intervals for detection power
Documentation requirements:
Statistical justification for each selected market
Predicted MDE and confidence intervals
Required test duration for reliable inference
Synthetic control quality metrics
Execution Integration¶
Upon market selection finalization:
Create experiment design document with selected markets
Prepare measurement plan with detailed metrics
Configure campaign execution parameters
Implement test/control setup in ad platforms
Establish automated data collection pipeline
Direct activation path: proceed to campaign execution and causal measurement workflow.