Using Post-Double Selection Lasso in Field Experiments
World Bank · 2024-09-27 23:03

Industry Research Summary

Industry Investment Rating
- The report does not provide a specific investment rating for the industry [1][2]

Core Findings
- On average, post-double selection Lasso (PDS Lasso) reduces standard errors by less than 1% compared to standard Ancova [2]
- PDS Lasso does not select variables to model treatment in over half the cases [2]
- The method typically selects very few control variables, with a median of 3 controls [9]
- In over a quarter of cases, standard errors are slightly larger than with Ancova [9]

Methodology and Application
- PDS Lasso is commonly used in field experiments with small sample sizes (100-1,000 observations) [7]
- The method is particularly relevant for developing countries, where survey data has an average attrition rate of 15% [8]
- Researchers typically input a median of 182 controls, but PDS Lasso selects a median of only 3 controls [9]
- The treatment regression step is more likely to select control variables when there is attrition [10]

Performance Analysis
- PDS Lasso leads to minimal changes in treatment estimates, with a median change of 0.01 standard deviations [9]
- The median standard error with PDS Lasso is 99.2% of that with Ancova [9]
- Selecting the penalty parameter by cross-validation can overfit and result in larger standard errors [11]
- PDS Lasso is sometimes less precise than Ancova because it fails to select key variables such as the lagged dependent variable [10]

Practical Recommendations
- Researchers should include the lagged dependent variable in the amelioration set to prevent underfitting [54]
- Researchers should be judicious about the number of control variables they input, avoiding a "kitchen sink" approach [59]
- Missing values in control variables should be handled carefully to avoid reducing the sample size [61]
- When including treatment interactions, the interacting variable should be included in the amelioration set [75]

Limitations and Considerations
- PDS Lasso provides minimal power gains on average compared to Ancova [48]
- The method is most beneficial when there is differential attrition greater than 5% [50]
- Researchers should not anticipate large improvements in power from using PDS Lasso [76]
- The double-selection step may not be necessary in many cases, as it often selects variables not strongly correlated with the outcome [69]
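
The double-selection procedure summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes fixed, hypothetical penalty levels (`lam_y`, `lam_d`) rather than the plug-in penalty usually used in practice. It also simplifies the amelioration set: those variables are forced into the final regression but are still penalized in the two selection steps. The data are simulated, and all function and variable names are made up for illustration. The sketch runs a Lasso of the outcome on the controls, a Lasso of the treatment on the controls, and then OLS of the outcome on the treatment plus the union of selected controls and the amelioration set.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def soft(z, lam):
    # Soft-thresholding operator used by the Lasso coordinate update.
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_select(cols, y, lam, n_iter=50):
    """Coordinate-descent Lasso for (1/2n)||y - Xb||^2 + lam*||b||_1.
    cols is a list of centered columns; returns the set of selected indices."""
    n, p = len(y), len(cols)
    ss = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for j in range(p):
            if ss[j] == 0.0:
                continue
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + ss[j] * beta[j]
            new = soft(rho, lam) / ss[j]
            if new != beta[j]:
                step = new - beta[j]
                resid = [r - step * c for r, c in zip(resid, cols[j])]
                beta[j] = new
    return {j for j in range(p) if beta[j] != 0.0}

def ols(cols, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(cols)
    A = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]

def pds_lasso(y, d, controls, amelioration=(), lam_y=0.1, lam_d=0.1):
    """Return (treatment estimate, sorted indices of controls kept)."""
    yc, dc = center(y), center(d)
    Xc = [center(c) for c in controls]
    sel_y = lasso_select(Xc, yc, lam_y)   # step 1: controls that predict the outcome
    sel_d = lasso_select(Xc, dc, lam_d)   # step 2: controls that predict treatment
    keep = sorted(sel_y | sel_d | set(amelioration))
    coefs = ols([dc] + [Xc[j] for j in keep], yc)  # step 3: OLS on the union
    return coefs[0], keep

# Simulated experiment (hypothetical): control 0 plays the role of the
# lagged dependent variable; the true treatment effect is 0.5.
random.seed(0)
n, p = 400, 20
controls = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
d = [float(random.random() < 0.5) for _ in range(n)]
y = [0.5 * d[i] + 1.0 * controls[0][i] + random.gauss(0, 0.5) for i in range(n)]

tau, keep = pds_lasso(y, d, controls, amelioration=[0])
print(round(tau, 3), keep)
```

Passing `amelioration=[0]` mirrors the recommendation to always keep the lagged dependent variable, so the final regression cannot underfit by dropping it. Because treatment is randomized in this simulation, the treatment-equation Lasso will often select nothing, which is consistent with the finding reported above.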