Abstract: Marketing research frequently uses difference-in-differences designs in which subjects self-select into treatment. Such designs are vulnerable to bias when unobserved individual characteristics that influence both selection and outcomes cause the parallel trends assumption to be violated. Negative control variables are variables that either do not affect the outcome or are not influenced by the treatment of interest. Building on proximal causal inference, the authors propose a semiparametric doubly robust estimator that incorporates a pair of negative control variables to correct for bias from omitted covariates.
The proposed estimator is well suited to settings with staggered adoption and cohort heterogeneity. To demonstrate the practical value of the approach, the authors analyze the impact of a Buy-Now-Pay-Later program on an online food ordering and delivery platform, using a longitudinal dataset of over 115,000 users. Unobserved disposable income may influence both self-selection into the program and users’ spending. Using discount-incentive usage and district-level house prices as negative control variables allows adjustment for this potential source of endogeneity. The results show that the standard two-way fixed effects estimator overestimates the average treatment effect on the treated by 28.4%, and by as much as 42.6% among early adopters. An R package, ProxDiD, supports implementation of the method.