Non-parametric Causal Inference in Dynamic Thresholding Designs
Abstract
Consider a setting where we regularly monitor patients' fasting blood sugar, and declare them to have prediabetes (and encourage preventative care) if this number crosses a pre-specified threshold. The sharp, threshold-based treatment policy suggests that we should be able to estimate the long-term benefit of this preventative care by comparing the health trajectories of patients with blood sugar measurements right above and below the threshold. A naive regression-discontinuity analysis, however, is not applicable here, as it ignores the temporal dynamics of the problem where, e.g., a patient just below the threshold on one visit may become prediabetic (and receive treatment) following their next visit. Here, we study thresholding designs in general dynamic systems, and show that simple reduced-form characterizations remain available for a relevant causal target, namely a dynamic marginal policy effect at the treatment threshold. We develop a local-linear-regression approach for estimation and inference of this estimand, and demonstrate promise of our approach in numerical experiments.
Summary
This paper addresses the challenge of causal inference in dynamic thresholding designs, where treatment assignment is based on whether a running variable crosses a threshold, and units can be repeatedly eligible for treatment over time. The main research question is how to estimate the causal effect of such dynamic thresholding policies, particularly when standard regression discontinuity (RD) methods are inadequate due to temporal dependencies. The authors propose a non-parametric approach that leverages the framework of Robins' g-formula and insights from reinforcement learning, specifically the policy gradient theorem. They characterize a dynamic marginal policy effect, which represents the cost-benefit tradeoff of infinitesimally lowering the treatment threshold. They develop a twice-discounted local linear regression estimator for both finite and infinite horizon settings, which accounts for the temporal dynamics and carryover effects of treatment. The estimator's asymptotic properties, including consistency and asymptotic normality, are established under certain regularity conditions. Numerical experiments demonstrate the estimator's performance and compare it against simpler baseline methods. The paper's contribution lies in providing a rigorous framework and a practical estimation method for causal inference in dynamic thresholding designs. This is important because many real-world policies, especially in healthcare and public policy, are based on dynamic thresholds, and accurately evaluating their long-term effects requires methods that account for treatment dynamics. The paper bridges the gap between traditional RD designs and the complexities of dynamic systems, offering a valuable tool for policy evaluation in these settings.
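To make the temporal-dependence problem concrete, the following is a minimal simulation sketch, not the paper's model. It assumes a hypothetical AR(1) running variable (parameters `rho`, `noise_sd`), a fixed threshold, and a narrow `window` just below that threshold. All of these names and dynamics are illustrative assumptions. It reports how many units that start just below the threshold cross it at a later visit, which is the reason a static comparison at the first visit does not cleanly separate treated from untreated units.

```python
import numpy as np

def simulate_later_treatment(n=20000, horizon=5, threshold=0.0, rho=0.8,
                             noise_sd=0.5, window=0.1, seed=0):
    """Toy model (all dynamics are assumptions, not taken from the paper).

    Returns the share of units that start just below the threshold at t = 0
    but cross it (and hence would be treated) at some later visit t <= horizon.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=n)               # running variable at t = 0
    just_below = (x < threshold) & (x >= threshold - window)
    crossed_later = np.zeros(n, dtype=bool)
    for _ in range(horizon):
        # Assumed AR(1) evolution of the running variable between visits.
        x = rho * x + rng.normal(0.0, noise_sd, size=n)
        crossed_later |= x >= threshold
    return crossed_later[just_below].mean()

share = simulate_later_treatment()
print(f"Share of just-below units crossing within 5 later visits: {share:.2f}")
```

In this toy setup a large share of the "control" units near the threshold end up treated within a few visits, so the first-visit contrast mixes immediate and delayed treatment.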
Key Insights
- The paper provides a novel characterization of the RD estimand in dynamic settings as a *marginal policy effect*, representing a cost-benefit analysis of changing the threshold.
- A key theoretical contribution is *Theorem 2*, which expresses the policy gradient in terms of Q-functions and conditional densities, accounting for carryover effects and dynamic threshold proximity.
- The proposed *twice-discounted local linear regression estimator* consistently estimates the dynamic marginal policy effect in both finite and infinite horizon settings (Theorems 4 & 8).
- The estimator achieves the standard *non-parametric rate of n^{-2/5}* for local linear regression-based RD estimators under appropriate smoothness conditions (Theorems 5 & 9).
- The asymptotic variance is explicitly characterized and shown to aggregate uncertainty across all time periods, weighted by the discount factor (Theorem 5); it scales as O((1-γ)^{-1}) as γ approaches 1.
- The paper provides *asymptotically valid confidence intervals* for the dynamic marginal policy effect, enabling statistical inference (Corollaries 7 & 11).
- Numerical experiments show that confidence intervals from standard local linear regression (LLR) methods substantially undercover the true dynamic marginal policy effect (Tables 1 & 2), highlighting the need for the proposed method.
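For reference, the static baseline that these insights compare against can be sketched as follows. This is a plain sharp-RD local linear regression with a triangular kernel, not the paper's twice-discounted estimator. The function name `local_linear_rd`, the simulated data, and the bandwidth constant are assumptions made for illustration. The bandwidth shrinks at the n^{-1/5} rate, the usual choice that balances squared bias against variance and gives the n^{-2/5} rate for local linear fits.

```python
import numpy as np

def local_linear_rd(x, y, threshold, bandwidth):
    """Static sharp-RD estimate: jump in E[Y | X = x] at the threshold.

    Fits a separate triangular-kernel-weighted linear regression on each
    side of the threshold and returns the difference of the intercepts.
    """
    def side_intercept(mask):
        xs, ys = x[mask] - threshold, y[mask]
        w = np.clip(1.0 - np.abs(xs / bandwidth), 0.0, None)  # triangular kernel
        keep = w > 0
        design = np.column_stack([np.ones(keep.sum()), xs[keep]])
        sw = np.sqrt(w[keep])
        # Weighted least squares via rescaling rows by sqrt(weight).
        coef, *_ = np.linalg.lstsq(design * sw[:, None], ys[keep] * sw, rcond=None)
        return coef[0]
    return side_intercept(x >= threshold) - side_intercept(x < threshold)

# Hypothetical data with a true jump of 2.0 at threshold 0 (illustration only).
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1.0, 1.0, size=n)
y = 1.0 + 0.5 * x + 2.0 * (x >= 0) + rng.normal(0.0, 0.3, size=n)
h = 1.0 * n ** (-1 / 5)       # assumed constant 1.0; rate n^{-1/5}
estimate = local_linear_rd(x, y, threshold=0.0, bandwidth=h)
print(estimate)
```

In a dynamic design this contrast only captures the first-visit discontinuity, which is why the summary describes an estimator that additionally accounts for later periods.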
Practical Implications
- The research has direct applications in *evaluating healthcare policies* that use dynamic thresholding rules, such as prediabetes diagnosis based on blood sugar levels (as motivated by Iizuka et al. (2021)).
- *Policy makers and healthcare administrators* can use the proposed methods to assess the long-term cost-benefit tradeoffs of different thresholding policies and optimize treatment strategies.
- *Practitioners can implement the twice-discounted local linear regression estimator* using readily available statistical software, following the procedures outlined in the paper.
- *Future research* could focus on extending the methods to fuzzy RD designs, developing data-driven bandwidth selection procedures specific to dynamic thresholding, and investigating the impact of time fixed effects.
- The framework opens up avenues for *developing optimal dynamic treatment regimes* by leveraging the estimated Q-functions and policy gradients.