Computes bandwidth(s) for local polynomial estimation in a regression discontinuity setting with distribution-valued outcomes. Implements a three-step pilot procedure that returns either MSE-optimal bandwidths (one per quantile) or a single IMSE-optimal bandwidth, depending on method. For details, see the Appendix of Van Dijcke (2025).

r3d_bwselect(
  X,
  Y_list,
  T = NULL,
  q_grid = seq(0.1, 0.9, 0.1),
  method = c("simple", "frechet"),
  s = 1,
  p = 2,
  kernel = function(u) pmax(0, 1 - abs(u)),
  cutoff = 0,
  fuzzy = FALSE,
  coverage = FALSE,
  ...
)

Arguments

X

Numeric vector of the running variable.

Y_list

A list of numeric vectors; each entry is the sample of outcomes from one unit's distribution.

T

(Optional) Numeric or logical vector of treatment statuses for fuzzy design.

q_grid

Numeric vector of quantiles at which local polynomial fits are performed.

method

Either "simple" (per-quantile MSE-optimal) or "frechet" (single IMSE-optimal bandwidth).

s

Integer specifying the order of the local polynomial in the pilot stage (often 1).

p

Integer specifying the final local polynomial order (often 2).

kernel

Kernel function for local weighting. Defaults to triangular kernel.

cutoff

Numeric scalar threshold. The running variable is recentered as X - cutoff, so the cutoff sits at 0.

fuzzy

Logical indicating whether the design is fuzzy. Default is FALSE.

coverage

Logical indicating whether to apply the coverage correction rule of thumb of Calonico et al. (2018). Default is FALSE.

...

Additional arguments, reserved for future extensions.
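
The inputs described above can be assembled in base R. The following is a minimal sketch on simulated data, not code from the package: the data-generating process, the names tri_kernel, epa_kernel and Q_mat, and the use of quantile() with its default type are all illustrative assumptions. It shows the expected shape of Y_list (one numeric vector per unit, aligned with X), the default triangular kernel from the usage above, and what evaluating each unit's empirical quantile function on q_grid looks like.

```r
set.seed(1)
n <- 50
X <- runif(n, -1, 1)                      # running variable, one value per unit

# One sample of outcomes per unit; sample sizes may differ across units.
Y_list <- lapply(seq_len(n), function(i) rnorm(20 + i %% 5, mean = X[i]))

# Default kernel from the usage above: triangular, supported on [-1, 1].
tri_kernel <- function(u) pmax(0, 1 - abs(u))

# Hypothetical alternative with the same interface (Epanechnikov kernel).
epa_kernel <- function(u) 0.75 * pmax(0, 1 - u^2)

# Empirical quantiles of each unit's sample on q_grid (an n x 9 matrix).
# Assumption: base quantile() with its default type; the package may differ.
q_grid <- seq(0.1, 0.9, 0.1)
Q_mat <- t(sapply(Y_list, quantile, probs = q_grid, names = FALSE))
dim(Q_mat)
```

Any kernel passed in this way should be a vectorized function of a single numeric argument, as both examples are.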

Value

A list with elements:

method

Method used: "simple" or "frechet".

q_grid

Input q_grid.

h_star_num

Bandwidth(s) for numerator (outcome).

h_star_den

Bandwidth for denominator (treatment, if fuzzy).

pilot_h_num

Pilot bandwidth(s) for numerator.

pilot_h_den

Pilot bandwidth for denominator (if fuzzy).

s, p

Polynomial orders.

B_plus, B_minus

Bias estimates for numerator.

V_plus, V_minus

Variance estimates for numerator.

f_X_hat

Estimated density of \(X\) at cutoff.
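
A call and a look at the returned elements might run as follows. This is a hedged sketch on simulated data: the data-generating process is invented for illustration, and the package name "R3D" is an assumption not stated on this page, so the call is guarded and skipped when that package is not installed.

```r
set.seed(42)
n <- 200
X <- runif(n, -1, 1)

# Illustrative DGP: each unit's outcome distribution shifts up by 1 at the cutoff.
Y_list <- lapply(X, function(x) rnorm(30, mean = x + (x >= 0)))

# Assumption: r3d_bwselect() is exported by a package named "R3D".
if (requireNamespace("R3D", quietly = TRUE)) {
  bw <- R3D::r3d_bwselect(X, Y_list, method = "frechet",
                          s = 1, p = 2, cutoff = 0)
  print(bw$method)       # method used
  print(bw$h_star_num)   # bandwidth(s) for the numerator (outcome)
  print(bw$f_X_hat)      # estimated density of X at the cutoff
}
```

With method = "simple", h_star_num would instead hold one bandwidth per entry of q_grid, per the description above.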

Details

Implements a three-step procedure:

  1. Estimates the density of the running variable at the cutoff, \(f_X(0)\), using Silverman’s rule of thumb, and computes pilot bandwidths via global polynomial fits.

  2. Runs pilot local polynomial regressions to estimate bias and variance.

  3. Computes MSE-optimal (per-quantile) or IMSE-optimal (single) bandwidths.
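
The density part of step 1 can be sketched in base R. This is an illustration under assumptions, not the package's internal code: it uses stats::bw.nrd0() as the Silverman rule-of-thumb bandwidth and the triangular kernel as the smoothing kernel, and the package may use different constants or a different kernel.

```r
set.seed(7)
X <- rnorm(500)            # simulated running variable (illustrative)
cutoff <- 0
Xc <- X - cutoff           # recenter so the cutoff sits at 0

# Silverman's rule-of-thumb bandwidth as implemented in base R.
h_f <- bw.nrd0(Xc)

# Triangular kernel; it integrates to 1, so no extra normalization is needed.
tri_kernel <- function(u) pmax(0, 1 - abs(u))

# Kernel density estimate of f_X evaluated at the (recentered) cutoff.
f_X_hat <- mean(tri_kernel(Xc / h_f)) / h_f
f_X_hat
```

For standard normal draws the estimate should land near dnorm(0), which is about 0.399.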

In the fuzzy design, separate bandwidths are computed for the numerator (outcome) and denominator (treatment).
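
When coverage = TRUE, the selected bandwidth is shrunk by a rule-of-thumb factor. The sketch below shows the generic form of that factor implied by the rates in Calonico et al. (2018): an MSE-optimal bandwidth of order n^(-1/(2p+3)) is rescaled to the coverage-error rate n^(-1/(p+3)). The values of n, p and h_mse are hypothetical, and which polynomial order and sample size the function plugs in is an assumption not confirmed on this page.

```r
n     <- 1000   # sample size (hypothetical)
p     <- 2      # local polynomial order (hypothetical choice)
h_mse <- 0.5    # MSE-optimal bandwidth (hypothetical value)

# Ratio of the two rates: n^(-1/(p+3)) / n^(-1/(2p+3)).
shrink <- n^(-p / ((2 * p + 3) * (p + 3)))

# Coverage-corrected bandwidth; always smaller than h_mse when n > 1 and p > 0.
h_ce <- h_mse * shrink
c(shrink = shrink, h_ce = h_ce)
```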

References

Calonico S, Cattaneo MD, Farrell MH (2018). “On the effect of bias estimation on coverage accuracy in nonparametric inference.” Journal of the American Statistical Association, 113(522), 767–779.

Van Dijcke D (2025). “Regression Discontinuity Design with Distribution-Valued Outcomes.” Working paper.