Rodeo: Sparse Nonparametric Regression in High Dimensions

John Lafferty and Larry Wasserman


We present a method for simultaneously performing bandwidth selection and variable selection in nonparametric regression. The method starts with a local linear estimator with large bandwidths, and incrementally decreases the bandwidth in directions where the gradient of the estimator with respect to bandwidth is large. When the unknown function satisfies a sparsity condition, the approach avoids the curse of dimensionality. The method, called rodeo (regularization of derivative expectation operator), conducts a sequence of hypothesis tests and is easy to implement. A modified version that replaces testing with soft thresholding may be viewed as solving a sequence of lasso problems. When applied in one dimension, the rodeo yields a method for choosing the locally optimal bandwidth.

Keywords: Nonparametric regression, sparsity, local linear smoothing, adaptive estimation, bandwidth estimation, variable selection.
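
The procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a product Gaussian kernel, a known noise level sigma, a finite-difference approximation to the derivative of the estimator with respect to each bandwidth, and a test threshold of the form s_j * sqrt(2 log n), none of which are specified in the abstract. The names local_linear_weights and rodeo_sketch, and the parameters h0, beta and h_min, are hypothetical.

```python
import numpy as np


def local_linear_weights(X, x, h):
    """Return weights l such that the local linear estimate at x is l @ y."""
    U = X - x                                         # centered design, shape (n, d)
    w = np.exp(-0.5 * np.sum((U / h) ** 2, axis=1))   # product Gaussian kernel (assumed)
    D = np.hstack([np.ones((len(X), 1)), U])          # columns: intercept, centered covariates
    A = D.T @ (w[:, None] * D)                        # D^T W D, symmetric
    e0 = np.zeros(D.shape[1])
    e0[0] = 1.0
    # First row of (D^T W D)^{-1} D^T W picks out the fitted intercept.
    return (np.linalg.solve(A, e0) @ D.T) * w


def rodeo_sketch(X, y, x, sigma, h0=1.0, beta=0.9, h_min=0.02):
    """Greedy bandwidth shrinking at a single query point x (illustrative only)."""
    n, d = X.shape
    h = np.full(d, float(h0))       # start with large bandwidths in every direction
    active = set(range(d))
    while active:
        for j in sorted(active):
            # Finite-difference derivative of the weights with respect to h_j.
            eps = 1e-4 * h[j]
            h_plus = h.copy()
            h_plus[j] += eps
            G = (local_linear_weights(X, x, h_plus)
                 - local_linear_weights(X, x, h)) / eps
            Z = G @ y                                  # derivative of the estimate
            # Assumed threshold: noise standard deviation of Z times sqrt(2 log n).
            lam = sigma * np.linalg.norm(G) * np.sqrt(2.0 * np.log(n))
            if abs(Z) > lam and h[j] * beta >= h_min:
                h[j] *= beta                           # derivative is large: shrink
            else:
                active.discard(j)                      # freeze this bandwidth
    return h, local_linear_weights(X, x, h) @ y
```

Calling rodeo_sketch(X, y, x, sigma) with X of shape (n, d) returns the selected bandwidth vector and the fitted value at x. In this sketch, a direction whose bandwidth never leaves h0 is one where the test found no evidence that smoothing less would change the estimate, which is how variable selection and bandwidth selection happen together. The h_min guard is only there to stop the local design matrix from becoming singular.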

Heidi Sestrich 2005-06-20