Multiple Testing and Variable Selection along Least Angle Regression's path

06/28/2019
by J.-M. Azaïs, et al.

In this article we investigate the outcomes of the standard Least Angle Regression (LAR) algorithm in high dimensions under the Gaussian noise assumption. We give the exact law of the sequence of knots conditional on the sequence of variables entering the model, i.e., the post-selection law of the knots of the LAR. Based on this result, we prove an exact control of the False Discovery Rate (FDR) in the orthogonal design case and an exact control of the existence of false negatives in the general design case. First, we build a sequence of testing procedures on the variables entering the model and we give an exact control of the FDR in the orthogonal design case, even when the noise level is unknown. Second, we introduce a new exact testing procedure on the existence of false negatives, again allowing the noise level to be unknown. This test can be deployed after any support selection procedure that produces an estimate of the support (i.e., the indices of the nonzero coefficients), for any design. Its type I error is exactly controlled as long as the selection procedure satisfies some elementary hypotheses; we refer to such procedures as admissible selection procedures. These are the procedures in which the estimated support consists of the first k variables entering the model, where the random variable k is a stopping time. Monte-Carlo simulations and a real data experiment are provided to illustrate our results.
