Confidence Sets Based on Thresholding Estimators in High-Dimensional Gaussian Regression Models
We study confidence intervals based on hard-thresholding, soft-thresholding, and adaptive soft-thresholding in a linear regression model where the number of regressors k may depend on and diverge with sample size n. In addition to the case of known error variance, we define and study versions of the estimators when the error variance is unknown. In the known variance case, we provide an exact analysis of the coverage properties of such intervals in finite samples. We show that these intervals are always larger than the standard interval based on the least-squares estimator. Asymptotically, the intervals based on the thresholding estimators are larger even by an order of magnitude when the estimators are tuned to perform consistent variable selection. For the unknown-variance case, we provide non-trivial lower bounds for the coverage probabilities in finite samples and conduct an asymptotic analysis where the results from the known-variance case can be shown to carry over asymptotically if the number of degrees of freedom n-k tends to infinity fast enough in relation to the thresholding parameter.
READ FULL TEXT