Inference in High-dimensional Multivariate Response Regression with Hidden Variables
This paper studies the inference of the regression coefficient matrix under multivariate response linear regressions in the presence of hidden variables. A novel procedure for constructing confidence intervals of entries of the coefficient matrix is proposed. Our method first utilizes the multivariate nature of the responses by estimating and adjusting the hidden effect to construct an initial estimator of the coefficient matrix. By further deploying a low-dimensional projection procedure to reduce the bias introduced by the regularization in the previous step, a refined estimator is proposed and shown to be asymptotically normal. The asymptotic variance of the resulting estimator is derived with closed-form expression and can be consistently estimated. In addition, we propose a testing procedure for the existence of hidden effects and provide its theoretical justification. Both our procedures and their analyses are valid even when the feature dimension and the number of responses exceed the sample size. Our results are further backed up via extensive simulations and a real data analysis.
READ FULL TEXT