Chi-square and normal inference in high-dimensional multi-task regression

07/16/2021
by Pierre C. Bellec, et al.

The paper proposes chi-square and normal inference methodologies for the unknown coefficient matrix B^* of size p× T in a Multi-Task (MT) linear model with p covariates, T tasks and n observations, under a row-sparsity assumption on B^*. The row-sparsity s, dimension p and number of tasks T are allowed to grow with n. In the high-dimensional regime p ≫ n, the MT Lasso is considered in order to leverage row-sparsity. We build upon the MT Lasso with a de-biasing scheme to correct for the bias induced by the penalty. This scheme requires the introduction of a new data-driven object, coined the interaction matrix, that captures effective correlations between the noise vector and the residuals on different tasks. This matrix is positive semi-definite, of size T× T, and can be computed efficiently. The interaction matrix lets us derive asymptotic normal and χ^2_T results under Gaussian design and (sT+s log(p/s))/n → 0, which corresponds to consistency in Frobenius norm. These asymptotic distribution results yield valid confidence intervals for single entries of B^* and valid confidence ellipsoids for single rows of B^*, for both known and unknown design covariance Σ. While previous proposals in grouped-variables regression require row-sparsity s ≲ √n up to constants depending on T and logarithmic factors in n, p, the de-biasing scheme using the interaction matrix provides confidence intervals and χ^2_T confidence ellipsoids under the conditions min(T^2, log^8 p)/n → 0 and (sT + s log(p/s) + ‖Σ^{-1}e_j‖_0 log p)/n → 0, min(s, ‖Σ^{-1}e_j‖_0) √([T+log(p/s)] log p)/√n → 0, allowing row-sparsity s ≫ √n when ‖Σ^{-1}e_j‖_0 √T ≪ √n up to logarithmic factors.
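To make the setting concrete, here is a minimal numpy sketch of the MT linear model and the MT Lasso estimator the abstract refers to. The proximal-gradient solver below is my own illustrative implementation (not the paper's code), and the final T× T matrix is only a simple positive semi-definite summary of residual cross-correlations, used as a stand-in: the paper's actual interaction matrix has a specific data-driven definition that is not reproduced in this abstract.

```python
import numpy as np

def mt_lasso(X, Y, lam, n_iter=500):
    """Multi-task Lasso: minimize ||Y - X B||_F^2 / (2n) + lam * sum_j ||B_j||_2
    over B of size p x T, via proximal gradient descent with row-wise
    group soft-thresholding (an illustrative solver, not the paper's)."""
    n, p = X.shape
    T = Y.shape[1]
    B = np.zeros((p, T))
    # Step size 1/L where L is the Lipschitz constant of the smooth part's gradient.
    L = np.linalg.norm(X, 2) ** 2 / n
    for _ in range(n_iter):
        G = X.T @ (X @ B - Y) / n          # gradient of the least-squares term
        Z = B - G / L
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        with np.errstate(divide="ignore", invalid="ignore"):
            shrink = np.where(norms > 0, np.maximum(0.0, 1 - lam / (L * norms)), 0.0)
        B = Z * shrink                     # rows with small norm are zeroed out
    return B

# Row-sparse MT model: only the first s rows of B^* are nonzero.
rng = np.random.default_rng(0)
n, p, T, s = 100, 30, 3, 4
X = rng.standard_normal((n, p))
B_star = np.zeros((p, T))
B_star[:s] = rng.standard_normal((s, T))
Y = X @ B_star + 0.5 * rng.standard_normal((n, T))

B_hat = mt_lasso(X, Y, lam=0.2)
R = Y - X @ B_hat                          # n x T residual matrix
# T x T psd summary of residual correlations across tasks (a placeholder,
# NOT the paper's interaction matrix).
M = R.T @ R / n
```

The group (row-wise) soft-thresholding step is what enforces row-sparsity: an entire row of B is set to zero at once, so a covariate is either used by all tasks or by none, which is exactly the structure the abstract's row-sparse assumption encodes.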
