Generalized Matrix Decomposition Regression: Estimation and Inference for Two-way Structured Data

by Yue Wang, et al.

This paper studies high-dimensional regression with two-way structured data. To estimate the high-dimensional coefficient vector, we propose generalized matrix decomposition regression (GMDR), which efficiently leverages any auxiliary information on row and column structures. GMDR extends principal component regression (PCR) to two-way structured data, but unlike PCR, GMDR selects the components that are most predictive of the outcome, leading to more accurate prediction. For inference on the regression coefficients of individual variables, we propose generalized matrix decomposition inference (GMDI), a general high-dimensional inferential framework for a large family of estimators that includes the proposed GMDR estimator. GMDI provides more flexibility for modeling relevant auxiliary row and column structures. As a result, GMDI does not require the true regression coefficients to be sparse; it also allows dependent and heteroscedastic observations. We study the theoretical properties of GMDI in terms of both the type-I error rate and power, and demonstrate the effectiveness of GMDR and GMDI in simulation studies and an application to human microbiome data.
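To make the contrast with PCR concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the row and column structures are supplied as symmetric positive definite matrices, here called `h` (n by n) and `q` (p by p). It computes a generalized matrix decomposition X = U S V^T with U^T H U = I and V^T Q V = I through an ordinary SVD of H^{1/2} X Q^{1/2}. Components are then ranked by the size of their H-weighted association with the outcome rather than by singular value. That ranking rule, and the names `gmd`, `gmdr_sketch`, and `n_components`, are illustrative assumptions; the selection procedure in the paper may differ.

```python
import numpy as np


def _sym_power(a, power):
    """Power of a symmetric positive definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w**power) @ v.T


def gmd(x, h, q):
    """Return U, s, V with x = U diag(s) V^T, U^T h U = I, V^T q V = I.

    Assumes h (n x n) and q (p x p) are symmetric positive definite.
    """
    h_half = _sym_power(h, 0.5)
    q_half = _sym_power(q, 0.5)
    # Ordinary SVD of the transformed matrix h^{1/2} x q^{1/2}.
    u_tilde, s, vt_tilde = np.linalg.svd(h_half @ x @ q_half, full_matrices=False)
    # Map the singular vectors back to the original coordinates.
    u = np.linalg.solve(h_half, u_tilde)
    v = np.linalg.solve(q_half, vt_tilde.T)
    return u, s, v


def gmdr_sketch(x, y, h, q, n_components):
    """Regress y on the GMD components most associated with y (illustrative rule)."""
    u, s, v = gmd(x, h, q)
    # Drop numerically null components before dividing by singular values.
    keep = s > 1e-10 * s.max()
    u, s, v = u[:, keep], s[keep], v[:, keep]
    # Because U^T h U = I, each coefficient is an h-weighted inner product with y.
    gamma = u.T @ h @ y
    # Unlike PCR, rank components by |gamma| instead of by singular value.
    chosen = np.argsort(-np.abs(gamma))[:n_components]
    # Map component coefficients back to the p original variables.
    beta = q @ v[:, chosen] @ (gamma[chosen] / s[chosen])
    return beta, chosen
```

Under these assumptions, keeping every component of a full-column-rank X recovers the generalized least squares solution (X^T H X)^{-1} X^T H y, and setting both `h` and `q` to identity matrices reduces the decomposition to the ordinary SVD used in PCR.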


