A Generalized Least Squares Matrix Decomposition

02/15/2011
by Genevera I. Allen, et al.

Variables in many massive high-dimensional data sets are structured, arising for example from measurements on a regular grid, as in imaging and time series, or from spatial-temporal measurements, as in climate studies. Classical multivariate techniques ignore these structural relationships, often resulting in poor performance. We propose a generalization of the singular value decomposition (SVD) and principal components analysis (PCA) that is appropriate for massive data sets with structured variables or known two-way dependencies. By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the Generalized least squares Matrix Decomposition (GMD), directly accounts for structural relationships. Because many variables in high-dimensional settings are irrelevant or noisy, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. Using our methods, we develop fast computational algorithms to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on massive data sets. Through simulations and a whole-brain functional MRI example, we demonstrate the utility of our methodology for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.
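
The abstract does not give the computation itself, so the following is only a minimal sketch of the central idea, a best low-rank approximation under a quadratic norm weighted on both rows and columns. It assumes the norm has the form tr(Q A R Aᵀ) with symmetric positive-definite weight matrices Q (rows) and R (columns). Under that assumption the problem reduces to an ordinary SVD of Q^{1/2} X R^{1/2}. The names `sym_sqrt`, `gmd`, and `qr_norm_sq` are hypothetical, the data are random, and this dense approach is not the fast algorithm for massive data sets that the abstract refers to.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, E = np.linalg.eigh(M)
    return (E * np.sqrt(w)) @ E.T

def gmd(X, Q, R, k):
    """Rank-k factors U, d, V with U.T @ Q @ U = I and V.T @ R @ V = I.

    Assumes Q (n x n) and R (p x p) are symmetric positive definite.
    """
    Qh, Rh = sym_sqrt(Q), sym_sqrt(R)
    # Ordinary SVD of the transformed data Q^{1/2} X R^{1/2}.
    Ut, d, VtT = np.linalg.svd(Qh @ X @ Rh, full_matrices=False)
    # Map the leading singular vectors back: U = Q^{-1/2} Ut, V = R^{-1/2} Vt.
    U = np.linalg.solve(Qh, Ut[:, :k])
    V = np.linalg.solve(Rh, VtT[:k].T)
    return U, d[:k], V

def qr_norm_sq(A, Q, R):
    """Squared two-way weighted norm tr(Q A R A^T) (assumed form)."""
    return np.trace(Q @ A @ R @ A.T)

# Illustrative random data and weight matrices (not from the paper).
rng = np.random.default_rng(0)
n, p, k = 6, 5, 2
X = rng.standard_normal((n, p))
A = rng.standard_normal((n, n))
B = rng.standard_normal((p, p))
Q = A @ A.T + n * np.eye(n)   # positive definite by construction
R = B @ B.T + p * np.eye(p)

U, d, V = gmd(X, Q, R, k)
X_hat = U @ np.diag(d) @ V.T  # rank-k approximation of X
print(qr_norm_sq(X - X_hat, Q, R))
```

With Q and R both set to identity matrices, the sketch reduces to the usual truncated SVD, which is one way to sanity-check it.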

research
09/11/2013

Sparse and Functional Principal Components Analysis

Regularized principal components analysis, especially Sparse PCA and Fun...
research
02/11/2012

Regularized Tensor Factorizations and Higher-Order Principal Components Analysis

High-dimensional tensors or multi-way data are becoming prevalent in are...
research
10/25/2021

Fast estimation method for rank of a high-dimensional sparse matrix

Numerical computing the rank of a matrix is a fundamental problem in sci...
research
04/16/2021

Generalized Matrix Decomposition Regression: Estimation and Inference for Two-way Structured Data

This paper studies high-dimensional regression with two-way structured d...
research
10/23/2017

SMSSVD - SubMatrix Selection Singular Value Decomposition

High throughput biomedical measurements normally capture multiple overla...
research
03/19/2016

L0-norm Sparse Graph-regularized SVD for Biclustering

Learning the "blocking" structure is a central challenge for high dimens...
research
04/17/2012

Regularized Partial Least Squares with an Application to NMR Spectroscopy

High-dimensional data common in genomics, proteomics, and chemometrics o...
