Scalable High-Dimensional Multivariate Linear Regression for Feature-Distributed Data

07/07/2023
by Shuo-Chieh Huang, et al.
Feature-distributed data, that is, data partitioned by features and stored across multiple computing nodes, are increasingly common in applications with a large number of features. This paper proposes a two-stage relaxed greedy algorithm (TSRGA) for applying multivariate linear regression to such data. The main advantage of TSRGA is that its communication complexity does not depend on the feature dimension, making it highly scalable to very large data sets. In addition, for multivariate response variables, TSRGA can be used to yield low-rank coefficient estimates. The fast convergence of TSRGA is validated by simulation experiments. Finally, we apply the proposed TSRGA in a financial application that leverages unstructured data from 10-K reports, demonstrating its usefulness in applications with many dense large-dimensional matrices.
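To make the feature-distributed setting concrete, the following is a minimal sketch in plain Python. It is not TSRGA and is not taken from the paper. It only illustrates, under assumed toy dimensions, how a linear prediction can be assembled when each node holds a block of columns: every node sends a length-n vector of partial predictions, so the amount communicated depends on the sample size and the number of nodes, not on the number of features. All function names (`split_features`, `local_prediction`, `distributed_prediction`) are hypothetical, and a single response variable is assumed for simplicity.

```python
import random

def split_features(X, n_nodes):
    """Partition the columns of X (a list of rows) into n_nodes contiguous blocks."""
    p = len(X[0])
    bounds = [(k * p) // n_nodes for k in range(n_nodes + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in X] for k in range(n_nodes)]

def local_prediction(X_block, b_block):
    """Work done on one node: the product of its column block and its coefficients."""
    return [sum(x * b for x, b in zip(row, b_block)) for row in X_block]

def distributed_prediction(blocks, coef_blocks):
    """Sum the per-node partial predictions and count the numbers each node sends."""
    n = len(blocks[0])
    total = [0.0] * n
    floats_sent = 0
    for X_block, b_block in zip(blocks, coef_blocks):
        partial = local_prediction(X_block, b_block)  # length n, whatever the block width
        floats_sent += len(partial)
        total = [t + v for t, v in zip(total, partial)]
    return total, floats_sent

def run(n, p, n_nodes, seed=0):
    """Toy experiment with assumed sizes: n samples, p features, n_nodes nodes."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(p)]
    blocks = split_features(X, n_nodes)
    coef_blocks = [cb[0] for cb in split_features([b], n_nodes)]  # split b like X
    pred, floats_sent = distributed_prediction(blocks, coef_blocks)
    central = local_prediction(X, b)  # same prediction computed on a single machine
    return pred, central, floats_sent

pred, central, sent_small = run(n=5, p=12, n_nodes=3)
_, _, sent_large = run(n=5, p=1200, n_nodes=3)
print(sent_small, sent_large)
```

In this toy setup the two printed counts are equal even though the second run has 100 times as many features, which is the kind of feature-dimension independence the abstract refers to. With a d-dimensional response, each node would send an n-by-d matrix instead of a length-n vector.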

research
11/21/2017

High-Dimensional Multivariate Posterior Consistency Under Global-Local Shrinkage Priors

We consider sparse Bayesian estimation in the classical multivariate lin...
research
06/10/2017

Stepwise regression for unsupervised learning

I consider unsupervised extensions of the fast stepwise linear regressio...
research
01/11/2023

Multivariate Regression via Enhanced Response Envelope: Envelope Regularization and Double Descent

The envelope model provides substantial efficiency gains over the standa...
research
01/28/2022

Low-rank features based double transformation matrices learning for image classification

Linear regression is a supervised method that has been widely used in cl...
research
06/16/2020

Risk bounds when learning infinitely many response functions by ordinary linear regression

Consider the problem of learning a large number of response functions si...
research
07/15/2019

A Stratification Approach to Partial Dependence for Codependent Variables

Model interpretability is important to machine learning practitioners, a...
research
12/17/2021

Supervised Multivariate Learning with Simultaneous Feature Auto-grouping and Dimension Reduction

Modern high-dimensional methods often adopt the "bet on sparsity" princi...
