Linear Regression with Distributed Learning: A Generalization Error Perspective

01/22/2021
by Martin Hellkvist, et al.

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear regression where the model parameters, i.e., the unknowns, are distributed over the network. We adopt a statistical learning approach. In contrast to works that focus on the performance on the training data, we focus on the generalization error, i.e., the performance on unseen data. We provide high-probability bounds on the generalization error for isotropic Gaussian, correlated Gaussian, and sub-Gaussian data. These results reveal the dependence of the generalization performance on the partitioning of the model over the network. In particular, our results show that the generalization error of the distributed solution can be substantially higher than that of the centralized solution even when the error on the training data is at the same level for both the centralized and distributed approaches. Our numerical results illustrate the performance with both real-world image data and synthetic data.
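
The gap between training error and generalization error described above can be illustrated with a small synthetic experiment. The following is a minimal sketch, not the algorithm or setup analyzed in the paper: it assumes isotropic Gaussian inputs, an overparameterized model (more unknowns than samples), and a simple block-wise update in which each node owns a contiguous block of unknowns and refits it against the current residual. The function names, problem sizes, and noise level are hypothetical choices made only for this example.

```python
import numpy as np


def centralized_fit(A, y):
    """Minimum-norm least-squares solution using all unknowns at once."""
    return np.linalg.pinv(A) @ y


def distributed_fit(A, y, blocks, n_iters=20):
    """Block-wise fit: each node updates only its own block of unknowns.

    Illustrative scheme (an assumption, not taken from the paper): nodes take
    turns fitting the current residual with the pseudoinverse of their own
    column block of A.
    """
    x = np.zeros(A.shape[1])
    block_pinvs = [np.linalg.pinv(A[:, idx]) for idx in blocks]
    for _ in range(n_iters):
        for idx, A_k_pinv in zip(blocks, block_pinvs):
            residual = y - A @ x
            x[idx] += A_k_pinv @ residual
    return x


def run_demo(n=50, p=200, n_nodes=4, noise_std=0.1, seed=0):
    """Compare training and generalization error of the two solutions."""
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(p) / np.sqrt(p)
    A = rng.standard_normal((n, p))  # isotropic Gaussian regressors
    y = A @ x_true + noise_std * rng.standard_normal(n)

    # Each node owns p / n_nodes unknowns; with the defaults that equals n.
    blocks = np.array_split(np.arange(p), n_nodes)

    x_c = centralized_fit(A, y)
    x_d = distributed_fit(A, y, blocks)

    def train_mse(x):
        return float(np.mean((y - A @ x) ** 2))

    def gen_err(x):
        # For a fresh sample with isotropic Gaussian regressors, the expected
        # squared prediction error is ||x - x_true||^2 + noise_std^2.
        return float(np.sum((x - x_true) ** 2) + noise_std ** 2)

    return {
        "train_mse_centralized": train_mse(x_c),
        "train_mse_distributed": train_mse(x_d),
        "gen_err_centralized": gen_err(x_c),
        "gen_err_distributed": gen_err(x_d),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3e}")
```

With these default sizes, each node holds as many unknowns as there are training samples, so a single node can already interpolate the training data on its own. Both solutions therefore reach essentially zero training error, yet they are different parameter vectors, and their errors on unseen data need not match. Changing `n_nodes` changes the partitioning and gives a quick way to see how the block size per node affects the result.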

research
04/30/2020

Generalization Error for Linear Regression under Distributed Learning

Distributed learning facilitates the scaling-up of data processing by di...
research
09/18/2023

Multi-dimensional domain generalization with low-rank structures

In conventional statistical and machine learning methods, it is typicall...
research
04/09/2023

Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning

Meta-learning has arisen as a successful method for improving training p...
research
09/02/2017

Adaptive Scaling

Preprocessing data is an important step before any data analysis. In thi...
research
01/04/2018

Prediction Error Bounds for Linear Regression With the TREX

The TREX is a recently introduced approach to sparse linear regression. ...
research
09/30/2018

Distributed linear regression by averaging

Modern massive datasets pose an enormous computational burden to practit...
research
01/09/2023

Distributed Sparse Linear Regression under Communication Constraints

In multiple domains, statistical tasks are performed in distributed sett...