DeepAI AI Chat
Log In Sign Up

Parallel and Communication Avoiding Least Angle Regression

05/27/2019
by   S. Das, et al.
University of Waterloo
Inria
berkeley college
0

We are interested in parallelizing the Least Angle Regression (LARS) algorithm for fitting linear regression models to high-dimensional data. We consider two parallel and communication avoiding versions of the basic LARS algorithm. The two algorithms apply to data that have different layout patterns (one is appropriate for row-partitioned data, and the other is appropriate for column-partitioned data), and they have different asymptotic costs and practical performance. The first is bLARS, a block version of LARS algorithm where we update b columns at each iteration. Assuming that the data are row-partitioned, bLARS reduces the number of arithmetic operations, latency, and bandwidth by a factor of b. The second is Tournament-bLARS (T-bLARS), a tournament version of LARS, in which case processors compete, by running several LARS computations in parallel, to choose b new columns to be added into the solution. Assuming that the data are column-partitioned, T-bLARS reduces latency by a factor of b. Similarly to LARS, our proposed methods generate a sequence of linear models. We present extensive numerical experiments that illustrate speed-ups up to 25x compared to LARS.

READ FULL TEXT

page 14

page 15

page 16

03/14/2021

A two-way factor model for high-dimensional matrix data

In this article, we introduce a two-way factor model for a high-dimensio...
12/04/2021

On the stability and performance of the solution of sparse linear systems by partitioned procedures

In this paper, we present, evaluate and analyse the performance of paral...
05/14/2018

A 3D Parallel Algorithm for QR Decomposition

Interprocessor communication often dominates the runtime of large matrix...
08/11/2021

Retrieval Interaction Machine for Tabular Data Prediction

Prediction over tabular data is an essential task in many data science a...
07/20/2018

An Improved Speedup Factor for Sporadic Tasks with Constrained Deadlines under Dynamic Priority Scheduling

Schedulability is a fundamental problem in real-time scheduling, but it ...