
Proximal SCOPE for Distributed Sparse Learning: Better Data Partition Implies Faster Convergence Rate

by Shen-Yi Zhao, et al.
Nanjing University

Distributed sparse learning on a cluster of multiple machines has attracted much attention in machine learning, especially for large-scale applications with high-dimensional data. One popular way to implement sparse learning is to use L_1 regularization. In this paper, we propose a novel method, called proximal SCOPE (pSCOPE), for distributed sparse learning with L_1 regularization. pSCOPE is based on a cooperative autonomous local learning (CALL) framework. In the CALL framework of pSCOPE, we find that the data partition affects the convergence of the learning procedure, and we therefore define a metric to measure the goodness of a data partition. Based on this metric, we theoretically prove that pSCOPE converges at a linear rate if the data partition is good enough. We also prove that a better data partition implies a faster convergence rate. Furthermore, pSCOPE is communication efficient. Experimental results on real data sets show that pSCOPE can outperform other state-of-the-art distributed methods for sparse learning.
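
To make concrete how a proximal step with L_1 regularization produces sparse solutions, here is a minimal single-machine sketch of the standard soft-thresholding operator and one generic proximal gradient step. It is not the pSCOPE algorithm or the CALL framework, whose details the abstract does not give. The function names and the parameters `step` and `lam` are assumptions chosen for illustration.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise to a list.

    Entries with magnitude at most tau are set exactly to zero;
    the rest are shrunk toward zero by tau.
    """
    out = []
    for x in v:
        if x > tau:
            out.append(x - tau)
        elif x < -tau:
            out.append(x + tau)
        else:
            out.append(0.0)
    return out


def prox_gradient_step(w, grad, step, lam):
    """One generic proximal gradient step for loss(w) + lam * ||w||_1.

    `grad` is the gradient of the smooth loss at `w` (assumed to be given).
    """
    # Plain gradient step on the smooth part of the objective.
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    # The L_1 term is handled by soft-thresholding with threshold step * lam.
    return soft_threshold(moved, step * lam)
```

Coordinates whose magnitude after the gradient step is at most `step * lam` become exactly zero, which is why L_1-regularized methods of this kind yield sparse parameter vectors.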




SCOPE: Scalable Composite Optimization for Learning on Spark

Many machine learning models, such as logistic regression (LR) and suppo...

Feature-Distributed SVRG for High-Dimensional Linear Classification

Linear classification has been widely used in many high-dimensional appl...

Convergence Analysis of Distributed Stochastic Gradient Descent with Shuffling

When using stochastic gradient descent to solve large-scale machine lear...

A Fast Distributed Proximal-Gradient Method

We present a distributed proximal-gradient method for optimizing the ave...

Communication-efficient Algorithm for Distributed Sparse Learning via Two-way Truncation

We propose a communicationally and computationally efficient algorithm f...

Gradient Boosted Binary Histogram Ensemble for Large-scale Regression

In this paper, we propose a gradient boosting algorithm for large-scale ...

Estimating Networks With Jumps

We study the problem of estimating a temporally varying coefficient and ...