A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning

06/15/2016
by Aryan Mokhtari, et al.

We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We call the algorithm random parallel because it utilizes multiple parallel processors, each operating on a randomly chosen subset of blocks of the feature vector. We call the algorithm stochastic because each processor chooses its training subset uniformly at random. Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first methodology that is parallel in both the selection of blocks and the selection of elements of the training set. In RAPSA, each processor uses its randomly chosen functions to compute the stochastic gradient component associated with its randomly chosen block. The technical contribution of this paper is to show that this minimally coordinated algorithm converges to the optimal classifier when the training objective is convex. Moreover, we present an accelerated version of RAPSA (ARAPSA) that incorporates curvature information about the objective function by premultiplying the descent direction by a Hessian approximation matrix. We further extend the results to asynchronous settings and show that, even if the processors perform their updates without any coordination, the algorithms still converge to the optimal argument. RAPSA and its extensions are then numerically evaluated on a linear estimation problem and on a binary image classification task using the MNIST handwritten digit dataset.
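To make the doubly stochastic update concrete, the following is a minimal sketch of a RAPSA-style iteration on a toy least-squares problem, not the authors' implementation. The quadratic objective, block partition, batch size, step size, and the sequential loop standing in for the parallel processors are all illustrative assumptions.

    import numpy as np

    # Toy problem (assumed for illustration): least squares,
    # f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
    rng = np.random.default_rng(0)
    n, d = 1000, 100                      # training examples, feature dimension
    A = rng.standard_normal((n, d))
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_normal(n)

    B = 10                                # number of coordinate blocks
    blocks = np.array_split(np.arange(d), B)
    P = 4                                 # processors (simulated sequentially here)
    batch_size = 32
    step = 0.01

    x = np.zeros(d)
    for t in range(2000):
        # Each processor draws a distinct random block of coordinates...
        chosen = rng.choice(B, size=P, replace=False)
        for p in chosen:
            # ...and an independent, uniformly random subset of training examples.
            idx = rng.choice(n, size=batch_size, replace=False)
            blk = blocks[p]
            resid = A[idx] @ x - b[idx]
            # Stochastic gradient restricted to the chosen block of x.
            g_blk = A[idx][:, blk].T @ resid / batch_size
            # An ARAPSA-style variant would premultiply g_blk by an
            # approximate inverse-Hessian block at this point.
            x[blk] -= step * g_blk

    print("distance to x_true:", np.linalg.norm(x - x_true))

Each simulated processor touches only its own block of x, which is what lets the method run with minimal coordination across processors.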
