Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss

01/01/2015
by Yuchen Zhang, et al.

We consider distributed convex optimization problems originating from sample average approximation of stochastic optimization, or empirical risk minimization in machine learning. We assume that each machine in the distributed computing system has access to a local empirical loss function, constructed with i.i.d. data sampled from a common distribution. We propose a communication-efficient distributed algorithm to minimize the overall empirical loss, which is the average of the local empirical losses. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression, and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, the required number of communication rounds of the algorithm does not increase with the sample size, and only grows slowly with the number of machines.
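The recipe described in the abstract, an outer inexact damped Newton loop whose search directions come from a distributed preconditioned conjugate gradient (PCG) solve, can be sketched in a few dozen lines. The sketch below is an illustrative single-process simulation, not the paper's implementation: the distributed system is mimicked by a list of local (X, y) data blocks, global gradients and Hessian-vector products are formed by averaging local quantities (each average standing in for one communication round), and the PCG preconditioner is built from the first machine's local Hessian plus a small ridge term. The logistic loss, the function names, the parameter mu, and the stopping tolerances are all assumptions made for the example.

# Illustrative sketch only: a single-process simulation of the pattern
# described above (inexact damped Newton + distributed PCG).  Every averaged
# gradient or Hessian-vector product stands in for one communication round.
import numpy as np


def local_grad(w, X, y, lam):
    # Gradient of L2-regularized logistic loss on one machine's data (y in {-1, +1}).
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    return X.T @ (-(1.0 - p) * y) / len(y) + lam * w


def local_hess_vec(w, v, X, y, lam):
    # Hessian-vector product of the same local loss.
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    return X.T @ ((p * (1.0 - p)) * (X @ v)) / len(y) + lam * v


def inexact_damped_newton(blocks, lam, mu=1e-2, outer=20, cg_iters=50, cg_tol=1e-8):
    d = blocks[0][0].shape[1]
    w = np.zeros(d)
    X1, y1 = blocks[0]  # "master" machine whose local Hessian serves as preconditioner
    for _ in range(outer):
        # One communication round: average the local gradients.
        g = np.mean([local_grad(w, X, y, lam) for X, y in blocks], axis=0)
        hv = lambda v: np.mean(
            [local_hess_vec(w, v, X, y, lam) for X, y in blocks], axis=0
        )
        # Preconditioner: the first machine's local Hessian plus a small ridge mu*I
        # (an assumption for this sketch).
        p1 = 1.0 / (1.0 + np.exp(-y1 * (X1 @ w)))
        H1 = X1.T @ ((p1 * (1.0 - p1))[:, None] * X1) / len(y1) + (lam + mu) * np.eye(d)
        P_inv = np.linalg.inv(H1)
        # Preconditioned CG for H(w) v = g; each hv(p) call costs one more round.
        v = np.zeros(d)
        r = g - hv(v)
        s = P_inv @ r
        p = s.copy()
        rs = r @ s
        for _ in range(cg_iters):
            Hp = hv(p)
            alpha = rs / (p @ Hp)
            v += alpha * p
            r -= alpha * Hp
            if np.linalg.norm(r) < cg_tol:
                break
            s = P_inv @ r
            rs_new = r @ s
            p = s + (rs_new / rs) * p
            rs = rs_new
        # Damped Newton step using the approximate Newton decrement, the usual
        # safeguard for self-concordant objectives.
        delta = np.sqrt(v @ hv(v))
        w -= v / (1.0 + delta)
    return w

A toy invocation could be blocks = [(np.random.randn(100, 10), np.sign(np.random.randn(100))) for _ in range(4)] followed by w = inexact_damped_newton(blocks, lam=1e-3). The damped step size 1/(1+delta) is the standard choice for self-concordant losses; the paper's analysis presumably also prescribes how accurately the PCG subproblem must be solved at each outer iteration, and those tolerances are not reproduced in this sketch.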


Related research

- A Stochastic Newton Algorithm for Distributed Convex Optimization (10/07/2021): We propose and analyze a stochastic Newton algorithm for homogeneous dis...
- Practical Newton-Type Distributed Learning using Gradient Based Approximations (07/22/2019): We study distributed algorithms for expected loss minimization where the...
- First-order Optimization for Superquantile-based Supervised Learning (09/30/2020): Classical supervised learning via empirical risk (or negative log-likeli...
- Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy (10/26/2018): In this paper, we propose a Distributed Accumulated Newton Conjugate gra...
- GIANT: Globally Improved Approximate Newton Method for Distributed Optimization (09/11/2017): For distributed computing environments, we consider the canonical machin...
- Efficient distributed representations beyond negative sampling (03/30/2023): This article describes an efficient method to learn distributed represen...
- Learning with Average Top-k Loss (05/24/2017): In this work, we introduce the average top-k (AT_k) loss as a new ensemb...
