Gradient-Based Empirical Risk Minimization using Local Polynomial Regression

11/04/2020
by Ali Jadbabaie, et al.

In this paper, we consider the problem of empirical risk minimization (ERM) of smooth, strongly convex loss functions using iterative gradient-based methods. A major goal of this literature has been to compare different algorithms, such as gradient descent (GD) or stochastic gradient descent (SGD), by analyzing their rates of convergence to ϵ-approximate solutions. For example, the oracle complexity of GD is O(n log(ϵ^{-1})), where n is the number of training samples. When n is large, this can be expensive in practice, and SGD is preferred due to its oracle complexity of O(ϵ^{-1}). Such standard analyses only utilize the smoothness of the loss function in the parameter being optimized. In contrast, we demonstrate that when the loss function is also smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of both GD and SGD in important regimes. Specifically, at every iteration, our proposed algorithm performs local polynomial regression to learn the gradient of the loss function as a function of the data, and then uses it to estimate the true gradient of the ERM objective. We establish that the oracle complexity of our algorithm scales like Õ((pϵ^{-1})^{d/(2η)}) (neglecting sub-dominant factors), where d and p are the data-space and parameter-space dimensions, respectively, and the gradient of the loss function belongs to an η-Hölder class with respect to the data. Our proof extends the analysis of local polynomial regression in non-parametric statistics to provide interpolation guarantees in multivariate settings, and also exploits tools from the inexact GD literature. Unlike GD and SGD, the complexity of our method depends on d and p. However, when d is small and the loss function exhibits modest smoothness in the data, our algorithm beats GD and SGD in oracle complexity for a very broad range of p and ϵ.
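To make the idea concrete, here is a minimal, illustrative sketch (not the authors' implementation): at each iteration, the gradient oracle is queried on a small subset of the data, a local polynomial (degree-1) regression fits the per-sample gradient as a function of the data, the fitted surrogate imputes gradients at all n samples, and an inexact gradient step is taken. All function names, the Gaussian kernel, the bandwidth, and the subset size m are assumptions made purely for illustration.

```python
import numpy as np

def local_linear_gradient(X_obs, G_obs, X_query, bandwidth=0.5):
    """Local polynomial (degree-1) regression of gradients over the data space.

    X_obs   : (m, d) data points where exact gradients were queried
    G_obs   : (m, p) exact gradients at those points
    X_query : (n, d) points at which to impute gradients
    Returns an (n, p) array of imputed gradients.
    """
    n, _ = X_query.shape
    p = G_obs.shape[1]
    G_hat = np.zeros((n, p))
    for i, x0 in enumerate(X_query):
        diff = X_obs - x0                                    # (m, d) local coordinates
        w = np.exp(-np.sum(diff**2, axis=1) / (2.0 * bandwidth**2))  # Gaussian kernel weights
        A = np.hstack([np.ones((len(X_obs), 1)), diff])      # degree-1 local design matrix
        WA = A * w[:, None]
        # Weighted least squares: solve (A^T W A) beta = A^T W G_obs
        beta, *_ = np.linalg.lstsq(A.T @ WA, WA.T @ G_obs, rcond=None)
        G_hat[i] = beta[0]                                    # intercept = fitted value at x0
    return G_hat

def lpr_inexact_gd(X, grad_fn, w0, step=0.1, iters=50, m=64, seed=0):
    """Inexact gradient descent on the ERM objective using locally regressed gradients."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(iters):
        idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
        G_exact = np.stack([grad_fn(x, w) for x in X[idx]])   # oracle calls on a subset only
        G_all = local_linear_gradient(X[idx], G_exact, X)     # impute gradients at all n samples
        w = w - step * G_all.mean(axis=0)                     # approximate full ERM gradient step
    return w

if __name__ == "__main__":
    # Toy example: ridge-regularized least squares, whose per-sample gradient is smooth in the data.
    rng = np.random.default_rng(1)
    n, d = 500, 2
    X = rng.normal(size=(n, d))
    y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=n)
    Z = np.hstack([X, y[:, None]])                            # treat (features, label) jointly as the "data"
    grad_fn = lambda z, w: (z[:d] @ w - z[d]) * z[:d] + 1e-3 * w
    print(lpr_inexact_gd(Z, grad_fn, w0=np.zeros(d)))
```

The point the sketch illustrates is that the oracle (grad_fn) is only called m times per iteration, while the update aggregates imputed gradients over all n samples; the paper's analysis quantifies how the interpolation error of the local polynomial fit propagates through inexact GD.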

