A Stochastic First-Order Method for Ordered Empirical Risk Minimization

07/09/2019
by Kenji Kawaguchi, et al.

We propose a new stochastic first-order method for empirical risk minimization problems such as those that arise in machine learning. Traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), use an unbiased estimator of the gradient of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward observations with higher current losses, and that is itself an unbiased gradient estimator of an ordered modification of the empirical average loss. On the theory side, we show that the proposed algorithm is guaranteed to converge at a sublinear rate to a global optimum for convex losses and to a critical point for non-convex losses. Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, we present extensive numerical experiments in which our proposed method consistently improves test error over standard mini-batch SGD across a range of models, including SVMs, logistic regression, and (non-convex) deep learning problems.
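The core idea above, a gradient estimator deliberately weighted toward the currently hardest examples, can be illustrated with a short sketch. The PyTorch-style snippet below is a minimal, hypothetical instantiation: it assumes the bias toward high-loss observations is realized by keeping only the q largest per-sample losses in each mini-batch, and every name and hyperparameter in it (ordered_sgd_step, q, the toy model) is illustrative rather than taken from the paper.

import torch

def ordered_sgd_step(model, per_sample_loss, optimizer, inputs, targets, q):
    # One update step that averages the loss over only the q highest-loss
    # samples in the mini-batch. The top-q rule is an assumption inferred
    # from the abstract, not the authors' published implementation.
    optimizer.zero_grad()
    losses = per_sample_loss(model(inputs), targets)  # shape: (batch_size,)
    top_losses, _ = torch.topk(losses, k=min(q, losses.numel()))
    # Averaging over this subset gives an estimator purposely biased
    # toward observations with higher current losses.
    top_losses.mean().backward()
    optimizer.step()

# Toy usage (all shapes and values are placeholders):
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
per_sample_loss = torch.nn.CrossEntropyLoss(reduction="none")
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
ordered_sgd_step(model, per_sample_loss, optimizer, x, y, q=8)

Note that setting q equal to the batch size recovers standard mini-batch SGD, consistent with the abstract's framing of the method as a modification of the usual unbiased estimator.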

Related research

09/06/2019 · Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization
This paper studies a decentralized stochastic gradient tracking (DSGT) a...

11/28/2018 · First-order Newton-type Estimator for Distributed Estimation and Inference
This paper studies distributed estimation and inference for a general st...

09/06/2019 · Decentralized Stochastic Gradient Tracking for Empirical Risk Minimization
Recent works have shown superiorities of decentralized SGD to centralize...

01/27/2023 · Meta-Learning Mini-Batch Risk Functionals
Supervised learning typically optimizes the expected value risk function...

06/07/2015 · Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses
In this work we develop a new algorithm for regularized empirical risk m...

08/30/2022 · Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization
The Frank-Wolfe method has become increasingly useful in statistical and...

06/02/2023 · Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
To improve the efficiency and sustainability of learning deep models, we...
