Momentum Accelerates the Convergence of Stochastic AUPRC Maximization

by Guanghui Wang, et al.

In this paper, we study stochastic optimization of the area under the precision-recall curve (AUPRC), a metric widely used for imbalanced classification tasks. Although a few methods have been proposed for maximizing AUPRC, stochastic optimization of AUPRC with a convergence guarantee remains largely unexplored. A recent work [42] proposed a promising approach to AUPRC maximization based on optimizing a surrogate loss for the average precision, and proved an O(1/ϵ^5) complexity for finding an ϵ-stationary solution of the non-convex objective. We further improve the stochastic optimization of AUPRC by (i) developing novel stochastic momentum methods with a better iteration complexity of O(1/ϵ^4) for finding an ϵ-stationary solution; and (ii) designing a novel family of stochastic adaptive methods with the same iteration complexity of O(1/ϵ^4), which enjoy faster convergence in practice. To this end, we propose two techniques that are critical for improving the convergence: (i) the biased estimators for tracking individual ranking scores are updated in a randomized coordinate-wise manner; and (ii) a momentum update is applied on top of the stochastic gradient estimator for tracking the gradient of the objective. Extensive experiments on various data sets demonstrate the effectiveness of the proposed algorithms. Of independent interest, the proposed stochastic momentum and adaptive algorithms are also applicable to a class of two-level stochastic dependent compositional optimization problems.
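The two techniques named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the linear scorer, the squared-hinge pairwise surrogate, the ratio form of the per-example objective, the moving-average form of the momentum, and every name and default (`momentum_step`, `u`, `gamma`, `beta`, `lr`, `eps`) are assumptions made for illustration. It only shows the shape of one iteration: tracked estimators `u[i]` are refreshed for the sampled positive examples only, and a momentum average is applied to the resulting stochastic gradient estimate.

```python
import numpy as np

def squared_hinge_pair(w, X_s, x_i, margin=1.0):
    """Assumed surrogate for the indicator h(x_s) >= h(x_i), with linear h(x) = w @ x."""
    diff = X_s - x_i                             # (B, d)
    hinge = np.maximum(0.0, margin + diff @ w)   # margin - (h(x_i) - h(x_s))
    loss = hinge ** 2                            # (B,)
    grad = 2.0 * hinge[:, None] * diff           # (B, d), gradient of loss w.r.t. w
    return loss, grad

def momentum_step(w, m, u, X, y, rng, lr=0.05, beta=0.1, gamma=0.5,
                  batch=16, eps=1e-8):
    """One hypothetical iteration; u has shape (n, 2) and is updated in place."""
    pos = np.flatnonzero(y == 1)
    i_batch = rng.choice(pos, size=min(batch, pos.size), replace=False)
    s_batch = rng.choice(len(y), size=min(batch, len(y)), replace=False)
    X_s, y_s = X[s_batch], y[s_batch]

    grad = np.zeros_like(w)
    for i in i_batch:
        loss, dloss = squared_hinge_pair(w, X_s, X[i])
        # Mini-batch estimate of the two inner quantities for example i.
        g_hat = np.array([np.mean(y_s * loss), np.mean(loss)])
        # Randomized coordinate-wise update: only sampled rows of u change.
        u[i] = (1.0 - gamma) * u[i] + gamma * g_hat
        u1, u2 = u[i, 0], max(u[i, 1], eps)      # eps guard is an assumption
        # Chain rule for the assumed outer function f(u) = -u1 / u2.
        coef = (u1 - y_s * u2) / u2 ** 2         # (B,)
        grad += (coef[:, None] * dloss).mean(axis=0)
    grad /= len(i_batch)

    # Momentum (moving average) on top of the stochastic gradient estimator.
    m = (1.0 - beta) * m + beta * grad
    w = w - lr * m
    return w, m, u, i_batch
```

In this sketch `u` is biased because it mixes estimates computed at earlier iterates; updating only the sampled rows keeps the per-iteration cost proportional to the batch size rather than to the number of positive examples.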



