Learning Rates for Nonconvex Pairwise Learning

11/09/2021
by Shaojie Li, et al.

Pairwise learning is receiving increasing attention because it covers many important machine learning tasks, e.g., metric learning, AUC maximization, and ranking. Investigating the generalization behavior of pairwise learning is therefore of significance. However, existing generalization analysis mainly focuses on convex objective functions, leaving nonconvex learning far less explored. Moreover, the learning rates currently derived for the generalization performance of pairwise learning are mostly of slow order. Motivated by these problems, we study the generalization performance of nonconvex pairwise learning and provide improved learning rates. Specifically, we develop different uniform convergence results for the gradients in pairwise learning under different assumptions, based on which we analyze empirical risk minimization, gradient descent, and stochastic gradient descent for pairwise learning. We first establish learning rates for these algorithms in a general nonconvex setting, where the analysis sheds light on the trade-off between optimization and generalization and on the role of early stopping. We then investigate the generalization performance of nonconvex learning under a gradient dominance curvature condition. In this setting, we derive faster learning rates of order 𝒪(1/n), where n is the sample size. Provided that the optimal population risk is small, we further improve the learning rates to 𝒪(1/n^2), which, to the best of our knowledge, are the first 𝒪(1/n^2)-type rates for pairwise learning, whether convex or nonconvex. Overall, we provide a systematic analysis of the generalization performance of nonconvex pairwise learning.
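The abstract studies stochastic gradient descent for pairwise learning, where the empirical risk averages a loss over all pairs of examples rather than over single examples. As a minimal, hypothetical sketch (not the paper's algorithm, and with all names such as `pairwise_sgd` and `grad_loss` chosen for illustration), pairwise SGD can sample one pair per step and descend along the gradient of that pair's loss:

```python
import random

def pairwise_sgd(data, grad_loss, w0, steps=1000, lr=0.05, seed=0):
    """Minimise the empirical pairwise risk
        (2 / (n * (n - 1))) * sum_{i < j} loss(w; z_i, z_j)
    by sampling one pair (z_i, z_j) per step and taking a gradient step.
    grad_loss(w, z, z') must return the gradient of the pairwise loss in w."""
    rng = random.Random(seed)
    n = len(data)
    w = w0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # draw a pair of distinct indices
        w = w - lr * grad_loss(w, data[i], data[j])
    return w

# Toy pairwise loss: loss(w; z, z') = 0.5 * (w - (z - z'))^2,
# whose gradient in w is w - (z - z'). Since pairs are drawn
# symmetrically, the population minimiser is w = E[z - z'] = 0.
data = [1.0, 2.0, 3.0, 4.0]
grad = lambda w, z, zp: w - (z - zp)
w_hat = pairwise_sgd(data, grad, w0=0.0, steps=5000)
```

With a constant step size the iterate hovers in a neighborhood of the minimizer; the early-stopping trade-off mentioned in the abstract concerns exactly how long such an iteration should run before generalization degrades.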


Related research

07/19/2021 · Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints
Generalization performance of stochastic optimization stands a central p...

02/03/2019 · Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
Stochastic gradient descent (SGD) is a popular and efficient method with...

11/23/2021 · Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning
Pairwise learning refers to learning tasks where the loss function depen...

08/08/2022 · Pairwise Learning via Stagewise Training in Proximal Setting
The pairwise objective paradigms are an important and essential aspect o...

04/25/2019 · Stability and Optimization Error of Stochastic Gradient Descent for Pairwise Learning
In this paper we study the stability and its trade-off with optimization...

02/25/2015 · Online Pairwise Learning Algorithms with Kernels
Pairwise learning usually refers to a learning task which involves a los...

02/20/2023 · Stability-based Generalization Analysis for Mixtures of Pointwise and Pairwise Learning
Recently, some mixture algorithms of pointwise and pairwise learning (PP...
