Online and Stochastic Gradient Methods for Non-decomposable Loss Functions

10/24/2014
by Purushottam Kar et al.

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k (prec@k) and the F-measure. Compared to point loss functions such as the hinge loss, these offer much finer-grained control over prediction, but at the same time present novel challenges in algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions, aiming both to enable incremental learning and to design scalable solvers for batch problems. To this end, we propose an online learning framework for such loss functions. Our model enjoys several nice properties, chief amongst them the existence of efficient online learning algorithms with sublinear regret and online-to-batch conversion bounds. Our model is a provable extension of existing online learning models for point loss functions. We instantiate two popular losses, prec@k and partial AUC (pAUC), in our model and prove sublinear regret bounds for both. Our proofs require a novel structural lemma over ranked lists which may be of independent interest. We then develop scalable stochastic gradient descent solvers for non-decomposable loss functions. We show that for a large family of loss functions satisfying a certain uniform convergence property (one that includes prec@k, pAUC, and the F-measure), our methods provably converge to the empirical risk minimizer. Such uniform convergence results were not known for these losses, and we establish them using novel proof techniques. Extensive experiments on real-world and benchmark datasets show that our method can be orders of magnitude faster than a recently proposed cutting plane method.
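To make the notion of a non-decomposable loss concrete, here is a minimal sketch (not the paper's algorithm, just the standard definitions) of how the two losses the abstract instantiates, prec@k and partial AUC, are evaluated. Unlike a point loss, neither can be written as a sum of per-example terms: both depend on the ranking induced by all scores jointly.

```python
import numpy as np

def prec_at_k(scores, labels, k):
    """Precision@k: fraction of positives among the k top-scored examples.

    Non-decomposable: changing one score can reshuffle the top-k set and
    so change the loss contribution of every other example.
    """
    top_k = np.argsort(scores)[::-1][:k]
    return labels[top_k].mean()

def partial_auc(scores, labels, beta):
    """Partial AUC over false-positive rates in [0, beta]: fraction of
    correctly ordered (positive, negative) pairs, restricted to the
    ceil(beta * #negatives) highest-scoring negatives.
    """
    pos = scores[labels == 1]
    neg = np.sort(scores[labels == 0])[::-1]
    j = int(np.ceil(beta * len(neg)))
    top_neg = neg[:j]
    # count pairs where the positive is scored above the hard negative
    correct = (pos[:, None] > top_neg[None, :]).sum()
    return correct / (len(pos) * j)

scores = np.array([0.9, 0.8, 0.7, 0.4, 0.2])
labels = np.array([1, 0, 1, 1, 0])
print(prec_at_k(scores, labels, k=2))         # 0.5: one of the two top-scored examples is positive
print(partial_auc(scores, labels, beta=0.5))  # 1/3: only one positive outranks the hardest negative
```

Because the loss is a function of the whole ranked list, a single new example arriving online can alter the loss retroactively, which is exactly the difficulty the paper's framework and its structural lemma over ranked lists address.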


