Supervised Learning: No Loss No Cry

02/10/2020
by Richard Nock et al.

Supervised learning requires the specification of a loss function to minimise. While the theory of admissible losses is well developed from both computational and statistical perspectives, it offers a panoply of different choices, and in practice the choice is typically made in an ad hoc manner. In the hope of making this procedure more principled, the problem of learning the loss function for a downstream task (e.g., classification) has garnered recent interest; however, work in this area has been largely empirical. In this paper, we revisit the SLIsotron algorithm of Kakade et al. (2011) through a novel lens and show how it yields a principled procedure for learning the loss. Specifically, we cast SLIsotron as learning a loss from a family of composite square losses, and, by interpreting it through the lens of proper losses, derive a generalisation based on Bregman divergences. The resulting BregmanTron algorithm jointly learns the loss along with the classifier. It comes equipped with a simple guarantee of convergence for the loss it learns, and its set of possible outputs comes with a guarantee of agnostic approximability of the Bayes rule. Experiments indicate that the BregmanTron substantially outperforms the SLIsotron, and that the loss it learns can be minimised by other algorithms for different tasks, thereby opening the interesting problem of loss transfer between domains.
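For intuition, the Bregman divergence generated by a convex function phi is D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>; taking phi(u) = u^2 recovers the square loss, since D_phi(x, y) = (x - y)^2. The sketch below illustrates the kind of alternating scheme SLIsotron uses and on which the BregmanTron builds: fit a monotone link by isotonic regression, then take a perceptron-like weight step on the residuals. This is a hedged illustration only: scikit-learn's ordinary isotonic regression stands in for the Lipschitz isotonic regression analysed by Kakade et al. (2011), and the function name and iteration count are ours, not the paper's.

```python
# Sketch of an SLIsotron-style alternating update (after Kakade et al., 2011).
# Assumption: ordinary isotonic regression (scikit-learn's PAV solver) is used
# in place of the Lipschitz isotonic regression the paper analyses, so this is
# an illustration of the scheme, not the analysed algorithm.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def slisotron_sketch(X, y, n_iters=50):
    """Alternately fit a monotone link u and update weights w so that
    y ~ u(w . x). X: (n, d) feature array; y: (n,) labels in [0, 1]."""
    n, d = X.shape
    w = np.zeros(d)
    link = None
    for _ in range(n_iters):
        z = X @ w
        # Step 1: fit the link by isotonic regression of y on the scores z.
        link = IsotonicRegression(y_min=0.0, y_max=1.0,
                                  out_of_bounds="clip").fit(z, y)
        # Step 2: perceptron-like additive update on the residuals.
        w = w + (y - link.predict(z)) @ X / n
    return w, link
```

In this square-loss loop the geometry of the residual step is fixed; the BregmanTron, as described in the abstract, instead learns the loss itself (via a Bregman divergence whose generator is updated across rounds) jointly with the classifier.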


Related research

LegendreTron: Uprising Proper Multiclass Loss Learning (01/27/2023)
Loss functions serve as the foundation of supervised learning and are of...

All your loss are belong to Bayes (06/08/2020)
Loss functions are a cornerstone of machine learning and the starting po...

Learning with Fenchel-Young Losses (01/08/2019)
Over the past decades, numerous loss functions have been proposed f...

The Geometry and Calculus of Losses (09/01/2022)
Statistical decision problems are the foundation of statistical machine ...

Loss Minimization through the Lens of Outcome Indistinguishability (10/16/2022)
We present a new perspective on loss minimization and the recent notion ...

The Pessimistic Limits of Margin-based Losses in Semi-supervised Learning (12/28/2016)
We show that for linear classifiers defined by convex margin-based surro...

Learning Energy Networks with Generalized Fenchel-Young Losses (05/19/2022)
Energy-based models, a.k.a. energy networks, perform inference by optimi...
