Minimizing Close-k Aggregate Loss Improves Classification

11/01/2018
by   Bryan He, et al.

In classification, the de facto method for aggregating individual losses is the average loss. When the actual metric of interest is the 0-1 loss, it is common to minimize the average of some well-behaved (e.g., convex) surrogate loss. Recently, several other aggregate losses, such as the maximal loss and the average top-k loss, were proposed as alternative objectives to address shortcomings of the average loss. However, we identify common classification settings, e.g., when the data is imbalanced or has many easy or ambiguous examples, in which the average, maximal, and average top-k losses all suffer from suboptimal decision boundaries, even on an infinitely large training set. To address this problem, we propose a new classification objective called the close-k aggregate loss, where we adaptively minimize the loss for points close to the decision boundary. We provide theoretical guarantees for the 0-1 accuracy when we optimize the close-k aggregate loss. We also conduct systematic experiments across the PMLB and OpenML benchmark datasets. Close-k achieves significant gains in 0-1 test accuracy, with improvements of ≥ 2% on substantially more of the datasets than average, maximal, and average top-k; in contrast, the previous aggregate losses outperformed close-k on fewer than 2% of the datasets.

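The abstract describes close-k only informally as adaptively minimizing the loss for points close to the decision boundary. As a rough illustration of that idea (not the paper's exact formulation), the sketch below contrasts three ways of aggregating per-example hinge losses for a linear classifier: the plain average, the average top-k, and a close-k-style average over the k points nearest the boundary. The function names, the toy data, and the use of hinge loss are assumptions made for this example.

```python
import numpy as np

def hinge_losses(w, b, X, y):
    """Per-example hinge surrogate losses for a linear classifier; labels in {-1, +1}."""
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins)

def aggregate_loss(w, b, X, y, method="close-k", k=10):
    """Aggregate per-example losses in different ways.

    'average' : mean over all examples.
    'top-k'   : mean over the k largest individual losses (average top-k).
    'close-k' : mean over the k examples closest to the decision boundary,
                measured by |w.x + b| / ||w|| -- an illustrative reading of
                the close-k idea, not necessarily the paper's exact form.
    """
    losses = hinge_losses(w, b, X, y)
    if method == "average":
        return losses.mean()
    if method == "top-k":
        return np.sort(losses)[-k:].mean()
    if method == "close-k":
        dist = np.abs(X @ w + b) / np.linalg.norm(w)
        closest = np.argsort(dist)[:k]  # k points nearest the boundary
        return losses[closest].mean()
    raise ValueError(f"unknown method: {method}")

# Toy usage: imbalanced 2-D Gaussian blobs with an easy majority class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3, 0], 1, (200, 2)), rng.normal([1, 0], 1, (20, 2))])
y = np.array([-1] * 200 + [1] * 20)
w, b = np.array([1.0, 0.0]), 0.0
for m in ("average", "top-k", "close-k"):
    print(m, aggregate_loss(w, b, X, y, method=m, k=20))
```

On data like this, the average loss is dominated by the large, easy class, while the close-k-style aggregation concentrates on the examples that actually determine where the boundary falls.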

