sigmoidF1: A Smooth F1 Score Surrogate Loss for Multilabel Classification

08/24/2021
by   Gabriel Bénédict, et al.

Multiclass multilabel classification refers to the task of attributing multiple labels to examples via predictions. Current models formulate a reduction of that multilabel setting into either multiple binary classifications or multiclass classification, allowing for the use of existing loss functions (sigmoid, cross-entropy, logistic, etc.). Empirically, these methods have been reported to achieve good performance on different metrics (F1 score, Recall, Precision, etc.). Theoretically though, these reductions of the multilabel setting do not accommodate the prediction of varying numbers of labels per example, and the underlying losses are distant estimates of the performance metrics. We propose a loss function, sigmoidF1, an approximation of the F1 score that (I) is smooth and tractable for stochastic gradient descent, (II) naturally approximates a multilabel metric, and (III) estimates label propensities and label counts. More generally, we show that any confusion matrix metric can be formulated with a smooth surrogate. We evaluate the proposed loss function on different text and image datasets, and with a variety of metrics, to account for the complexity of multilabel classification evaluation. In our experiments, we embed the sigmoidF1 loss in a classification head that is attached to the state-of-the-art efficient pretrained neural networks MobileNetV2 and DistilBERT. Our experiments show that sigmoidF1 outperforms other loss functions on four datasets and several metrics. These results demonstrate the effectiveness of using inference-time metrics as loss functions at training time in general, and their potential on non-trivial classification problems such as multilabel classification.
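To make the idea concrete, below is a minimal PyTorch sketch of a smooth F1 surrogate in the spirit described by the abstract: the hard confusion-matrix counts are relaxed with a scaled, shifted sigmoid over the logits, and the loss is one minus the resulting soft F1. The hyperparameters beta and eta and the batch-level reduction are illustrative assumptions, not necessarily the authors' exact choices.

```python
import torch

def sigmoid_f1_loss(logits, targets, beta=1.0, eta=0.0, eps=1e-8):
    """Smooth F1 surrogate loss (illustrative sketch).

    logits:  raw model outputs, shape (batch, num_labels)
    targets: binary ground-truth labels, same shape
    beta, eta: slope and offset of the sigmoid relaxation (assumed tunable)
    """
    # Relaxed predictions in (0, 1); beta sharpens and eta shifts the sigmoid.
    s = torch.sigmoid(beta * (logits + eta))

    # Soft confusion-matrix entries, summed over the batch and labels.
    tp = (s * targets).sum()
    fp = (s * (1.0 - targets)).sum()
    fn = ((1.0 - s) * targets).sum()

    # Smooth F1 from the soft counts; minimizing 1 - F1 maximizes it.
    soft_f1 = 2.0 * tp / (2.0 * tp + fn + fp + eps)
    return 1.0 - soft_f1

# Usage example with random data: the loss is differentiable end to end.
logits = torch.randn(8, 5, requires_grad=True)
targets = torch.randint(0, 2, (8, 5)).float()
loss = sigmoid_f1_loss(logits, targets, beta=2.0)
loss.backward()
```

The same relaxation of tp, fp, and fn can be plugged into other confusion-matrix metrics (precision, recall, Fβ), which is the more general point the abstract makes about smooth surrogates.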


