On the benefits of output sparsity for multi-label classification

03/14/2017
by Evgenii Chzhen, et al.

The multi-label classification framework, where each observation can be associated with a set of labels, has attracted considerable attention in recent years. Modern multi-label problems are typically large-scale in the number of observations, features, and labels, and the number of labels can even be comparable to the number of observations. In this context, different remedies have been proposed to overcome the curse of dimensionality. In this work, we aim to exploit output sparsity by introducing a new loss, called the sparse weighted Hamming loss. The proposed loss can be seen as a weighted version of classical losses, in which active and inactive labels are weighted separately. Leveraging the influence of sparsity in the loss function, we provide improved generalization bounds for the empirical risk minimizer, a desirable property for large-scale problems. For this new loss, we derive rates of convergence that are linear in the underlying output sparsity rather than linear in the number of labels. In practice, the associated risk can be minimized efficiently using convex surrogates and modern convex optimization algorithms. We provide experiments on various real-world datasets demonstrating the relevance of our approach compared to non-weighted techniques.
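To make the idea of weighting active and inactive labels separately concrete, here is a minimal Python sketch of a weighted Hamming-type loss. The abstract does not state the exact definition of the sparse weighted Hamming loss, so the function name `weighted_hamming_loss` and the parameters `w_active` and `w_inactive` are assumptions chosen for illustration, not the paper's formulation.

```python
from typing import Sequence


def weighted_hamming_loss(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
    w_active: float,
    w_inactive: float,
) -> float:
    """Average per-sample weighted Hamming-type loss (illustrative only).

    An error on an active label (true value 1) costs `w_active`;
    an error on an inactive label (true value 0) costs `w_inactive`.
    Both weights are hypothetical parameters, not taken from the paper.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each sample must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            if t != p:
                # Missed active labels and false positives are charged separately.
                total += w_active if t == 1 else w_inactive
    return total / len(y_true)


# Two samples, four labels each; every sample has a single active label.
y_true = [[1, 0, 0, 0], [0, 0, 1, 0]]
y_pred = [[0, 0, 0, 0], [0, 1, 1, 0]]  # one missed active label, one false positive

# Equal weights of 1/L recover the classical (normalized) Hamming loss.
classical = weighted_hamming_loss(y_true, y_pred, w_active=0.25, w_inactive=0.25)

# Charging more for missed active labels emphasizes the sparse positive part.
weighted = weighted_hamming_loss(y_true, y_pred, w_active=1.0, w_inactive=0.1)
```

With equal weights of one over the number of labels, the sketch reduces to the classical Hamming loss, which is the sense in which a weighted loss of this kind generalizes the non-weighted one. How the weights should depend on the output sparsity is specified in the paper itself and is not reproduced here.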


