Multi-label classification: do Hamming loss and subset accuracy really conflict with each other?

11/16/2020
by Guoqiang Wu, et al.

Various evaluation measures have been developed for multi-label classification, including Hamming Loss (HL), Subset Accuracy (SA), and Ranking Loss (RL). However, there is a gap between empirical results and the existing theory: 1) an algorithm often performs well empirically on some measure(s) while poorly on others, yet a formal theoretical analysis is lacking; and 2) when the label space is small, algorithms optimizing HL often achieve comparable or even better performance on the SA measure than those optimizing SA directly, whereas existing theoretical results suggest that SA and HL are conflicting measures. This paper attempts to fill this gap by analyzing the learning guarantees of the corresponding learning algorithms with respect to both the SA and HL measures. We show that when a learning algorithm optimizes HL via its surrogate loss, it enjoys an error bound for the HL measure that is independent of c (the number of labels), while the bound for the SA measure depends on at most O(c). On the other hand, when directly optimizing SA via its surrogate loss, the learning guarantees depend on O(√c) for both the HL and SA measures. This explains the observation that when the label space is not large, optimizing HL with its surrogate loss can yield promising performance on SA. We further show that our techniques apply to the analysis of learning guarantees on other measures, such as RL. Finally, the theoretical analyses are supported by experimental results.
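To make the two measures the abstract contrasts concrete, here is a minimal NumPy sketch of Hamming Loss and Subset Accuracy on binary label matrices (function names and the toy data are illustrative, not from the paper):

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    # Fraction of individual label positions predicted incorrectly,
    # averaged over all examples and all c labels.
    return float(np.mean(Y_true != Y_pred))

def subset_accuracy(Y_true, Y_pred):
    # Fraction of examples whose entire label vector is predicted exactly.
    return float(np.mean(np.all(Y_true == Y_pred, axis=1)))

# Two examples, c = 4 labels.
Y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 1, 1]])
Y_pred = np.array([[1, 0, 1, 0],   # exact match
                   [0, 1, 0, 1]])  # one label wrong

print(hamming_loss(Y_true, Y_pred))     # 1 wrong bit out of 8 -> 0.125
print(subset_accuracy(Y_true, Y_pred))  # 1 of 2 exact matches -> 0.5
```

Note how a single wrong label costs only 1/(n·c) under HL but zeroes out the whole example under SA; this asymmetry is why the two measures can appear to conflict.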


Related research:

- 05/10/2021: Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization. (Partial) ranking loss is a commonly used evaluation measure for multi-l...
- 09/16/2020: Convex Calibrated Surrogates for the Multi-Label F-Measure. The F-measure is a widely used performance measure for multi-label class...
- 11/02/2020: A Flexible Class of Dependence-aware Multi-Label Loss Functions. Multi-label classification is the task of assigning a subset of labels t...
- 06/07/2016: How is a data-driven approach better than random choice in label space division for multi-label classification? We propose using five data-driven community detection approaches from so...
- 11/15/2019: Multi-Label Learning with Deep Forest. In multi-label learning, each instance is associated with multiple label...
- 05/09/2023: Towards Understanding Generalization of Macro-AUC in Multi-label Learning. Macro-AUC is the arithmetic mean of the class-wise AUCs in multi-label l...
- 10/27/2018: Handling Imbalanced Dataset in Multi-label Text Categorization using Bagging and Adaptive Boosting. Imbalanced dataset is occurred due to uneven distribution of data availa...
