An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels

06/18/2016
by Amirhossein Akbarnejad, et al.

Multi-label classification has received considerable interest in recent years. Multi-label classifiers have to address many problems, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods have been proposed, which seek to represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use a linear dimensionality reduction for this purpose. However, by doing so, these methods neglect the tail labels, that is, labels that are infrequently assigned to instances. We propose an embedding-based method that non-linearly embeds the label vectors using a stochastic approach, thereby predicting the tail labels more accurately. Moreover, the proposed method has excellent mechanisms for handling missing labels, dealing with large-scale datasets, and exploiting unlabeled data. To the best of our knowledge, our proposed method is the first multi-label classifier that simultaneously addresses all of the mentioned challenges. Experiments on real-world datasets show that our method outperforms state-of-the-art multi-label classifiers by a large margin in terms of both prediction performance and training time.
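The claim that linear label embeddings neglect tail labels can be illustrated with a small sketch. This is not the method proposed in the paper; it assumes a truncated SVD of the label matrix as a stand-in for a generic linear dimensionality reduction, and the function names (`low_rank_label_embedding`, `per_label_recall`) and the toy data are hypothetical.

```python
import numpy as np

def low_rank_label_embedding(Y, k):
    """Embed an (n_instances x n_labels) label matrix linearly into k dimensions.

    Assumption: truncated SVD stands in for a generic linear embedding.
    Returns the embeddings Z and the linear reconstruction Y_hat.
    """
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V_k = Vt[:k].T          # (n_labels, k) linear encoder/decoder
    Z = Y @ V_k             # low-dimensional label embeddings
    Y_hat = Z @ V_k.T       # decode back to label space
    return Z, Y_hat

def per_label_recall(Y, Y_hat, threshold=0.5):
    """Fraction of true assignments of each label recovered after thresholding."""
    pred = Y_hat >= threshold
    positives = Y.sum(axis=0)
    hits = (pred & (Y == 1)).sum(axis=0)
    return hits / np.maximum(positives, 1)

# Toy data (hypothetical): 80 instances, two frequent labels, one tail label.
Y = np.zeros((80, 3), dtype=int)
Y[0:50, 0] = 1     # head label 0: 50 instances
Y[30:70, 1] = 1    # head label 1: 40 instances, overlapping label 0
Y[70:72, 2] = 1    # tail label 2: only 2 instances

Z, Y_hat = low_rank_label_embedding(Y, k=2)
recall = per_label_recall(Y, Y_hat)
print(recall)
```

In this toy setting the two retained directions are spent entirely on the frequent labels, so both head labels are recovered while every assignment of the tail label is lost in the reconstruction.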


research
12/24/2018

Group Preserving Label Embedding for Multi-Label Classification

Multi-label learning is concerned with the classification of data with m...
research
02/20/2019

Noisy multi-label semi-supervised dimensionality reduction

Noisy labeled data represent a rich source of information that often are...
research
03/19/2022

Font Generation with Missing Impression Labels

Our goal is to generate fonts with specific impressions, by training a g...
research
03/14/2017

On the benefits of output sparsity for multi-label classification

The multi-label classification framework, where each observation can be ...
research
06/19/2017

Multi-Label Annotation Aggregation in Crowdsourcing

As a means of human-based computation, crowdsourcing has been widely use...
research
06/24/2020

Multilabel Classification by Hierarchical Partitioning and Data-dependent Grouping

In modern multilabel classification problems, each data instance belongs...
research
09/08/2016

DiSMEC - Distributed Sparse Machines for Extreme Multi-label Classification

Extreme multi-label classification refers to supervised multi-label lear...
