Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification

02/01/2021
by Shida Lei, et al.

To cope with high annotation costs, training a classifier only from weakly supervised data has recently attracted a great deal of attention. Among the various approaches, strengthening the supervision of completely unsupervised classification is a promising direction: it typically uses class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded and highly flexible, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from m U sets for m≥2. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which aims to predict from which U set each observed data point is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation. We build our method in a flexible and efficient end-to-end deep learning framework and prove it to be classifier-consistent. Through experiments, we demonstrate the superiority of our proposed method over state-of-the-art methods.
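The abstract does not spell out the linear-fractional transformation, so the following is only a sketch of one form it could take, derived here from Bayes' rule under stated assumptions rather than taken from the paper. Assume the j-th U set is drawn with probability ρ_j and has class prior π_j (its fraction of positives), and that the test distribution has class prior π_D. Writing η(x) = P(y=+1 | x), the probability that x came from set j is then proportional to ρ_j[(π_j − π_D)η(x) + π_D(1 − π_j)], which is a ratio of two linear functions of η(x) once normalized over the sets. The function name `set_posteriors` and the symbols ρ_j, π_j, π_D are illustrative choices, not names from the source.

```python
import numpy as np

def set_posteriors(eta, set_priors, class_priors, test_prior):
    """Map P(y=+1 | x) to P(set=j | x) for each of the m U sets.

    Hypothetical helper: the formula is derived from Bayes' rule under
    the assumptions in the lead-in, not copied from the paper.
    """
    eta = np.asarray(eta, dtype=float)[..., None]   # shape (..., 1)
    rho = np.asarray(set_priors, dtype=float)       # shape (m,), sums to 1
    pi = np.asarray(class_priors, dtype=float)      # shape (m,)

    # Numerator of the linear-fractional map for each set j.
    a = rho * (pi - test_prior)
    b = rho * test_prior * (1.0 - pi)
    numer = a * eta + b                             # shape (..., m)

    # The denominator is the sum of the numerators, so each row sums to 1.
    return numer / numer.sum(axis=-1, keepdims=True)

# Two U sets of equal size with class priors 0.8 and 0.2, balanced test prior.
rho = [0.5, 0.5]
pi = [0.8, 0.2]
probs = set_posteriors([0.0, 0.5, 1.0], rho, pi, test_prior=0.5)
print(probs)
```

In an end-to-end setup of the kind the abstract describes, a map like this could sit on top of a network's sigmoid output so that a standard multi-class cross-entropy loss on the set indices trains the underlying binary classifier; whether the paper implements it exactly this way is not stated in the abstract.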

08/31/2018

On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data

Empirical risk minimization (ERM), with proper loss function and regular...
10/19/2017

Binary Classification from Positive-Confidence Data

Reducing labeling costs in supervised learning is a critical issue in ma...
11/05/2020

Binary classification with ambiguous training data

In supervised learning, we often face ambiguous (A) samples that ar...
02/11/2021

Continuum centroid classifier for functional data

Aiming at the binary classification of functional data, we propose the c...
06/12/2019

Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

We study the problem of fair binary classification using the notion of E...
03/11/2022

Classification from Positive and Biased Negative Data with Skewed Labeled Posterior Probability

The binary classification problem has a situation where only biased data...
03/24/2022

The Dutch Draw: Constructing a Universal Baseline for Binary Prediction Models

Novel prediction methods should always be compared to a baseline to know...