Rademacher Complexity Bounds for a Penalized Multiclass Semi-Supervised Algorithm

07/02/2016
by Yury Maximov, et al.

We propose Rademacher complexity bounds for multiclass classifiers trained with a two-step semi-supervised model. In the first step, the algorithm partitions the partially labeled data and then uses the labeled training examples to identify dense clusters containing κ predominant classes, such that the proportion of their non-predominant classes is below a fixed threshold. In the second step, a classifier is trained by minimizing a margin empirical loss over the labeled training set together with a penalization term measuring the inability of the learner to predict the κ predominant classes of the identified clusters. The resulting data-dependent generalization error bound involves the margin distribution of the classifier, the stability of the clustering technique used in the first step, and Rademacher complexity terms corresponding to partially labeled training data. Our theoretical results exhibit convergence rates extending those proposed in the literature for the binary case, and experimental results on different multiclass classification problems provide empirical evidence that supports the theory.
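The cluster-selection part of the first step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `predominant_classes`, the use of `None` for unlabeled points, and the assumption that a partition (`cluster_ids`) has already been computed by some clustering method are all hypothetical choices made for illustration. The sketch keeps a cluster only if the share of its labeled points falling outside its κ most frequent classes is below the threshold.

```python
from collections import Counter


def predominant_classes(cluster_ids, labels, kappa, threshold):
    """Return {cluster_id: set of top-kappa classes} for the clusters kept.

    Hypothetical helper: labels[i] is None when point i is unlabeled, and
    cluster_ids[i] is the cluster assigned to point i by a prior partition.
    """
    # Count labeled examples per class inside each cluster.
    per_cluster = {}
    for cluster, label in zip(cluster_ids, labels):
        if label is not None:
            per_cluster.setdefault(cluster, Counter())[label] += 1

    selected = {}
    for cluster, counts in per_cluster.items():
        top = counts.most_common(kappa)
        total = sum(counts.values())
        covered = sum(n for _, n in top)
        # Proportion of labeled points from non-predominant classes.
        if (total - covered) / total < threshold:
            selected[cluster] = {cls for cls, _ in top}
    return selected


# Toy data: cluster 0 is dominated by two classes, cluster 1 is mixed.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1]
labels = ["a", "a", "b", None, "a", "b", "c", "d", None]
result = predominant_classes(cluster_ids, labels, kappa=2, threshold=0.2)
```

With these toy inputs only cluster 0 is kept, since half of the labeled points in cluster 1 fall outside its two most frequent classes. The penalized training of the second step is not covered by this sketch.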

research
11/12/2016

Dual Teaching: A Practical Semi-supervised Wrapper Method

Semi-supervised wrapper methods are concerned with building effective su...
research
11/29/2021

Self-Training of Halfspaces with Generalization Guarantees under Massart Mislabeling Noise Model

We investigate the generalization properties of a self-training algorith...
research
06/22/2011

Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction

For large, real-world inductive learning problems, the number of trainin...
research
05/20/2016

Fast Randomized Semi-Supervised Clustering

We consider the problem of clustering partially labeled data from a mini...
research
03/24/2022

Addressing Missing Sources with Adversarial Support-Matching

When trained on diverse labeled data, machine learning models have prove...
research
02/24/2022

Self-Training: A Survey

In recent years, semi-supervised algorithms have received a lot of inter...
research
05/04/2017

Semi-supervised model-based clustering with controlled clusters leakage

In this paper, we focus on finding clusters in partially categorized dat...