Sample Complexity of Nonparametric Semi-Supervised Learning

09/10/2018
by Chen Dan, et al.

We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an Ω(K log K) labeled sample complexity bound without imposing parametric assumptions, where K is the number of classes. Our results suggest that even in nonparametric settings it is possible to learn a near-optimal classifier using only a few labeled samples. Unlike previous theoretical work, which focuses on binary classification, we consider general multiclass classification (K>2), which requires solving a difficult permutation learning problem. This permutation defines a classifier whose classification error is controlled by the Wasserstein distance between mixing measures, and we provide finite-sample results characterizing the behaviour of the excess risk of this classifier. Finally, we describe three algorithms for computing these estimators based on a connection to bipartite graph matching, and perform experiments to illustrate the superiority of the MLE over the majority vote estimator.
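The permutation learning step can be illustrated with a small sketch. The abstract does not give the exact form of the estimators, so everything below is an assumption made for illustration: the mixture components are taken to be 1-D Gaussians with known weights, means, and standard deviations (standing in for a mixture already learned from unlabeled data), the MLE is written as a maximum-weight assignment between components and class labels, and the assignment is solved by exhaustive search over permutations, which is only practical for small K. A Hungarian-type bipartite matching solver would replace the exhaustive search for larger K. All function and variable names are hypothetical.

```python
import math
from itertools import permutations


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian (assumed component family)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def component_score(x, component):
    """log(weight * density) of x under one (weight, mean, std) component."""
    weight, mean, std = component
    return math.log(weight) + gaussian_logpdf(x, mean, std)


def weight_matrix(labeled, components):
    """w[k][b]: summed log-likelihood of the class-k labeled points under component b."""
    K = len(components)
    w = [[0.0] * K for _ in range(K)]
    for x, y in labeled:
        for b, comp in enumerate(components):
            w[y][b] += component_score(x, comp)
    return w


def mle_permutation(w):
    """Label for each component, chosen to maximize the total log-likelihood.

    This is a maximum-weight bipartite matching; brute force is used here
    only to keep the sketch dependency-free.
    """
    K = len(w)
    best = max(permutations(range(K)),
               key=lambda p: sum(w[p[b]][b] for b in range(K)))
    return list(best)


def majority_vote(labeled, components):
    """Label for each component by majority vote of the labeled points it captures.

    The result need not be a permutation; a component with no votes gets label 0.
    """
    K = len(components)
    votes = [[0] * K for _ in range(K)]  # votes[b][k]
    for x, y in labeled:
        b = max(range(K), key=lambda j: component_score(x, components[j]))
        votes[b][y] += 1
    return [max(range(K), key=lambda k: votes[b][k]) for b in range(K)]


# Toy data: three well-separated components and four labeled points (x, label).
components = [(1 / 3, -4.0, 1.0), (1 / 3, 0.0, 1.0), (1 / 3, 4.0, 1.0)]
labeled = [(3.8, 0), (-0.2, 1), (-4.1, 2), (4.3, 0)]

mle_labels = mle_permutation(weight_matrix(labeled, components))
vote_labels = majority_vote(labeled, components)
```

On this toy data both estimators map component 0 to class 2, component 1 to class 1, and component 2 to class 0. The two can disagree when components overlap or when some component receives few labeled points, since the majority vote uses only hard assignments while the likelihood-based matching uses the full scores.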


