Information-Theoretic Generalization Bounds for Iterative Semi-Supervised Learning

10/03/2021
by   Haiyun He, et al.

We consider iterative semi-supervised learning (SSL) algorithms that repeatedly generate pseudo-labels for a large amount of unlabelled data to progressively refine the model parameters. In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles. To obtain bounds that are amenable to numerical evaluation, we first work with a simple model, namely the binary Gaussian mixture model. Our theoretical results suggest that when the class-conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates. The theoretical results on the simple model are corroborated by extensive experiments on several benchmark datasets such as MNIST and CIFAR, in which we observe that the generalization error improves after several pseudo-labelling iterations, but saturates afterwards.
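
To make the setting concrete, here is a minimal sketch of iterative pseudo-labelling on a binary Gaussian mixture. It is not the authors' code. It assumes a balanced mixture with means `+mu` and `-mu`, a shared isotropic standard deviation `sigma`, a fresh unlabelled batch at every iteration, and a simple mean-based estimate of the parameter; all function names and numeric settings are hypothetical. It tracks the test misclassification rate as a rough proxy, not the information-theoretic bound itself.

```python
import numpy as np

def sample_gmm(rng, n, mu, sigma):
    """Draw n points from 0.5*N(mu, sigma^2 I) + 0.5*N(-mu, sigma^2 I)."""
    y = rng.choice([-1, 1], size=n)                      # true labels in {-1, +1}
    x = y[:, None] * mu + sigma * rng.standard_normal((n, mu.shape[0]))
    return x, y

def iterative_pseudo_label(rng, mu, sigma, n_labelled, m_unlabelled, num_iters):
    """Return the parameter estimates theta_0, ..., theta_T (assumed update rule)."""
    # theta_0: estimate of mu from the small labelled set.
    x_l, y_l = sample_gmm(rng, n_labelled, mu, sigma)
    theta = (y_l[:, None] * x_l).mean(axis=0)
    thetas = [theta]
    for _ in range(num_iters):
        # Fresh unlabelled batch; the true labels are discarded.
        x_u, _ = sample_gmm(rng, m_unlabelled, mu, sigma)
        # Pseudo-label each point with the current linear classifier.
        pseudo = np.where(x_u @ theta >= 0, 1, -1)
        # Re-estimate the parameter from the pseudo-labelled batch.
        theta = (pseudo[:, None] * x_u).mean(axis=0)
        thetas.append(theta)
    return thetas

def misclassification_rate(rng, theta, mu, sigma, n_eval=20000):
    """Estimate the error of sign(theta . x) on freshly drawn labelled data."""
    x, y = sample_gmm(rng, n_eval, mu, sigma)
    pred = np.where(x @ theta >= 0, 1, -1)
    return float(np.mean(pred != y))

rng = np.random.default_rng(0)
mu = np.array([1.0, 0.0])
sigma = 0.6
thetas = iterative_pseudo_label(rng, mu, sigma, n_labelled=10,
                                m_unlabelled=1000, num_iters=5)
errors = [misclassification_rate(rng, th, mu, sigma) for th in thetas]
for t, err in enumerate(errors):
    print(f"iteration {t}: test error ~ {err:.4f}")
```

Varying `sigma` in this sketch is one way to explore, informally, how the class-conditional variance affects whether further pseudo-labelling iterations help.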

research
11/18/2022

Why pseudo label based algorithm is effective? –from the perspective of pseudo labeled data

Recently, pseudo label based semi-supervised learning has achieved great...
research
02/04/2019

Generalization Bounds For Unsupervised and Semi-Supervised Learning With Autoencoders

Autoencoders are widely used for unsupervised learning and as a regulari...
research
09/12/2019

Generating Accurate Pseudo-labels via Hermite Polynomials for SSL Confidently

Rectified Linear Units (ReLUs) are among the most widely used activation...
research
10/23/2020

Jensen-Shannon Information Based Characterization of the Generalization Error of Learning Algorithms

Generalization error bounds are critical to understanding the performanc...
research
05/16/2022

Sharp Asymptotics of Self-training with Linear Classifier

Self-training (ST) is a straightforward and standard approach in semi-su...
research
06/19/2020

Statistical and Algorithmic Insights for Semi-supervised Learning with Self-training

Self-training is a classical approach in semi-supervised learning which ...
research
10/18/2022

Information-theoretic Characterizations of Generalization Error for the Gibbs Algorithm

Various approaches have been developed to upper bound the generalization...
