
Mathematical Foundations of Graph-Based Bayesian Semi-Supervised Learning

by Nicolas Garcia Trillos, et al.

In recent decades, science and engineering have been revolutionized by a momentous growth in the amount of available data. However, despite the unprecedented ease with which data are now collected and stored, labeling data by supplementing each feature with an informative tag remains challenging. Illustrative tasks where the labeling process requires expert knowledge or is tedious and time-consuming include labeling X-rays with a diagnosis, protein sequences with a protein type, texts by their topic, tweets by their sentiment, or videos by their genre. In these and numerous other examples, only a few features may be manually labeled due to cost and time constraints. How can we best propagate label information from a small number of expensive labeled features to a vast number of unlabeled ones? This is the question addressed by semi-supervised learning (SSL). This article overviews recent foundational developments on graph-based Bayesian SSL, a probabilistic framework for label propagation using similarities between features. SSL is an active research area, and a thorough review of the extant literature is beyond the scope of this article. Our focus will be on topics drawn from our own research that illustrate the wide range of mathematical tools and ideas that underlie the rigorous study of the statistical accuracy and computational efficiency of graph-based Bayesian SSL.




A Flexible Generative Framework for Graph-based Semi-supervised Learning

We consider a family of problems that are concerned about making predict...

Shoestring: Graph-Based Semi-Supervised Learning with Severely Limited Labeled Data

Graph-based semi-supervised learning has been shown to be one of the mos...

On the Supermodularity of Active Graph-based Semi-supervised Learning with Stieltjes Matrix Regularization

Active graph-based semi-supervised learning (AG-SSL) aims to select a sm...

Deep Categorization with Semi-Supervised Self-Organizing Maps

Nowadays, with the advance of technology, there is an increasing amount ...

On Consistency of Graph-based Semi-supervised Learning

Graph-based semi-supervised learning is one of the most popular methods ...

Graph-based Semi-supervised Learning: A Comprehensive Review

Semi-supervised learning (SSL) has tremendous value in practice due to i...

When Less is More: On the Value of "Co-training" for Semi-Supervised Software Defect Predictors

Labeling a module defective or non-defective is an expensive task. Hence...