Data driven algorithms for limited labeled data learning

03/18/2021
by   Maria-Florina Balcan, et al.

We consider a novel data-driven approach for designing learning algorithms that can learn effectively from only a small number of labeled examples. This is crucial for modern machine learning applications where labels are scarce or expensive to obtain. We focus on graph-based techniques, where the unlabeled examples are connected in a graph under the implicit assumption that similar nodes likely have similar labels. Over the past decades, several elegant graph-based semi-supervised and active learning algorithms have been proposed for inferring the labels of the unlabeled examples given the graph and a few labeled examples. However, the problem of how to create the graph, which significantly affects the practical usefulness of these methods, has been relegated to domain-specific art and heuristics, and no general principles have been proposed. In this work we present a novel data-driven approach for learning the graph and provide strong formal guarantees in both the distributional and online learning formalizations. We show how to leverage problem instances coming from an underlying problem domain to learn, for commonly used parametric families of graphs, hyperparameters that perform well on new instances from the same domain. We obtain efficient, low-regret algorithms in the online setting and generalization guarantees in the distributional setting. We also show how to combine several very different similarity metrics and learn multiple hyperparameters, providing general techniques that apply to large classes of problems. We expect some of the tools and techniques we develop along the way to be of interest beyond semi-supervised and active learning, for data-driven algorithms for combinatorial problems more generally.
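
To make the setup concrete, the following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' algorithm. It assumes a Gaussian-kernel graph family with a single bandwidth hyperparameter `sigma`, harmonic-function label inference for binary labels, and a plain grid search over average error across problem instances as a stand-in for the paper's learning procedure. All function names (`gaussian_graph`, `harmonic_labels`, `pick_sigma`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_graph(X, sigma):
    """Dense weighted graph with w_ij = exp(-||x_i - x_j||^2 / sigma^2).

    The Gaussian kernel is an assumed example of a parametric graph family.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / sigma ** 2)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def harmonic_labels(W, labeled_idx, y_labeled):
    """Infer binary labels on unlabeled nodes via the harmonic solution.

    Solves L_uu f_u = W_ul f_l, where L = D - W is the graph Laplacian,
    then thresholds the soft labels f_u at 0.5.
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    y_labeled = np.asarray(y_labeled)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L = np.diag(W.sum(axis=1)) - W
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, W_ul @ y_labeled.astype(float))
    preds = np.zeros(n, dtype=int)
    preds[labeled_idx] = y_labeled
    preds[unlabeled_idx] = (f_u >= 0.5).astype(int)
    return preds


def pick_sigma(instances, sigmas):
    """Grid search (an assumed stand-in): pick the sigma with the lowest
    average error on unlabeled nodes across the given problem instances.

    Each instance is a tuple (X, y, labeled_idx) with full labels y
    available for evaluation.
    """
    avg_errs = []
    for sigma in sigmas:
        errs = []
        for X, y, labeled_idx in instances:
            W = gaussian_graph(X, sigma)
            preds = harmonic_labels(W, labeled_idx, y[labeled_idx])
            mask = np.ones(len(y), dtype=bool)
            mask[labeled_idx] = False  # evaluate on unlabeled nodes only
            errs.append(np.mean(preds[mask] != y[mask]))
        avg_errs.append(float(np.mean(errs)))
    return sigmas[int(np.argmin(avg_errs))], avg_errs
```

In this sketch, each problem instance plays the role of a sample from the underlying domain, and the selected `sigma` would then be reused to build the graph on new instances. Very small values of `sigma` can make the linear system numerically singular, so the candidate grid should be chosen with the data scale in mind.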

research
06/12/2023

Efficiently Learning the Graph for Semi-supervised Learning

Computational efficiency is a major bottleneck in using classic graph-ba...
research
03/05/2023

Neuroevolutionary algorithms driven by neuron coverage metrics for semi-supervised classification

In some machine learning applications the availability of labeled instan...
research
11/19/2015

Semi-supervised Learning for Convolutional Neural Networks via Online Graph Construction

The recent promising achievements of deep learning rely on the large amo...
research
03/31/2022

Graph-based Active Learning for Semi-supervised Classification of SAR Data

We present a novel method for classification of Synthetic Aperture Radar...
research
05/22/2022

Deep Low-Density Separation for Semi-Supervised Classification

Given a small set of labeled data and a large set of unlabeled data, sem...
research
09/18/2023

A Semi-Supervised Approach for Power System Event Identification

Event identification is increasingly recognized as crucial for enhancing...
research
11/14/2020

Data-driven Algorithm Design

Data driven algorithm design is an important aspect of modern data scien...
