Lifting Weak Supervision To Structured Prediction

11/24/2022
by Harit Vishwakarma, et al.

Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources. WS is theoretically well understood for binary classification, where simple approaches enable consistent estimation of pseudolabel noise rates. Using this result, it has been shown that downstream models trained on the pseudolabels have generalization guarantees nearly identical to those trained on clean labels. While this is exciting, users often wish to use WS for structured prediction, where the output space consists of more than a binary or multi-class label set: e.g. rankings, graphs, manifolds, and more. Do the favorable theoretical properties of WS for binary classification lift to this setting? We answer this question in the affirmative for a wide range of scenarios. For labels taking values in a finite metric space, we introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions, providing a nearly-consistent noise rate estimator. For labels in constant-curvature Riemannian manifolds, we introduce new invariants that also yield consistent noise rate estimation. In both cases, when using the resulting pseudolabels in concert with a flexible downstream model, we obtain generalization guarantees nearly identical to those for models trained on clean data. Several of our results, which can be viewed as robustness guarantees in structured prediction with noisy labels, may be of independent interest. Empirical evaluation validates our claims and shows the merits of the proposed method.
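
The binary-classification machinery being lifted is worth making concrete. Below is a minimal sketch, in the spirit of the "simple approaches" the abstract credits with consistent noise-rate estimation: a triplet-style method-of-moments estimator for three weak sources. It assumes votes in {-1, +1}, class-symmetric noise, and conditional independence given the true label; the function names are illustrative, not from the paper.

```python
import numpy as np

# Under the stated assumptions, E[l_i * l_j] = mu_i * mu_j for i != j,
# where mu_i = E[l_i * y] = 2 * P(l_i = y) - 1. Three pairwise moments
# therefore pin down all three accuracies without ground-truth labels.

def triplet_accuracies(L):
    """L: (n, 3) array of {-1, +1} votes from three labeling functions.
    Returns estimates of mu_i, assuming each source beats random (mu_i > 0)."""
    m01 = np.mean(L[:, 0] * L[:, 1])
    m02 = np.mean(L[:, 0] * L[:, 2])
    m12 = np.mean(L[:, 1] * L[:, 2])
    mu0 = np.sqrt(abs(m01 * m02 / m12))
    mu1 = np.sqrt(abs(m01 * m12 / m02))
    mu2 = np.sqrt(abs(m02 * m12 / m01))
    return np.array([mu0, mu1, mu2])  # noise rate of source i: (1 - mu_i) / 2

# Sanity check on synthetic data with true accuracies 0.9, 0.8, 0.7.
rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=100_000)
accs = np.array([0.9, 0.8, 0.7])
flip = rng.random((100_000, 3)) < (1 - accs)   # flip each vote with prob 1 - acc
L = np.where(flip, -y[:, None], y[:, None])
print(triplet_accuracies(L))                   # approx 2 * accs - 1
```

For the structured setting, the abstract's ingredient for labels in a finite metric space is a pseudo-Euclidean embedding. The sketch below shows only the standard construction such an approach could build on (classical multidimensional scaling with the negative eigenvalues kept rather than discarded), not the paper's full pipeline: double-center the squared distance matrix, eigendecompose, and split coordinates by eigenvalue sign.

```python
import numpy as np

def pseudo_euclidean_embedding(D, tol=1e-9):
    """Embed n points with distance matrix D so that, for all i, j,
    D[i, j]**2 = ||Xp[i] - Xp[j]||^2 - ||Xn[i] - Xn[j]||^2 (up to floating
    point). Unlike a Euclidean embedding, this exists for any finite metric."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # Gram-like, possibly indefinite
    w, V = np.linalg.eigh(B)
    Xp = V[:, w > tol] * np.sqrt(w[w > tol])       # positive-signature coordinates
    Xn = V[:, w < -tol] * np.sqrt(-(w[w < -tol]))  # negative-signature coordinates
    return Xp, Xn
```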


Related research

12/07/2021 · Universalizing Weak Supervision
Weak supervision (WS) frameworks are a popular way to bypass hand-labeli...

03/21/2022 · Multi-class Label Noise Learning via Loss Decomposition and Centroid Estimation
In real-world scenarios, many large-scale datasets often contain inaccur...

05/15/2022 · Meta Self-Refinement for Robust Learning with Weak Supervision
Training deep neural networks (DNNs) with weak supervision has been a ho...

05/11/2022 · Weak Supervision with Incremental Source Accuracy Estimation
Motivated by the desire to generate labels for real-time data we develop...

01/24/2021 · Analysing the Noise Model Error for Realistic Noisy Label Data
Distant and weak supervision allow to obtain large amounts of labeled tr...

10/19/2020 · GANs for learning from very high class conditional noisy labels
We use Generative Adversarial Networks (GANs) to design a class conditio...

12/04/2022 · Label Encoding for Regression Networks
Deep neural networks are used for a wide range of regression problems. H...
