WRENCH: A Comprehensive Benchmark for Weak Supervision

09/23/2021
by Jieyu Zhang, et al.

Recent Weak Supervision (WS) approaches have had widespread success in easing the bottleneck of labeling training data for machine learning by synthesizing labels from multiple potentially noisy supervision sources. However, proper measurement and analysis of these approaches remain a challenge. First, datasets used in existing works are often private and/or custom, limiting standardization. Second, WS datasets with the same name and base data often vary in terms of the labels and weak supervision sources used, a significant "hidden" source of evaluation variance. Finally, WS studies often diverge in terms of the evaluation protocol and ablations used. To address these problems, we introduce a benchmark platform, WRENCH, for a thorough and standardized evaluation of WS approaches. It consists of 22 varied real-world datasets for classification and sequence tagging; a range of real, synthetic, and procedurally generated weak supervision sources; and a modular, extensible framework for WS evaluation, including implementations for popular WS methods. We use WRENCH to conduct extensive comparisons over more than 100 method variants to demonstrate its efficacy as a benchmark platform. The code is available at <https://github.com/JieyuZ2/wrench>.
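
To make "synthesizing labels from multiple potentially noisy supervision sources" concrete, the sketch below aggregates the votes of several sources by simple majority vote. It is an illustration only and does not use the WRENCH API. The names `majority_vote` and `label_matrix`, the use of `-1` for an abstaining source, and the smallest-label tie-break are all assumptions made for this example.

```python
from collections import Counter

ABSTAIN = -1  # assumed convention: a source that does not vote returns -1


def majority_vote(votes, abstain=ABSTAIN):
    """Return the most common non-abstain label, or `abstain` if no source voted."""
    counted = Counter(v for v in votes if v != abstain)
    if not counted:
        return abstain
    # Assumed tie-break: pick the smallest label so the result is deterministic.
    top = max(counted.values())
    return min(label for label, n in counted.items() if n == top)


# Hypothetical label matrix: one row per example, one column per weak source.
label_matrix = [
    [1, 1, 0],     # two sources say 1, one says 0
    [0, -1, 0],    # one source abstains
    [-1, -1, -1],  # every source abstains
    [1, 0, -1],    # tie between 0 and 1
]

aggregated = [majority_vote(row) for row in label_matrix]
print(aggregated)
```

Majority vote is only the simplest baseline for this step. Methods of the kind such a benchmark compares typically also estimate how accurate each source is, rather than weighting all sources equally.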


