Language Models in the Loop: Incorporating Prompting into Weak Supervision

05/04/2022
by Ryan Smith, et al.

We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than apply the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework. To create a classifier, we first prompt the model to answer multiple distinct queries about an example and define how the possible responses should be mapped to votes for labels and abstentions. We then denoise these noisy label sources using the Snorkel system and train an end classifier with the resulting training data. Our experimental evaluation shows that prompting large language models within a weak supervision framework can provide significant gains in accuracy. On the WRENCH weak supervision benchmark, this approach can significantly improve over zero-shot performance, an average 19.5% improvement. We also find that this approach produces classifiers with comparable or superior accuracy to those trained from hand-engineered rules.
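The pipeline the abstract describes can be illustrated with a short sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: `query_llm`, the three prompts, the keyword stub behind them, and the toy review data are hypothetical placeholders, and only the Snorkel `PandasLFApplier`/`LabelModel` calls reflect the actual library API. It shows prompts acting as labeling functions whose yes/no answers become label votes or abstentions, which Snorkel denoises before an end classifier is trained on the resulting labels.

```python
# Minimal sketch (not the authors' released code) of prompted weak supervision:
# each labeling function wraps a distinct prompt, the response is mapped to a
# label vote or an abstention, Snorkel's LabelModel denoises the votes, and an
# end classifier is trained on the result.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from snorkel.labeling import PandasLFApplier, labeling_function
from snorkel.labeling.model import LabelModel

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

POSITIVE_WORDS = ("love", "great", "excellent")
NEGATIVE_WORDS = ("terrible", "awful", "broke", "broken")


def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a large pre-trained language model.

    A trivial keyword heuristic is used only so the sketch runs end to end; in
    practice the prompt would be sent to the prompted model and its text
    response returned.
    """
    text = prompt.lower()
    if "praise" in text:
        return "yes" if any(w in text for w in POSITIVE_WORDS) else "no"
    if "complain" in text:
        return "yes" if any(w in text for w in NEGATIVE_WORDS) else "no"
    if "positive" in text:
        return "yes" if any(w in text for w in POSITIVE_WORDS) else "no"
    return "no"


@labeling_function()
def lf_praise(x):
    # Prompt 1: vote POSITIVE when the model says the review praises the product.
    answer = query_llm(f"Does this review praise the product? Answer yes or no. Review: {x.text}")
    return POSITIVE if answer.startswith("yes") else ABSTAIN


@labeling_function()
def lf_complaint(x):
    # Prompt 2: vote NEGATIVE when the model says the review complains.
    answer = query_llm(f"Does this review complain about the product? Answer yes or no. Review: {x.text}")
    return NEGATIVE if answer.startswith("yes") else ABSTAIN


@labeling_function()
def lf_sentiment(x):
    # Prompt 3: a yes/no sentiment question mapped directly to the two labels.
    answer = query_llm(f"Is the overall sentiment of this review positive? Answer yes or no. Review: {x.text}")
    return POSITIVE if answer.startswith("yes") else NEGATIVE


if __name__ == "__main__":
    df_train = pd.DataFrame({"text": [
        "I love this blender, it works great.",
        "Terrible quality, it broke after a week.",
        "Excellent value, would buy again.",
        "Awful customer service and a broken lid.",
    ]})

    # Apply every prompt-based labeling function to every unlabeled example.
    L_train = PandasLFApplier([lf_praise, lf_complaint, lf_sentiment]).apply(df_train)

    # Denoise the label votes with Snorkel's generative label model.
    label_model = LabelModel(cardinality=2, verbose=False)
    label_model.fit(L_train, n_epochs=200, seed=0)
    y_train = label_model.predict(L_train)

    # Train the end classifier on examples that received a non-abstain label.
    keep = y_train != ABSTAIN
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(df_train.text[keep])
    clf = LogisticRegression().fit(features, y_train[keep])
    print(clf.predict(vectorizer.transform(["It broke and the service was awful."])))
```

Note that mapping a "no" answer to an abstention rather than to the opposite label in the first two labeling functions is one possible design choice; the abstract leaves the mapping of each response to votes or abstentions up to the practitioner.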
