Zero-shot Text Classification With Generative Language Models

12/10/2019
by Raul Puri, et al.
This work investigates the use of natural language to enable zero-shot model adaptation to new tasks. We use text and metadata from social commenting platforms as a source for a simple pretraining task. We then provide the language model with natural language descriptions of classification tasks as input and train it to generate the correct answer in natural language via a language modeling objective. This allows the model to generalize to new classification tasks without the need for multiple multitask classification heads. We show the zero-shot performance of these generative language models, trained with weak supervision, on six benchmark text classification datasets from the torchtext library. Despite no access to training data, we achieve up to a 45% absolute improvement in classification accuracy over random or majority class baselines. These results show that natural language can serve as simple and powerful descriptors for task adaptation. We believe this points the way to new metalearning strategies for text problems.
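
To make the idea of describing a classification task in natural language concrete, here is a minimal sketch. It is not the authors' implementation: the prompt template, the function names `build_prompt`, `classify` and `toy_score`, and the keyword lexicon are all hypothetical. It also swaps free-form answer generation for picking the highest-scoring candidate answer, with a toy keyword scorer standing in for a language model's likelihood of the answer given the prompt.

```python
import re
from typing import Callable, Sequence


def build_prompt(text: str, question: str, choices: Sequence[str]) -> str:
    """Cast one classification example as a natural language task description.

    The template is an assumption made for illustration only.
    """
    options = ", ".join(choices)
    return f"Text: {text}\nQuestion: {question} Choices: {options}\nAnswer:"


def classify(
    text: str,
    question: str,
    choices: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate answer that `score` rates highest for the prompt.

    `score(prompt, answer)` is a placeholder for a language model's
    log-likelihood of `answer` following `prompt`.
    """
    prompt = build_prompt(text, question, choices)
    return max(choices, key=lambda answer: score(prompt, answer))


# Hypothetical lexicon so the sketch runs without a trained model.
_TOY_KEYWORDS = {
    "positive": {"great", "loved", "excellent"},
    "negative": {"terrible", "boring", "awful"},
}


def toy_score(prompt: str, answer: str) -> float:
    """Count lexicon hits for `answer` among the words of the prompt."""
    words = re.findall(r"[a-z]+", prompt.lower())
    keywords = _TOY_KEYWORDS.get(answer, set())
    return float(sum(word in keywords for word in words))


if __name__ == "__main__":
    label = classify(
        "The movie was great and I loved it.",
        "Is the sentiment positive or negative?",
        ["positive", "negative"],
        toy_score,
    )
    print(label)
```

Because the task lives entirely in the question and the answer choices, moving to a new classification task only means changing those strings; no task-specific classification head is involved.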
