It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

09/15/2020
by Timo Schick, et al.

When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance on challenging natural language understanding benchmarks. In this work, we show that performance similar to GPT-3 can be obtained with language models whose parameter count is several orders of magnitude smaller. This is achieved by converting textual inputs into cloze questions that contain some form of task description, combined with gradient-based optimization; additionally exploiting unlabeled data gives further improvements. Based on our findings, we identify several key factors required for successful natural language understanding with small language models.
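To make the cloze reformulation described above concrete, here is a minimal Python sketch of turning a classification input into a cloze question and mapping labels to candidate fill-in words. Everything in it is an illustrative assumption and not taken from the paper: the pattern text, the `[MASK]` placeholder, the two-label sentiment task, the `VERBALIZER` mapping, and the `toy_score` function, which merely stands in for a masked language model. The gradient-based optimization and the use of unlabeled data mentioned in the abstract are not shown.

```python
from typing import Callable, Dict

# Placeholder for the masked slot; the real mask token depends on the model (assumption).
MASK = "[MASK]"

# Hypothetical verbalizer: maps each label to one word that could fill the masked slot.
VERBALIZER: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def to_cloze(review: str) -> str:
    """Hypothetical pattern: append a short task description with one masked slot."""
    return f"{review} All in all, it was {MASK}."


def classify(review: str, mask_word_score: Callable[[str, str], float]) -> str:
    """Return the label whose verbalized word scores highest for the masked slot."""
    cloze = to_cloze(review)
    return max(VERBALIZER, key=lambda label: mask_word_score(cloze, VERBALIZER[label]))


def toy_score(cloze: str, word: str) -> float:
    """Stand-in for a masked language model's score. NOT a real model:
    it only counts hand-picked cue words that appear in the cloze text."""
    cues = {"great": ("loved", "delicious"), "terrible": ("awful", "cold")}
    return float(sum(cue in cloze.lower() for cue in cues[word]))


if __name__ == "__main__":
    print(to_cloze("I loved it, delicious pizza."))
    print(classify("I loved it, delicious pizza.", toy_score))
    print(classify("Awful service and cold food.", toy_score))
```

In a real setting, `toy_score` would be replaced by the probability a pretrained masked language model assigns to each candidate word at the masked position, and the model would then be tuned on the few labeled examples.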


