The Turking Test: Can Language Models Understand Instructions?

10/22/2020
by Avia Efrat et al.

Supervised machine learning provides the learner with a set of input-output examples of the target task. Humans, however, can also learn to perform new tasks from instructions in natural language. Can machines learn to understand instructions as well? We present the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity. These range from simple tasks, like retrieving the nth word of a sentence, to ones that require creativity, such as generating examples for SNLI and SQuAD in place of human intelligence workers ("turkers"). Despite our lenient evaluation methodology, we observe that a large pretrained language model performs poorly across all tasks. Analyzing the model's error patterns reveals that the model tends to ignore explicit instructions and often generates outputs that cannot be construed as an attempt to solve the task. While it is not yet clear whether instruction understanding can be captured by traditional language models, the sheer expressivity of instruction understanding makes it an appealing alternative to the rising few-shot inference paradigm.
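To make the setup concrete, here is a minimal sketch of what a Turking-Test-style task and its lenient evaluation might look like. All names and the instruction template are illustrative assumptions, not the paper's actual code; in the paper, the stubbed-out model would be a large pretrained language model, and the tasks range far beyond this simple example.

```python
# Illustrative sketch (not the paper's code): one simple instruction task
# from the "retrieve the nth word" family, scored with a lenient matcher
# that forgives case, whitespace, and trailing punctuation.

def make_prompt(n: int, sentence: str) -> str:
    # Hypothetical instruction template; the real prompts vary in complexity.
    return (
        f"Instruction: Write the word in position {n} of the following sentence.\n"
        f"Sentence: {sentence}\n"
        f"Answer:"
    )

def gold_answer(n: int, sentence: str) -> str:
    # Ground truth for this task: the nth word (1-indexed).
    return sentence.split()[n - 1]

def lenient_match(prediction: str, gold: str) -> bool:
    # Lenient evaluation: normalize before comparing, so near-miss
    # formatting (extra spaces, a trailing period, capitalization)
    # still counts as a correct attempt.
    def norm(s: str) -> str:
        return s.strip().strip(".,!?\"'").lower()
    return norm(prediction) == norm(gold)

sentence = "The quick brown fox jumps over the lazy dog"
print(make_prompt(3, sentence))
# A model output like " Brown. " would still be credited:
print(lenient_match(" Brown. ", gold_answer(3, sentence)))  # True
```

The point of the lenient matcher is methodological: if a model fails even when formatting slips are forgiven, the failure reflects instruction understanding rather than surface mismatches.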


Related research:

12/19/2022 · Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Instruction tuning enables pretrained language models to perform new tas...

05/22/2022 · Instruction Induction: From Few Examples to Natural Language Task Descriptions
Large language models are able to perform a task by conditioning on a fe...

04/12/2023 · LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
Cross-task generalization is a significant outcome that defines mastery ...

01/17/2023 · Are Language Models Worse than Humans at Following Prompts? It's Complicated
Prompts have been the center of progress in advancing language models' z...

12/04/2022 · Understanding How Model Size Affects Few-shot Instruction Prompting
Large Language Models are affected by the phenomena of memorizing and fo...

09/02/2021 · Do Prompt-Based Models Really Understand the Meaning of their Prompts?
Recently, a boom of papers have shown extraordinary progress in few-shot...

07/12/2017 · Source-Target Inference Models for Spatial Instruction Understanding
Models that can execute natural language instructions for situated robot...
