
Large Language Models Can Be Easily Distracted by Irrelevant Context

by Freda Shi, et al.

Large language models have achieved impressive performance on various natural language processing tasks. So far, however, they have been evaluated primarily on benchmarks where all information in the input context is relevant to solving the task. In this work, we investigate the distractibility of large language models, i.e., how problem-solving accuracy is affected by irrelevant context. In particular, we introduce Grade-School Math with Irrelevant Context (GSM-IC), an arithmetic reasoning dataset whose problem descriptions contain irrelevant information. We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models, and find that model performance decreases dramatically when irrelevant information is included. We also identify several approaches for mitigating this deficiency, such as decoding with self-consistency and adding an instruction to the prompt that tells the language model to ignore the irrelevant information.
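One mitigation mentioned above, decoding with self-consistency, samples several reasoning chains for the same problem and keeps the most frequent final answer. A minimal sketch of that voting step (the model call is stubbed out with a fixed list of sampled answers; `self_consistent_answer` is a hypothetical helper for illustration, not the authors' code):

```python
from collections import Counter

def self_consistent_answer(sampled_answers):
    """Majority vote over final answers extracted from
    independently sampled reasoning chains."""
    counts = Counter(sampled_answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical samples: three chain-of-thought runs on one GSM-IC-style
# problem. Two chains ignore the irrelevant sentence and agree; one chain
# is distracted by it and reaches a different answer.
samples = ["70", "70", "35"]
print(self_consistent_answer(samples))  # prints "70"
```

The instruction-based mitigation is orthogonal: a sentence such as "ignore irrelevant information in the problem description" is prepended to the prompt before sampling, and the two techniques can be combined.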



