Recognition, recall, and retention of few-shot memories in large language models

03/30/2023
by A. Emin Orhan et al.

The training of modern large language models (LLMs) takes place in a regime where most training examples are seen only a few times over the course of training. What does a model remember about examples seen only a few times, and how long does that memory persist in the face of continuous training with new examples? Here, we investigate these questions through simple recognition, recall, and retention experiments with LLMs. In recognition experiments, we ask whether the model can distinguish a seen example from a novel one; in recall experiments, we ask whether the model can correctly recall a seen example when cued by a part of it; and in retention experiments, we periodically probe the model's memory for the original examples as the model is trained continuously with new examples. We find that a single exposure is generally sufficient for a model to achieve near-perfect accuracy even in very challenging recognition experiments. We estimate that the recognition performance of even small language models easily exceeds the human recognition performance reported in similar experiments (Shepard, 1967). Achieving near-perfect recall takes more exposures, but most models can do it within just three. The flip side of this remarkable capacity for fast learning is that precise memories are quickly overwritten: recall performance for the original examples drops steeply over the first 10 training updates with new examples, followed by a more gradual decline. Even after 100K updates, however, some of the original examples are still recalled nearly perfectly. A qualitatively similar retention pattern has been observed in human long-term memory retention studies (Bahrick, 1984). Finally, recognition is much more robust to interference than recall, and memory for natural language sentences is generally superior to memory for unstructured stimuli.
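To make the recognition and recall protocols concrete, below is a minimal sketch of how such probes could be scored with an off-the-shelf causal language model. This is not the paper's code: the choice of `gpt2`, the loss-comparison criterion for recognition, the greedy-decoding exact-match criterion for recall, and the example sentences are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's implementation):
# - recognition is scored as a two-alternative forced choice in which the model
#   "recognizes" the old item if it assigns it a lower per-token loss than a
#   matched novel item;
# - recall is scored by greedily completing a cue and checking for an exact match.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # hypothetical stand-in for the models studied in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def per_token_loss(text: str) -> float:
    """Mean cross-entropy the model assigns to `text` (lower = more familiar)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss.item()

def recognition_trial(seen: str, novel: str) -> bool:
    """True if the model prefers the previously seen item over a matched foil."""
    return per_token_loss(seen) < per_token_loss(novel)

@torch.no_grad()
def recall_trial(cue: str, target: str) -> bool:
    """True if greedy completion of `cue` reproduces `target` exactly."""
    cue_ids = tokenizer(cue, return_tensors="pt").input_ids
    n_target = tokenizer(target, return_tensors="pt").input_ids.shape[1]
    out = model.generate(
        cue_ids,
        max_new_tokens=n_target,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    completion = tokenizer.decode(out[0, cue_ids.shape[1]:], skip_special_tokens=True)
    return completion.strip() == target.strip()

# Example probes (illustrative sentences only):
seen = "the quick brown fox jumps over the lazy dog"
novel = "the quick brown fox leaps over the sleepy cat"
print(recognition_trial(seen, novel))
print(recall_trial("the quick brown fox", "jumps over the lazy dog"))
```

Under this reading, a retention experiment would simply repeat these probes at fixed intervals while the model continues training on new examples; the probe-scheduling and training details are left out here because the abstract does not specify them.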
