Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

09/11/2023
by Mansi Sakarvadia, et al.

Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single-hop and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By enabling the LLM to incorporate this additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks, by up to 424%.
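
The injection idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that a "memory" is the sum of the embeddings of the memory tokens, scaled by a hypothetical magnitude `tau` and added to the attention output at one position. The tiny weight matrices and all function names are made up for illustration.

```python
import math

def inject_memory(attn_out, embed, memory_ids, tau):
    """Add tau * (sum of memory-token embeddings) to one attention output vector.

    Assumption: the memory is a plain sum of token embeddings; tau is a
    hypothetical scaling factor chosen by the user.
    """
    width = len(attn_out)
    memory = [sum(embed[t][k] for t in memory_ids) for k in range(width)]
    return [a + tau * m for a, m in zip(attn_out, memory)]

def next_token_probs(hidden, unembed):
    """Softmax over logits, where each logit is a row of unembed dotted with hidden."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in unembed]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def percent_increase(p_before, p_after):
    """Relative change in probability, expressed as a percentage."""
    return 100.0 * (p_after - p_before) / p_before

# Made-up toy weights: 3 tokens, hidden width 2.
# Token 1 plays the role of the injected memory; token 2 is the desired
# next token, whose unembedding row is aligned with the memory direction.
EMBED = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
UNEMBED = [[1.0, 0.0], [0.0, 0.2], [0.0, 1.0]]

hidden = [1.0, 0.0]  # stand-in for an attention output at the last position
injected = inject_memory(hidden, EMBED, memory_ids=[1], tau=2.0)

before = next_token_probs(hidden, UNEMBED)
after = next_token_probs(injected, UNEMBED)
gain = percent_increase(before[2], after[2])
```

In this toy setup the probability of the desired token rises after injection only because its unembedding row was chosen to align with the memory direction. In a real model, whether an injection helps depends on the layer, the memory, and the magnitude.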


