Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis

05/24/2023
by Alessandro Stolfo, et al.

Mathematical reasoning in large language models (LLMs) has garnered attention in recent research, but there is limited understanding of how these models process and store information related to arithmetic tasks. In this paper, we present a mechanistic interpretation of LLMs for arithmetic-based questions using a causal mediation analysis framework. By intervening on the activations of specific model components and measuring the resulting changes in predicted probabilities, we identify the subset of parameters responsible for specific predictions. We analyze two pre-trained language models of different sizes (2.8B and 6B parameters). Experimental results reveal that a small set of mid-late layers significantly affects predictions for arithmetic-based questions, with distinct activation patterns for correct and incorrect predictions. We also investigate the role of the attention mechanism and compare the model's activation patterns on arithmetic queries with those observed when it predicts factual knowledge. Our findings provide insights into the mechanistic interpretation of LLMs for arithmetic tasks and highlight the specific components involved in arithmetic reasoning.
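The intervention described in the abstract (overwrite one component's activation, then measure how the predicted probability changes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy residual network, the names `forward` and `indirect_effects`, the hand-picked weights, and the plain probability difference used as the effect measure. It is not the paper's models, code, or metric.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, layers, readout, patch=None):
    """Run a toy residual network (hypothetical stand-in for an LLM).

    patch = (layer_index, activation) overwrites that layer's output,
    which is the intervention step of the mediation analysis.
    Returns the output distribution and every layer's activation.
    """
    h = list(x)
    acts = []
    for i, W in enumerate(layers):
        out = [math.tanh(v) for v in matvec(W, h)]
        if patch is not None and patch[0] == i:
            out = list(patch[1])
        acts.append(out)
        h = [a + b for a, b in zip(h, out)]  # residual connection
    return softmax(matvec(readout, h)), acts

def indirect_effects(x_base, x_alt, layers, readout, target):
    """For each layer, patch in the activation recorded on x_alt while
    running on x_base, and report the change in P(target)."""
    p_base, _ = forward(x_base, layers, readout)
    _, alt_acts = forward(x_alt, layers, readout)
    effects = []
    for i in range(len(layers)):
        p_patched, _ = forward(x_base, layers, readout, patch=(i, alt_acts[i]))
        effects.append(p_patched[target] - p_base[target])
    return effects

# Illustrative weights and inputs (assumed, not taken from any real model).
layers = [[[0.5, -0.2], [0.1, 0.3]], [[0.2, 0.4], [-0.3, 0.1]]]
readout = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
effects = indirect_effects([1.0, 0.0], [0.0, 1.0], layers, readout, target=0)
print(effects)  # one probability change per patched layer
```

A layer whose patched activation moves the target probability far from its baseline value mediates more of the input's effect on the prediction. Patching with activations recorded on the same input leaves the output unchanged, which is a useful sanity check for any implementation of this kind.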
