MathPrompter: Mathematical Reasoning using Large Language Models

03/04/2023
by Shima Imani, et al.

Large Language Models (LLMs) perform poorly on arithmetic reasoning tasks and often provide incorrect answers. Unlike natural language understanding, math problems typically have a single correct answer, which makes generating accurate solutions more challenging for LLMs. To the best of our knowledge, no LLM indicates its level of confidence in its responses, which fuels a trust deficit in these models and impedes their adoption. To address this deficiency, we propose MathPrompter, a technique that improves the performance of LLMs on arithmetic problems and increases the reliance that can be placed on their predictions. MathPrompter uses the zero-shot chain-of-thought (CoT) prompting technique to generate multiple algebraic expressions or Python functions that solve the same math problem in different ways, thereby raising the confidence level in the output results. This is in contrast to other prompt-based CoT methods, which do not check the validity of the intermediate steps. Our technique improves over the state of the art on the MultiArith dataset (78.7% → 92.5%), evaluated using a 175B-parameter GPT-based LLM.
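
The cross-checking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: no LLM is called, and the two hard-coded functions are hypothetical stand-ins for an algebraic expression and a Python function that a model might return for one templated word problem. The function names, the number of random trials, and the value range are all assumptions made for illustration.

```python
import random

# Illustrative problem template (an assumption, not taken from the abstract):
# "Each adult meal costs A dollars and kids eat free. A group of B people
#  includes C kids. How much does the group pay?"

def algebraic_solution(a, b, c):
    # Hypothetical stand-in for an LLM-generated algebraic expression.
    return a * (b - c)

def python_solution(a, b, c):
    # Hypothetical stand-in for an LLM-generated Python function.
    adults = b - c
    return adults * a

def solutions_agree(candidates, n_trials=5, seed=0):
    """Evaluate every candidate on the same random inputs and check consensus."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        args = [rng.randint(1, 100) for _ in range(3)]
        results = {f(*args) for f in candidates}
        if len(results) != 1:
            return False  # candidates disagree on at least one input
    return True

def answer_with_check(candidates, values, n_trials=5):
    """Return an answer only if all candidates agree on the random trials."""
    if not solutions_agree(candidates, n_trials):
        return None  # no consensus, so no answer is reported
    return candidates[0](*values)

if __name__ == "__main__":
    print(answer_with_check([algebraic_solution, python_solution], (5, 15, 8)))
```

Agreement between independently derived solutions on random inputs does not prove correctness, but it gives a simple signal that can be used to accept or withhold an answer.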

research
11/18/2022

PAL: Program-aided Language Models

Large language models (LLMs) have recently demonstrated an impressive ab...
research
05/24/2023

Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis

Mathematical reasoning in large language models (LLMs) has garnered atte...
research
05/11/2017

Program Induction by Rationale Generation : Learning to Solve and Explain Algebraic Word Problems

Solving algebraic word problems requires executing a series of arithmeti...
research
05/23/2023

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Large language models (LLMs) have demonstrated remarkable capabilities i...
research
05/24/2023

Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective

Recent studies have discovered that Chain-of-Thought prompting (CoT) can...
research
11/25/2022

Solving math word problems with process- and outcome-based feedback

Recent work has shown that asking language models to generate reasoning ...
research
10/27/2021

Training Verifiers to Solve Math Word Problems

State-of-the-art language models can match human performance on many tas...
