Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks

11/22/2022
by Wenhu Chen, et al.

Recently, there has been significant progress in teaching language models to perform step-by-step reasoning for complex numerical reasoning tasks. Chain-of-thought prompting (CoT) is by far the state-of-the-art method for these tasks. CoT uses language models to perform both reasoning and computation within the multi-step `thought' process. To disentangle computation from reasoning, we propose `Program of Thoughts' (PoT), which uses language models (mainly Codex) to express the reasoning process as a program. The computation is relegated to an external computer, which executes the generated programs to derive the answer. We evaluate PoT on five math word problem datasets (GSM, AQuA, SVAMP, TabMWP, MultiArith) and three financial-QA datasets (FinQA, ConvFinQA, TATQA) in both few-shot and zero-shot setups. In both settings, PoT shows an average performance gain of around 12% over CoT across all the evaluated datasets. By combining PoT with self-consistency decoding, we achieve SoTA performance on all math problem datasets and near-SoTA performance on the financial datasets. All of our data and code are released on GitHub: <https://github.com/wenhuchen/Program-of-Thoughts>.
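To make the separation concrete, below is a minimal sketch of the PoT pipeline. It is illustrative only: the word problem, the variable names, and the host-side driver are assumptions for this example, not the paper's exact prompt format (though PoT prompts do conventionally bind the final result to a variable such as `ans`). The model emits a Python program, and an external interpreter, not the language model, performs the arithmetic.

```python
# Sketch of PoT: the LM generates a program; an external
# interpreter executes it to derive the answer.

# Word problem (hypothetical): "A store sells pens at $2 each.
# Tom buys 3 packs of 12 pens and pays with a $100 bill.
# How much change does he get?"

# --- program as the LM might generate it ---
generated_program = """
pens_per_pack = 12
num_packs = 3
price_per_pen = 2
total_cost = pens_per_pack * num_packs * price_per_pen
change = 100 - total_cost
ans = change
"""

# --- host-side execution: the 'external computer' step ---
scope = {}
exec(generated_program, scope)  # all computation happens here, not in the LM
print(scope["ans"])             # -> 28
```

The design point is that the model only has to get the reasoning structure right (which quantities to combine, and how); exact arithmetic, where LMs are error-prone, is delegated to the interpreter.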


