Discriminator-Guided Multi-step Reasoning with Language Models

05/24/2023
by Muhammad Khalifa et al.

In the context of multi-step reasoning, language model (LM) probabilities are often miscalibrated: solutions with high probabilities are not always correct. Greedy decoding, the standard decoding method for reasoning tasks, therefore often yields incorrect solutions. Moreover, methods such as self-consistency and verifiers rely on sampling from the LM distribution and do not tackle the underlying calibration issue. To address this, we introduce Guiding Multi-step ReAsoning with a CorrectnEss Discriminator (GRACE), a stepwise decoding approach that steers the model toward producing correct reasoning steps. GRACE employs a discriminator model, trained to distinguish correct steps from invalid ones, to adjust decoding preferences based on the correctness of each reasoning step. Importantly, GRACE does not require fine-tuning or re-training the LM. Across four popular math reasoning benchmarks, GRACE yields significant improvements in both final-answer accuracy and step correctness, outperforming both greedy decoding and self-consistency. Our code can be found at <https://github.com/mukhal/grace>.
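The core idea (re-rank candidate reasoning steps by combining the LM's likelihood with a discriminator's correctness score) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `lm_step_candidates`, `discriminator_score`, and the weighting parameter `beta` are all hypothetical interfaces assumed for the example.

```python
def grace_decode(lm_step_candidates, discriminator_score, max_steps=10, beta=1.0):
    """Sketch of discriminator-guided stepwise decoding.

    lm_step_candidates(prefix) -> list of (step_text, log_prob) candidate
    next reasoning steps sampled from the LM (assumed interface).
    discriminator_score(prefix, step) -> scalar correctness score for a
    candidate step given the solution so far (assumed interface).
    """
    prefix = []
    for _ in range(max_steps):
        candidates = lm_step_candidates(prefix)
        if not candidates:
            break
        # Re-rank: LM log-probability plus a weighted discriminator
        # correctness score, then commit the highest-scoring step.
        best_step, _ = max(
            candidates,
            key=lambda c: c[1] + beta * discriminator_score(prefix, c[0]),
        )
        prefix.append(best_step)
        # Stop once a final-answer step is emitted (toy convention).
        if best_step.strip().startswith("The answer is"):
            break
    return prefix
```

With a sufficiently large `beta`, the discriminator can override a miscalibrated LM that assigns higher probability to an incorrect step, which is the failure mode the abstract describes.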

Related research

- Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding (05/01/2023)
- ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness (04/21/2023)
- Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs (05/23/2023)
- Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs (05/19/2023)
- On the Advance of Making Language Models Better Reasoners (06/06/2022)
- No Train Still Gain: Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function (09/01/2023)
- Detoxify Language Model Step-by-Step (08/16/2023)
