Large Language Models are reasoners with Self-Verification

12/19/2022
by   Yixuan Weng, et al.
0

When a large language model (LLM) performs complex reasoning by chain of thought (CoT), it can be highly sensitive to individual mistakes. We have had to train verifiers to address this issue. As we all know, after human inferring a conclusion, they often check it by re-verifying it, which can avoid some mistakes. We propose a new method called self-verification that uses the conclusion of the CoT as a condition to build a new sample and asks the LLM to re-predict the original conditions which be masked. We calculate an explainable verification score based on the accuracy. This method can improve the accuracy of multiple arithmetics and logical reasoning datasets when using few-shot learning. we have demonstrated that LLMs can conduct explainable self-verification of their own conclusions and achieve competitive reasoning performance. Extensive experimentals have demonstrated that our method can help multiple large language models with self-verification can avoid interference from incorrect CoT. Code is available at <https://github.com/WENGSYX/Self-Verification>

READ FULL TEXT

page 8

page 17

page 18

page 19

page 20

page 21

page 23

research
06/06/2023

Deductive Verification of Chain-of-Thought Reasoning

Large Language Models (LLMs) significantly benefit from Chain-of-Thought...
research
08/08/2023

Cumulative Reasoning with Large Language Models

While language models are powerful and versatile, they often fail to add...
research
04/12/2023

Boosted Prompt Ensembles for Large Language Models

Methods such as chain-of-thought prompting and self-consistency have pus...
research
04/19/2022

Table-based Fact Verification with Self-adaptive Mixture of Experts

The table-based fact verification task has recently gained widespread at...
research
08/15/2023

Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 ha...
research
05/23/2023

Automatic Model Selection with Large Language Models for Reasoning

Chain-of-Thought and Program-Aided Language Models represent two distinc...
research
06/22/2023

DiversiGATE: A Comprehensive Framework for Reliable Large Language Models

In this paper, we introduce DiversiGATE, a unified framework that consol...

Please sign up or login with your details

Forgot password? Click here to reset