Examining the Inter-Consistency of Large Language Models: An In-depth Analysis via Debate

05/19/2023
by Kai Xiong, et al.

Large Language Models (LLMs) have demonstrated human-like intelligence and are widely used in various applications. However, LLMs still exhibit various kinds of inconsistency problems. Existing works mainly focus on inconsistency issues within a single LLM, while we investigate the inter-consistency among multiple LLMs, which is critical for collaborating to solve a complex task. To examine whether LLMs can collaborate to ultimately reach a consensus on a shared goal, and how easily LLMs change their viewpoints, we introduce a Formal Debate framework (FORD). With FORD, we conduct a three-stage debate aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Through extensive experiments on commonsense reasoning tasks, we show that LLMs not only become more inter-consistent but also achieve higher performance. Moreover, we observe that stronger LLMs tend to dominate the debates by adhering to their perspectives, while weaker ones are more likely to change viewpoints. Additionally, we highlight the importance of a competent judge, such as GPT-4, for drawing proper conclusions. Our work contributes to understanding the inter-consistency among LLMs and lays the foundation for the development of future collaboration methods.
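The debate protocol the abstract describes can be sketched as a simple loop: two debaters exchange answers over several rounds, and a judge draws the final conclusion from the transcript. The sketch below is purely illustrative and is not the authors' implementation; the stub models (a "stubborn" strong model and a "swayable" weak model, mirroring the behavior the abstract reports) are hypothetical stand-ins for real LLM API calls.

```python
# Illustrative sketch of a two-model debate loop in the spirit of FORD.
# The debater and judge functions are toy stubs, NOT the paper's method:
# in practice each would query a real LLM with the question and transcript.

def debate(question, debater_a, debater_b, judge, rounds=3):
    """Run up to `rounds` debate rounds, then let the judge conclude."""
    transcript = []
    answer_a = debater_a(question, transcript)
    answer_b = debater_b(question, transcript)
    for _ in range(rounds):
        if answer_a == answer_b:  # consensus reached, stop early
            break
        transcript.append(("A", answer_a))
        transcript.append(("B", answer_b))
        # Each debater reconsiders given the opponent's latest argument.
        answer_a = debater_a(question, transcript)
        answer_b = debater_b(question, transcript)
    # A competent judge draws the final conclusion from the transcript.
    return judge(question, transcript, answer_a, answer_b)

# Toy stubs illustrating the reported dynamic: stronger models keep
# their viewpoint, weaker ones are more likely to change theirs.
def strong_model(question, transcript):
    return "yes"  # always adheres to its own perspective

def weak_model(question, transcript):
    # Concedes to the opponent's last stated answer, if any.
    opponent_answers = [ans for who, ans in transcript if who == "A"]
    return opponent_answers[-1] if opponent_answers else "no"

def simple_judge(question, transcript, a, b):
    return a if a == b else "undecided"

print(debate("Is ice less dense than water?",
             strong_model, weak_model, simple_judge))  # -> yes
```

In this toy run the debaters disagree in round one, the weak model adopts the strong model's answer in round two, and the judge reports the consensus, which is the kind of convergence behavior the abstract measures.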


Related research

04/22/2023
Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs
Language models have become very popular recently and many claims have b...

05/26/2023
Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
As large language models (LLMs) are continuously being developed, their ...

09/10/2021
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding
Large-scale, pre-trained language models (LMs) have achieved human-level...

09/15/2023
Self-Consistent Narrative Prompts on Abductive Natural Language Inference
Abduction has long been seen as crucial for narrative comprehension and ...

08/17/2023
Semantic Consistency for Assuring Reliability of Large Language Models
Large Language Models (LLMs) exhibit remarkable fluency and competence a...

08/15/2021
Accurate, yet inconsistent? Consistency Analysis on Language Understanding Models
Consistency, which refers to the capability of generating the same predi...

07/15/2023
Large Language Models as Superpositions of Cultural Perspectives
Large Language Models (LLMs) are often misleadingly recognized as having...
