Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models

07/05/2023
by   Aidan O'Gara, et al.
0

Are current language models capable of deception and lie detection? We study this question by introducing a text-based game called Hoodwinked, inspired by Mafia and Among Us. Players are locked in a house and must find a key to escape, but one player is tasked with killing the others. Each time a murder is committed, the surviving players have a natural language discussion then vote to banish one player from the game. We conduct experiments with agents controlled by GPT-3, GPT-3.5, and GPT-4 and find evidence of deception and lie detection capabilities. The killer often denies their crime and accuses others, leading to measurable effects on voting outcomes. More advanced models are more effective killers, outperforming smaller models in 18 of 24 pairwise comparisons. Secondary metrics provide evidence that this improvement is not mediated by different actions, but rather by stronger persuasive skills during discussions. To evaluate the ability of AI agents to deceive humans, we make this game publicly available at h https://hoodwinked.ai/ .

READ FULL TEXT
research
02/21/2023

Playing the Werewolf game with artificial intelligence for language understanding

The Werewolf game is a social deduction game based on free natural langu...
research
07/25/2022

WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

While vision-and-language models perform well on tasks such as visual qu...
research
09/24/2022

Learning Chess With Language Models and Transformers

Representing a board game and its positions by text-based notation enabl...
research
08/15/2023

CALYPSO: LLMs as Dungeon Masters' Assistants

The role of a Dungeon Master, or DM, in the game Dungeons Dragons is...
research
01/08/2019

The power of moral words: Loaded language generates framing effects in the extreme dictator game

Understanding whether preferences are sensitive to the frame has been a ...
research
05/17/2023

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

We study whether multiple large language models (LLMs) can autonomously ...
research
05/13/2023

Investigating Emergent Goal-Like Behaviour in Large Language Models Using Experimental Economics

In this study, we investigate the capacity of large language models (LLM...

Please sign up or login with your details

Forgot password? Click here to reset