Towards the Scalable Evaluation of Cooperativeness in Language Models

03/16/2023
by Alan Chan, et al.

It is likely that AI systems driven by pre-trained language models (PLMs) will increasingly be used to assist humans in high-stakes interactions with other agents, such as negotiation or conflict resolution. Consistent with the goals of Cooperative AI <cit.>, we wish to understand and shape the multi-agent behaviours of PLMs in a pro-social manner. An important first step is the evaluation of model behaviour across diverse cooperation problems. Since the desired behaviour in an interaction depends upon its precise game-theoretic structure, we focus on generating scenarios with particular structures, using both crowdworkers and a language model. Our work proceeds as follows. First, we discuss key methodological issues in the generation of scenarios corresponding to particular game-theoretic structures. Second, we employ both crowdworkers and a language model to generate such scenarios. We find that the quality of generations tends to be mediocre in both cases. We additionally ask both crowdworkers and a language model to judge whether given scenarios align with their intended game-theoretic structure, finding mixed results depending on the game. Third, we provide a dataset of scenarios based on the data we generated. We provide both quantitative and qualitative evaluations of UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to act in a way that could be perceived as cooperative when scaled up, while other models seem to have flat scaling trends.

