Reward Design with Language Models

02/27/2023
by Minae Kwon, et al.

Reward design in reinforcement learning (RL) is challenging since specifying human notions of desired behavior through a reward function can be difficult, or can require many expert demonstrations. Can we instead cheaply design rewards using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function, where the user provides a textual prompt containing a few examples (few-shot) or a description (zero-shot) of the desired behavior. Our approach leverages this proxy reward function in an RL framework. Specifically, users specify a prompt once at the beginning of training. During training, the LLM evaluates the RL agent's behavior against the desired behavior described in the prompt and outputs a corresponding reward signal. The RL agent then uses this reward to update its behavior. We evaluate whether our approach can train agents aligned with user objectives in the Ultimatum Game, matrix games, and the DealOrNoDeal negotiation task. In all three tasks, we show that RL agents trained with our framework are well-aligned with the user's objectives and outperform RL agents trained with reward functions learned via supervised learning.
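To make the loop concrete, here is a minimal, self-contained sketch of the idea described above: the user's prompt states the desired behavior, an LLM judges each episode against that prompt, and its Yes/No answer becomes a binary reward for a simple learner. The toy Ultimatum Game setup, the `llm_judge` stub (which stands in for a real GPT-3 call), and all other names are illustrative assumptions, not the authors' actual code.

```python
import random

# User-written prompt describing the desired behavior (zero-shot style).
USER_PROMPT = "The responder should only accept offers of at least 30% of the total."

def llm_judge(prompt: str, episode: str) -> str:
    """Stand-in for a GPT-3 call. A real implementation would send
    prompt + episode text to the LLM and return its generated answer;
    this stub applies the rule in USER_PROMPT directly so the sketch runs."""
    offer = int(episode.split("offered ")[1].split("%")[0])
    accepted = "accepted" in episode
    aligned = (accepted and offer >= 30) or (not accepted and offer < 30)
    return "Yes" if aligned else "No"

def proxy_reward(prompt: str, episode: str) -> float:
    """Map the LLM's Yes/No judgement of the episode to a reward signal."""
    return 1.0 if llm_judge(prompt, episode).lower().startswith("yes") else 0.0

# Tabular responder: probability of accepting each offer level, trained
# with a simple bandit-style update driven by the LLM-derived reward.
accept_prob = {offer: 0.5 for offer in range(0, 101, 10)}

for step in range(2000):
    offer = random.choice(list(accept_prob))
    accepted = random.random() < accept_prob[offer]
    episode = (f"The proposer offered {offer}% and the responder "
               f"{'accepted' if accepted else 'rejected'} it.")
    r = proxy_reward(USER_PROMPT, episode)
    # Nudge the acceptance probability toward (or away from) the taken action.
    direction = 1 if accepted else -1
    accept_prob[offer] = min(1.0, max(0.0, accept_prob[offer] + 0.05 * direction * (r - 0.5)))

print({k: round(v, 2) for k, v in accept_prob.items()})
```

Running the sketch, the acceptance probabilities drift toward 1 for offers of 30% and above and toward 0 below that threshold, illustrating how the LLM's judgement alone can steer the agent toward the behavior the prompt describes.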


Related research

02/08/2023: Temporal Video-Language Alignment Network for Reward Shaping in Reinforcement Learning
  Designing appropriate reward functions for Reinforcement Learning (RL) a...

03/30/2023: Language Models can Solve Computer Tasks
  Agents capable of carrying out general tasks on a computer can improve e...

11/29/2021: Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions
  Reinforcement learning (RL) agents are widely used for solving complex s...

03/19/2023: CLIP4MC: An RL-Friendly Vision-Language Model for Minecraft
  One of the essential missions in the AI research community is to build a...

01/10/2022: The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
  Reward hacking – where RL agents exploit gaps in misspecified reward fun...

10/19/2018: Supervising strong learners by amplifying weak experts
  Many real world learning tasks involve complex or hard-to-specify object...

02/20/2023: Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
  When learning task-oriented dialogue (ToD) agents, reinforcement learnin...
