Inferring Rewards from Language in Context

04/05/2022
by Jessy Lin, et al.

In classic instruction following, language like "I'd like the JetBlue flight" maps to actions (e.g., selecting that flight). However, language also conveys information about a user's underlying reward function (e.g., a general preference for JetBlue), which can allow a model to carry out desirable actions in new contexts. We present a model that infers rewards from language pragmatically: reasoning about how speakers choose utterances not only to elicit desired actions, but also to reveal information about their preferences. On a new interactive flight-booking task with natural language, our model more accurately infers rewards and predicts optimal actions in unseen environments, in comparison to past work that first maps language to actions (instruction following) and then maps actions to rewards (inverse reinforcement learning).
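The idea of reading a preference, not just an action, out of an utterance can be illustrated with a toy sketch. This is not the paper's model (which works with natural language); it is a minimal Bayesian listener under stated assumptions: flights are hypothetical binary feature vectors, utterances simply name one feature, rewards are linear, and the speaker picks utterances by a softmax over the reward of the action a literal listener would take. All names here (`FEATURES`, `beta`, `grid`, the helper functions) are invented for illustration.

```python
import itertools
import math

# Hypothetical toy setup: each flight is a feature vector (is_jetblue, is_cheap).
FEATURES = ("jetblue", "cheap")

def reward(weights, flight):
    """Linear reward: dot product of preference weights and flight features."""
    return sum(w * f for w, f in zip(weights, flight))

def literal_listener(utterance, flights):
    """Flights that have the named feature (all flights if none match)."""
    idx = FEATURES.index(utterance)
    matches = [f for f in flights if f[idx] == 1]
    return matches or list(flights)

def speaker_prob(utterance, weights, flights, beta=3.0):
    """P(utterance | weights, context): softmax over the expected reward
    of the flight a literal listener would pick uniformly from the matches."""
    def utility(u):
        chosen = literal_listener(u, flights)
        return sum(reward(weights, f) for f in chosen) / len(chosen)
    scores = {u: math.exp(beta * utility(u)) for u in FEATURES}
    return scores[utterance] / sum(scores.values())

def infer_reward(utterance, flights, grid=(-1, 0, 1)):
    """Posterior over candidate weight vectors under a uniform prior."""
    candidates = list(itertools.product(grid, repeat=len(FEATURES)))
    likelihoods = [speaker_prob(utterance, w, flights) for w in candidates]
    total = sum(likelihoods)
    return {w: l / total for w, l in zip(candidates, likelihoods)}

def best_flight(posterior, flights):
    """Pick the flight with the highest posterior-expected reward."""
    def expected(f):
        return sum(p * reward(w, f) for w, p in posterior.items())
    return max(flights, key=expected)

# Observed context: the user says "jetblue" while choosing among these flights.
seen = [(1, 1), (0, 1), (0, 0)]
posterior = infer_reward("jetblue", seen)

# New context: here the JetBlue flight is no longer cheap.
unseen = [(1, 0), (0, 1), (0, 0)]
choice = best_flight(posterior, unseen)
```

In this toy case, saying "jetblue" rather than "cheap" shifts posterior mass toward a positive JetBlue weight, so `choice` is the JetBlue flight `(1, 0)` in the new context even though that exact flight never appeared before.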

research
08/16/2020

Inverse Reinforcement Learning with Natural Language Goals

Humans generally use natural language to communicate task requirements a...
research
02/20/2019

From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following

Reinforcement learning is a promising framework for solving control prob...
research
04/11/2022

Linguistic communication as (inverse) reward design

Natural language is an intuitive and expressive way to communicate rewar...
research
08/27/2019

Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Training chatbots using the reinforcement learning paradigm is challengi...
research
10/21/2019

Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight

We propose a joint simulation and real-world learning framework for mapp...
research
10/09/2021

Active Altruism Learning and Information Sufficiency for Autonomous Driving

Safe interaction between vehicles requires the ability to choose actions...
research
05/04/2023

Language, Time Preferences, and Consumer Behavior: Evidence from Large Language Models

Language has a strong influence on our perceptions of time and rewards. ...
