A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents

05/26/2023
by Sukai Huang, et al.

Teaching agents to follow complex written instructions has been an important yet elusive goal. One technique for improving learning efficiency is language reward shaping (LRS), which is used in reinforcement learning (RL) to reward actions that represent progress towards a sparse reward. We argue that the apparent success of LRS is brittle, and that prior positive findings can be attributed to weak RL baselines. Specifically, we identified suboptimal LRS designs that reward partially matched trajectories, and we characterised a novel type of reward perturbation that captures this issue based on the concept of loosening task constraints. We provided theoretical and empirical evidence that agents trained using LRS rewards converge more slowly than pure RL agents.

research
02/08/2023

Temporal Video-Language Alignment Network for Reward Shaping in Reinforcement Learning

Designing appropriate reward functions for Reinforcement Learning (RL) a...
research
05/18/2019

Evolving Rewards to Automate Reinforcement Learning

Many continuous control tasks have easily formulated objectives, yet usi...
research
11/08/2022

Learning to Follow Instructions in Text-Based Games

Text-based games present a unique class of sequential decision making pr...
research
10/04/2022

Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control

Reinforcement learning (RL) has recently proven great success in various...
research
04/20/2022

Understanding and Preventing Capacity Loss in Reinforcement Learning

The reinforcement learning (RL) problem is rife with sources of non-stat...
research
02/09/2023

Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

High sample complexity has long been a challenge for RL. On the other ha...
research
03/19/2023

CLIP4MC: An RL-Friendly Vision-Language Model for Minecraft

One of the essential missions in the AI research community is to build a...
