Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

02/25/2020
by Benjamin Eysenbach, et al.

Multi-task reinforcement learning (RL) aims to simultaneously learn policies for solving many tasks. Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency. Relabeling methods typically ask: if, in hindsight, we assume that our experience was optimal for some task, for what task was it optimal? In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem with RL algorithms to efficiently solve many tasks. We use this idea to generalize goal-relabeling techniques from prior work to arbitrary classes of tasks. Our experiments confirm that relabeling data using inverse RL accelerates learning in general multi-task settings, including goal-reaching, domains with discrete sets of rewards, and those with linear reward functions.
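
The relabeling question posed in the abstract ("for what task was this experience optimal?") can be illustrated as a posterior over a discrete set of candidate tasks. The following is only a minimal sketch, assuming a maximum-entropy model in which a trajectory's likelihood under a task is proportional to its exponentiated return, with a per-task normalizer estimated from the batch itself. The function name `relabel_posterior`, the return matrix layout, and the batch-based normalizer are illustrative assumptions, not details stated in the abstract.

```python
import numpy as np

def relabel_posterior(returns, log_prior=None):
    """Hypothetical helper: distribution over tasks for each trajectory.

    returns[i, j] is the summed reward of trajectory i under candidate task j.
    Assumes q(task | traj) is proportional to prior(task) * exp(return - log Z(task)).
    """
    returns = np.asarray(returns, dtype=float)
    n_traj, n_tasks = returns.shape
    if log_prior is None:
        # Assumed uniform prior over tasks.
        log_prior = np.full(n_tasks, -np.log(n_tasks))

    # Per-task log-normalizer, estimated as a log-mean-exp over the batch.
    # Without it, tasks that give large rewards everywhere would absorb all data.
    col_max = returns.max(axis=0)
    log_z = col_max + np.log(np.exp(returns - col_max).mean(axis=0))

    # Normalize each row with a numerically stable softmax.
    logits = returns - log_z + log_prior
    logits -= logits.max(axis=1, keepdims=True)
    q = np.exp(logits)
    return q / q.sum(axis=1, keepdims=True)

# Task 0 pays a high reward to every trajectory; task 1 pays little overall
# but distinguishes the second trajectory from the first.
q = relabel_posterior([[10.0, 0.0], [9.0, 1.0]])
labels = q.argmax(axis=1)  # most likely task per trajectory
```

In this sketch the normalizer is what keeps the uniformly generous task from claiming both trajectories: each trajectory is assigned to the task under which it did well relative to the rest of the batch. A multi-task learner could then sample new task labels from each row of `q` before performing its usual off-policy update.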


research
10/30/2019

Policy Continuation with Hindsight Inverse Dynamics

Solving goal-oriented tasks is an important but challenging problem in r...
research
11/29/2015

Robotic Search & Rescue via Online Multi-task Reinforcement Learning

Reinforcement learning (RL) is a general and well-known method that a ro...
research
02/26/2020

Generalized Hindsight for Reinforcement Learning

One of the key reasons for the high sample complexity in reinforcement l...
research
01/04/2020

Hierarchical Reinforcement Learning as a Model of Human Task Interleaving

How do people decide how long to continue in a task, when to switch, and...
research
04/27/2020

Maximum Entropy Multi-Task Inverse RL

Multi-task IRL allows for the possibility that the expert could be switc...
research
07/17/2021

Hierarchical Reinforcement Learning with Optimal Level Synchronization based on a Deep Generative Model

The high-dimensional or sparse reward task of a reinforcement learning (...
research
07/01/2019

On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning

Transfer learning across different reinforcement learning (RL) tasks is ...
