Implicit Offline Reinforcement Learning via Supervised Learning

10/21/2022
by Alexandre Piché, et al.

Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels. It is as simple as supervised learning and Behavior Cloning (BC), but it also takes advantage of return information. On datasets collected by policies of similar expertise, implicit BC has been shown to match or outperform explicit BC. Despite these benefits of implicit models for learning robotic skills via BC, offline RL via Supervised Learning algorithms have so far been limited to explicit models. We show how implicit models can leverage return information and match or outperform explicit algorithms when acquiring robotic skills from fixed datasets. Furthermore, we show the close relationship between our implicit methods and other popular RL via Supervised Learning algorithms, which provides a unified framework. Finally, we demonstrate the effectiveness of our method on high-dimensional manipulation and locomotion tasks.
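
To make "takes advantage of return information" concrete: in return-conditioned supervised learning, each transition in the dataset is typically labeled with the return that followed it, and the policy is then trained on (state, return, action) tuples. The sketch below shows only that labeling step. The helper name `returns_to_go`, the `gamma` parameter, and the toy rewards are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def returns_to_go(rewards, gamma=1.0):
    """Hypothetical helper: return-to-go at every step of one trajectory.

    rtg[t] = rewards[t] + gamma * rewards[t+1] + gamma^2 * rewards[t+2] + ...
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    rtg = np.zeros_like(rewards)
    running = 0.0
    # Accumulate from the last step backwards so each step sees its future.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

# Toy trajectory (made-up rewards, for illustration only).
rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(rewards)  # array([3., 2., 2.])
```

With these labels, an explicit model would map (state, return) directly to an action, whereas an implicit model would score (state, action, return) tuples and select a high-scoring action. How the paper's method does this in detail is not specified in the abstract.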


