Representation Matters: Offline Pretraining for Sequential Decision Making

02/11/2021
by Mengjiao Yang, et al.

The recent success of supervised learning methods on ever larger offline datasets has spurred interest in the reinforcement learning (RL) field to investigate whether the same paradigms can be translated to RL algorithms. This research area, known as offline RL, has largely focused on offline policy optimization, aiming to find a return-maximizing policy exclusively from offline data. In this paper, we consider a slightly different approach to incorporating offline data into sequential decision-making. We aim to answer the question: what unsupervised objectives, applied to offline datasets, learn state representations that improve performance on downstream tasks, whether those downstream tasks are online RL, imitation learning from expert demonstrations, or even offline policy optimization based on the same offline dataset? Through a variety of experiments utilizing standard offline RL datasets, we find that pretraining with unsupervised learning objectives can dramatically improve the performance of policy learning algorithms that otherwise yield mediocre performance on their own. Extensive ablations further provide insights into which components of these unsupervised objectives – e.g., reward prediction, continuous or discrete representations, pretraining or finetuning – are most important and in which settings.
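The two-stage recipe described above (pretrain a state representation on offline transitions, then reuse it for a downstream policy learner) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the linear encoder, the forward-prediction loss standing in for "an unsupervised objective", and the helper names `make_offline_data`, `pretrain_encoder`, and `behavioral_cloning`. The abstract does not specify the objectives or architectures the authors use.

```python
import numpy as np


def make_offline_data(rng, n=512, state_dim=6, action_dim=2, latent_dim=3):
    """Hypothetical offline dataset: next states depend on a low-dim latent of s."""
    w_true = rng.normal(size=(state_dim, latent_dim)) / np.sqrt(state_dim)
    c = rng.normal(size=(latent_dim, state_dim)) / np.sqrt(latent_dim)
    b = rng.normal(size=(action_dim, state_dim)) / np.sqrt(action_dim)
    s = rng.normal(size=(n, state_dim))
    a = rng.normal(size=(n, action_dim))
    s_next = s @ w_true @ c + a @ b + 0.01 * rng.normal(size=(n, state_dim))
    return s, a, s_next


def pretrain_encoder(s, a, s_next, latent_dim=3, lr=0.02, steps=500, seed=0):
    """Stage 1: fit a linear encoder w by predicting s_next from (s @ w, a).

    Forward prediction is an illustrative stand-in for an unsupervised
    representation objective; it is not claimed to be the paper's choice.
    """
    rng = np.random.default_rng(seed)
    n, state_dim = s.shape
    w = 0.1 * rng.normal(size=(state_dim, latent_dim))  # encoder weights
    d = 0.1 * rng.normal(size=(latent_dim + a.shape[1], state_dim))  # prediction head
    losses = []
    for _ in range(steps):
        z = s @ w                              # state representation
        x = np.concatenate([z, a], axis=1)     # representation plus action
        err = x @ d - s_next                   # next-state prediction error
        losses.append(float((err ** 2).sum() / n))
        # Manual gradients of the mean squared error, computed before updating.
        grad_d = 2.0 / n * x.T @ err
        grad_z = (2.0 / n * err @ d.T)[:, :latent_dim]
        grad_w = s.T @ grad_z
        d -= lr * grad_d
        w -= lr * grad_w
    return w, losses


def behavioral_cloning(w, demo_s, demo_a):
    """Stage 2: keep the encoder frozen and fit a linear policy by least squares."""
    z = demo_s @ w
    policy, *_ = np.linalg.lstsq(z, demo_a, rcond=None)
    return policy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s, a, s_next = make_offline_data(rng)
    w, losses = pretrain_encoder(s, a, s_next)
    # The first 32 transitions act as placeholder "demonstrations" here.
    policy = behavioral_cloning(w, s[:32], a[:32])
    print("pretraining loss: first", losses[0], "last", losses[-1])
    print("policy weight shape:", policy.shape)
```

The point of the design is only the separation of stages: the encoder is trained without any policy-learning signal, and the downstream learner sees states solely through the frozen representation. Swapping in a different objective (for example a contrastive one) would change `pretrain_encoder` and leave `behavioral_cloning` untouched.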


