Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

09/16/2021
by   Tianhe Yu, et al.

Offline reinforcement learning (RL) algorithms have shown promising results in domains where abundant pre-collected data is available. However, prior methods focus on solving individual problems from scratch with an offline dataset, without considering how an offline RL agent can acquire multiple skills. We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks, and utilize all of this data to learn behaviors for all the tasks more effectively, rather than training each task in isolation. However, naively sharing data across all tasks in multi-task offline RL performs surprisingly poorly in practice. Through thorough empirical analysis, we find that sharing data can actually exacerbate the distributional shift between the learned policy and the dataset, which in turn can lead to divergence of the learned policy and poor performance. To address this challenge, we develop a simple technique for data sharing in multi-task offline RL that routes data based on its improvement over the task-specific data. We call this approach conservative data sharing (CDS), and it can be applied with multiple single-task offline RL methods. On a range of challenging multi-task locomotion, navigation, and vision-based robotic manipulation problems, CDS achieves performance that is the best of, or comparable to, prior offline multi-task RL methods and data-sharing approaches.
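The routing rule described in the abstract can be sketched in a few lines: a transition relabeled from another task is admitted into a task's training buffer only if its estimated conservative Q-value clears a high quantile of the conservative Q-values of the task's own data. This is a minimal illustration, not the paper's implementation; the function name, the percentile hyperparameter, and the toy Q-values below are all assumptions for the sake of the example.

```python
import numpy as np


def conservative_data_share(q_task, q_shared, percentile=90):
    """Sketch of a CDS-style routing rule (hypothetical helper).

    Admit a relabeled transition from another task only when its
    conservative Q-value under the target task exceeds a high
    percentile of the conservative Q-values of the target task's
    own data, so shared data is unlikely to worsen distributional
    shift.
    """
    threshold = np.percentile(q_task, percentile)
    return q_shared >= threshold


# Toy usage with made-up Q-values: the task's own data has Q-values
# roughly standard-normal, so the 90th-percentile threshold is ~1.3.
rng = np.random.default_rng(0)
q_task = rng.normal(loc=0.0, scale=1.0, size=1000)
q_shared = np.array([-2.0, 0.5, 3.0])
mask = conservative_data_share(q_task, q_shared)
# Only the shared transition whose conservative Q-value clears the
# task-specific threshold (here, 3.0) would be admitted.
```

In practice the threshold is recomputed as the Q-function is trained, so which shared transitions are admitted changes over the course of learning.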


Related research

05/06/2022
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation
Reinforcement learning (RL) has been shown to be effective at learning c...

10/11/2022
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Recent progress in deep learning highlights the tremendous potential of ...

05/29/2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
Diffusion models have demonstrated highly-expressive generative capabili...

11/28/2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
The potential of offline reinforcement learning (RL) is that high-capaci...

02/26/2020
Generalized Hindsight for Reinforcement Learning
One of the key reasons for the high sample complexity in reinforcement l...

10/22/2020
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
Learning from diverse offline datasets is a promising path towards learn...

07/26/2022
Offline Reinforcement Learning at Multiple Frequencies
Leveraging many sources of offline robot data requires grappling with th...
