Accelerating Online Reinforcement Learning with Offline Datasets

06/16/2020
by   Ashvin Nair, et al.
0

Reinforcement learning provides an appealing formalism for learning control policies from experience. However, the classic active formulation of reinforcement learning necessitates a lengthy active exploration process for each behavior, making it difficult to apply in real-world settings. If we can instead allow reinforcement learning to effectively use previously collected data to aid the online learning process, where the data could be expert demonstrations or more generally any prior experience, we could make reinforcement learning a substantially more practical tool. While a number of recent methods have sought to learn offline from previously collected data, it remains exceptionally difficult to train a policy with offline data and improve it further with online reinforcement learning. In this paper we systematically analyze why this problem is so challenging, and propose a novel algorithm that combines sample-efficient dynamic programming with maximum likelihood policy updates, providing a simple and effective framework that is able to leverage large amounts of offline data and then quickly perform online fine-tuning of reinforcement learning policies. We show that our method enables rapid learning of skills with a combination of prior demonstration data and online experience across a suite of difficult dexterous manipulation and benchmark tasks.

READ FULL TEXT
research
07/28/2023

Benchmarking Offline Reinforcement Learning on Real-Robot Hardware

Learning policies from previously recorded data is a promising direction...
research
11/12/2020

Reinforcement Learning with Videos: Combining Offline Observations with Interaction

Reinforcement learning is a powerful framework for robots to acquire ski...
research
06/08/2016

Safe and Efficient Off-Policy Reinforcement Learning

In this work, we take a fresh look at some old and new algorithms for of...
research
10/10/2021

A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets

Recent Offline Reinforcement Learning methods have succeeded in learning...
research
07/10/2023

Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

In some applications of reinforcement learning, a dataset of pre-collect...
research
05/11/2018

Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration

Reinforcement learning has enjoyed multiple successes in recent years. H...
research
03/10/2021

S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning

Offline reinforcement learning proposes to learn policies from large col...

Please sign up or login with your details

Forgot password? Click here to reset