Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions

03/30/2023
by   Yicheng Luo, et al.

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we ideally finetune agents obtained from offline RL training? While offline RL algorithms can in principle be used for finetuning, in practice, their online performance improves slowly. In contrast, we show that it is possible to use standard online off-policy algorithms for faster improvement. However, we find this approach may suffer from policy collapse, where the policy undergoes severe performance deterioration during initial online learning. We investigate the issue of policy collapse and how it relates to data diversity, algorithm choices and online replay distribution. Based on these insights, we propose a conservative policy optimization procedure that can achieve stable and sample-efficient online learning from offline pretraining.

research
07/01/2021

Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble

Recent advance in deep offline reinforcement learning (RL) has made it p...
research
05/25/2023

PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning

Offline-to-online reinforcement learning (RL), by combining the benefits...
research
09/25/2022

On the Opportunities and Challenges of using Animals Videos in Reinforcement Learning

We investigate the use of animals videos to improve efficiency and perfo...
research
01/25/2022

MOORe: Model-based Offline-to-Online Reinforcement Learning

With the success of offline reinforcement learning (RL), offline trained...
research
04/18/2023

Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments

One of the key challenges of Reinforcement Learning (RL) is the ability ...
research
12/15/2022

Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies

Reinforcement learning (RL) has shown great promise with algorithms lear...
research
02/02/2023

Policy Expansion for Bridging Offline-to-Online Reinforcement Learning

Pre-training with offline data and online fine-tuning using reinforcemen...
