Behavior Regularized Offline Reinforcement Learning

11/26/2019
by Yifan Wu, et al.

In reinforcement learning (RL) research, it is common to assume access to direct online interactions with the environment. However, in many real-world applications, access to the environment is limited to a fixed offline dataset of logged experience. In such settings, standard RL algorithms have been shown to diverge or otherwise yield poor performance. Accordingly, recent work has suggested a number of remedies to these issues. In this work, we introduce a general framework, behavior regularized actor critic (BRAC), to empirically evaluate recently proposed methods as well as a number of simple baselines across a variety of offline continuous control tasks. Surprisingly, we find that many of the technical complexities introduced in recent methods are unnecessary to achieve strong performance. Additional ablations provide insights into which design choices matter most in the offline RL setting.
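The abstract does not spell out the objective, so the following is only a minimal sketch of what "behavior regularized" can mean: the learned policy is scored by its expected Q-value minus a penalty for diverging from the behavior policy that generated the dataset. The discrete action space, the choice of KL divergence as the penalty, the weight `alpha`, and the function names are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # eps keeps log() finite when a probability is exactly zero.
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def regularized_objective(q_values, policy, behavior, alpha):
    """Expected Q under `policy` minus alpha * KL(policy || behavior).

    Hypothetical helper: a single-state, discrete-action illustration of
    trading off value against closeness to the behavior policy.
    """
    q_values = np.asarray(q_values, dtype=float)
    policy = np.asarray(policy, dtype=float)
    expected_q = float(np.dot(policy, q_values))
    return expected_q - alpha * kl_divergence(policy, behavior)


if __name__ == "__main__":
    q_values = [1.0, 0.0]    # estimated action values in one state
    behavior = [0.5, 0.5]    # action frequencies in the logged data
    greedy = [1.0, 0.0]      # policy that ignores the data distribution
    for alpha in (0.0, 2.0):
        print(alpha,
              regularized_objective(q_values, greedy, behavior, alpha),
              regularized_objective(q_values, behavior, behavior, alpha))
```

With `alpha = 0` the greedy policy scores highest, as in ordinary actor-critic training. With a large enough `alpha`, the penalty outweighs the gain in expected Q and staying close to the logged behavior scores higher.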


research
06/26/2020

Critic Regularized Regression

Offline reinforcement learning (RL), also known as batch RL, offers the ...
research
09/19/2021

Dual Behavior Regularized Reinforcement Learning

Reinforcement learning has been shown to perform a range of complex task...
research
07/21/2020

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Off-policy reinforcement learning (RL) holds the promise of sample-effic...
research
03/03/2023

Learning to Influence Human Behavior with Offline Reinforcement Learning

In the real world, some of the most complex settings for learned agents ...
research
10/26/2020

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Reinforcement learning (RL) has achieved impressive performance in a var...
research
07/05/2023

LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning

Currently, research on Reinforcement learning (RL) can be broadly classi...
research
06/16/2021

Offline RL Without Off-Policy Evaluation

Most prior approaches to offline reinforcement learning (RL) have taken ...
