Dual Behavior Regularized Reinforcement Learning

09/19/2021
by   Chapman Siu, et al.

Reinforcement learning has been shown to perform a range of complex tasks through interaction with an environment or by leveraging collected experience. However, many of these approaches presume optimal or near-optimal experiences or the presence of a consistent environment. In this work we propose a dual, advantage-based behavior policy based on counterfactual regret minimization. We demonstrate the flexibility of this approach and how it can be adapted to online contexts, where the environment is available to collect experiences, and to a variety of other contexts. We demonstrate that this new algorithm can outperform several strong baseline models in different contexts across a range of continuous environments. Additional ablations provide insights into how our dual behavior regularized reinforcement learning approach is designed compared with other plausible modifications and demonstrate its ability to generalize.

research
11/26/2019

Behavior Regularized Offline Reinforcement Learning

In reinforcement learning (RL) research, it is common to assume access t...
research
12/14/2022

Explaining Agent's Decision-making in a Hierarchical Reinforcement Learning Scenario

Reinforcement learning is a machine learning approach based on behaviora...
research
10/31/2017

Regret Minimization for Partially Observable Deep Reinforcement Learning

Deep reinforcement learning algorithms that estimate state and state-act...
research
10/27/2021

Transfer learning with causal counterfactual reasoning in Decision Transformers

The ability to adapt to changes in environmental contingencies is an imp...
research
11/12/2020

Reinforcement Learning with Videos: Combining Offline Observations with Interaction

Reinforcement learning is a powerful framework for robots to acquire ski...
research
12/16/2020

Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation

Reinforcement learning (RL) algorithms usually require a substantial amo...
research
06/20/2023

Coevolution of cognition and cooperation in structured populations under reinforcement learning

We study the evolution of behavior under reinforcement learning in a Pri...
