Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance

11/17/2021
by Yanqiu Wu, et al.

Truncated Quantile Critics (TQC), which uses a distributional representation of critics, was recently shown to provide state-of-the-art asymptotic training performance on all environments in the MuJoCo continuous control benchmark suite. Also recently, Randomized Ensembled Double Q-Learning (REDQ), which uses a high update-to-data ratio and target randomization, was shown to achieve sample efficiency competitive with state-of-the-art model-based methods. In this paper, we propose a novel model-free algorithm, Aggressive Q-Learning with Ensembles (AQE), which improves on the sample efficiency of REDQ and the asymptotic performance of TQC, thereby providing overall state-of-the-art performance during all stages of training. Moreover, AQE is very simple, requiring neither a distributional representation of critics nor target randomization.


