Continuous Control With Ensemble Deep Deterministic Policy Gradients

11/30/2021
by Piotr Januszewski, et al.

The growth of deep reinforcement learning (RL) has brought multiple exciting tools and methods to the field. This rapid expansion makes it important to understand the interplay between individual elements of the RL toolbox. We approach this task from an empirical perspective by conducting a study in the continuous control setting. We present several fundamental insights, including: averaging multiple actors trained on the same data boosts performance; the existing methods are unstable across training runs, epochs of training, and evaluation runs; the commonly used additive action noise is not required for effective training; a strategy based on posterior sampling explores better than the approximated UCB combined with the weighted Bellman backup; the weighted Bellman backup alone cannot replace clipped double Q-learning; and the critics' initialization plays the major role in ensemble-based actor-critic exploration. In conclusion, we show how existing tools can be brought together in a novel way, giving rise to the Ensemble Deep Deterministic Policy Gradients (ED2) method, which yields state-of-the-art results on continuous control tasks from OpenAI Gym MuJoCo. On the practical side, ED2 is conceptually straightforward, easy to code, and does not require knowledge outside of the existing RL toolbox.
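Two of the ideas named in the abstract, averaging the actions of several actors and exploring with a single sampled actor instead of additive action noise, can be illustrated with a minimal sketch. This is not the authors' implementation: the `LinearActor` class, the ensemble size, the uniform per-episode sampling rule, and all function names are hypothetical stand-ins, and the abstract does not specify how ED2 realizes these steps.

```python
import numpy as np


class LinearActor:
    """Hypothetical stand-in for a trained deterministic policy network."""

    def __init__(self, obs_dim, act_dim, rng):
        # Random weights play the role of independently initialized actors.
        self.weights = rng.normal(scale=0.1, size=(obs_dim, act_dim))

    def act(self, obs):
        # tanh keeps every action component inside [-1, 1].
        return np.tanh(obs @ self.weights)


def make_ensemble(n_actors, obs_dim, act_dim, seed=0):
    """Build an ensemble of independently initialized actors."""
    rng = np.random.default_rng(seed)
    return [LinearActor(obs_dim, act_dim, rng) for _ in range(n_actors)]


def evaluation_action(ensemble, obs):
    """Average the deterministic actions proposed by all actors."""
    return np.mean([actor.act(obs) for actor in ensemble], axis=0)


def sample_exploration_actor(ensemble, rng):
    """Assumed posterior-sampling-style rule: pick one actor uniformly at
    random (for example, once per episode) and follow it without noise."""
    return ensemble[rng.integers(len(ensemble))]


ensemble = make_ensemble(n_actors=5, obs_dim=3, act_dim=2)
obs = np.ones(3)

avg_action = evaluation_action(ensemble, obs)
explorer = sample_exploration_actor(ensemble, np.random.default_rng(1))
explore_action = explorer.act(obs)
```

In this sketch, diversity between actors comes only from their different initial weights, so the sampled actor's disagreement with the ensemble mean is what drives exploration.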

research
10/19/2020

Softmax Deep Double Deterministic Policy Gradients

A widely-used actor-critic reinforcement learning algorithm for continuo...
research
07/09/2020

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

Model-free deep reinforcement learning (RL) has been successful in a ran...
research
07/01/2019

FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

In recent years significant progress has been made in dealing with chall...
research
04/18/2023

Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints

This study presents a benchmark for evaluating action-constrained reinfo...
research
09/29/2022

Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning

Reinforcement learning is a promising paradigm for learning robot contro...
research
08/01/2022

Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning

Compared to on-policy policy gradient techniques, off-policy model-free ...
research
11/06/2018

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search

In this paper, we propose an actor ensemble algorithm, named ACE, for co...
