Collaborative Evolutionary Reinforcement Learning

by Shauharda Khadka, et al.

Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle to explore effectively and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches explore using a noisy version of their operating policy, which limits the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners, typically proven algorithms like TD3, optimize over varying time horizons, leading to this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments in a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining overall more sample-efficient, notably solving the MuJoCo Humanoid benchmark where all of its composite learners (TD3) fail entirely in isolation.
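
The dynamic distribution of computational resources described above can be illustrated with a small sketch. The abstract does not name the allocation rule, so this example assumes an upper-confidence-bound (UCB) score as a stand-in; the function names `ucb_scores` and `allocate_rollouts`, the exploration constant `c`, and the assumption that learner values are normalized to [0, 1] are all hypothetical choices made for illustration.

```python
import math


def ucb_scores(values, counts, c=0.9):
    """Return one UCB score per learner.

    values: running estimate of each learner's return, assumed normalized to [0, 1].
    counts: number of rollouts each learner has received so far.
    c: exploration constant (an assumed value, not taken from the source).
    """
    total = sum(counts)
    scores = []
    for value, n in zip(values, counts):
        if n == 0:
            # A learner that has never been tried is always sampled first.
            scores.append(float("inf"))
        else:
            scores.append(value + c * math.sqrt(math.log(total) / n))
    return scores


def allocate_rollouts(values, counts, num_workers, c=0.9):
    """Greedily hand out num_workers rollout slots, one at a time, by UCB score."""
    counts = list(counts)  # copy so the caller's list is not mutated
    allocation = [0] * len(values)
    for _ in range(num_workers):
        scores = ucb_scores(values, counts, c)
        best = max(range(len(scores)), key=scores.__getitem__)
        allocation[best] += 1
        counts[best] += 1  # the exploration bonus shrinks as a learner is chosen
    return allocation


if __name__ == "__main__":
    # Three hypothetical learners, e.g. TD3 instances with different discount rates.
    print(allocate_rollouts(values=[0.2, 0.8, 0.5], counts=[10, 10, 10], num_workers=6))
```

Under this rule, learners with higher estimated value receive more rollout slots, while the count-based bonus keeps rarely tried learners from being starved entirely.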



