Learning to Run with Actor-Critic Ensemble

12/25/2017
by Zhewei Huang, et al.

We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm. At inference time, our method uses a critic ensemble to select the best action from the proposals of multiple actors running in parallel. With a larger candidate set, the method can avoid actions that have fatal consequences while remaining deterministic. Using ACE, we won 2nd place in the NIPS '17 Learning to Run competition under the name "Megvii-hzwer".
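The inference step described above lends itself to a short sketch: each actor proposes an action for the current state, the critic ensemble scores every proposal, and the highest-scoring action is executed. Below is a minimal, hypothetical PyTorch sketch of that selection rule; the `actors` and `critics` lists and their call signatures are assumptions for illustration, not the authors' released code.

    # Hypothetical sketch of ACE-style inference, assuming `actors` is a
    # list of trained DDPG actor networks and `critics` a list of trained
    # critic networks (torch.nn.Module instances taking the shapes shown).
    import torch

    def ace_act(state, actors, critics):
        """Return the proposed action with the highest mean critic value."""
        with torch.no_grad():
            # Each actor proposes one deterministic action for the state.
            proposals = [actor(state) for actor in actors]
            best_action, best_q = None, float("-inf")
            for action in proposals:
                # Score the proposal with the critic ensemble (mean Q-value).
                q = torch.stack([critic(state, action) for critic in critics]).mean()
                if q.item() > best_q:
                    best_action, best_q = action, q.item()
        return best_action

Because every component is deterministic, the combined policy stays deterministic; the ensemble only enlarges the candidate set from which the critics can veto dangerous actions.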


Related research

10/24/2022 - AACHER: Assorted Actor-Critic Deep Reinforcement Learning with Hindsight Experience Replay
Actor learning and critic learning are two components of the outstanding...

07/23/2019 - Variance Reduction in Actor Critic Methods (ACM)
After presenting Actor Critic Methods (ACM), we show ACM are control var...

11/06/2018 - ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search
In this paper, we propose an actor ensemble algorithm, named ACE, for co...

02/26/2020 - When Do Drivers Concentrate? Attention-based Driver Behavior Modeling With Deep Reinforcement Learning
Driver distraction poses a significant risk to driving safety. Apart from spat...

09/29/2022 - Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning
Reinforcement learning is a promising paradigm for learning robot contro...

09/09/2019 - AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
The exploration mechanism used by a Deep Reinforcement Learning (RL) age...

06/16/2019 - ASAC: Active Sensing using Actor-Critic models
Deciding what and when to observe is critical when making observations i...
