Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning

09/29/2022
by   Renata Garcia, et al.
0

Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model. However, even state of the art algorithms can be difficult to tune for optimum performance. We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best performing set(s) on-line. In the literature, the ensemble technique is used to improve performance in general, but the current work specifically addresses decreasing the hyperparameter tuning effort. Furthermore, our approach targets on-line learning on a single robotic system, and does not require running multiple simulators in parallel. Although the idea is generic, the Deep Deterministic Policy Gradient was the model chosen, being a representative deep learning actor-critic method with good performance in continuous action settings but known high variance. We compare our online weighted q-ensemble approach to q-average ensemble strategies addressed in literature using alternate policy training, as well as online training, demonstrating the advantage of the new approach in eliminating hyperparameter tuning. The applicability to real-world systems was validated in common robotic benchmark environments: the bipedal robot half cheetah and the swimmer. Online Weighted Q-Ensemble presented overall lower variance and superior results when compared with q-average ensembles using randomized parameterizations.

READ FULL TEXT

page 8

page 10

research
12/26/2018

Learning to Walk via Deep Reinforcement Learning

Deep reinforcement learning suggests the promise of fully automated lear...
research
12/25/2017

Learning to Run with Actor-Critic Ensemble

We introduce an Actor-Critic Ensemble(ACE) method for improving the perf...
research
09/26/2022

DEFT: Diverse Ensembles for Fast Transfer in Reinforcement Learning

Deep ensembles have been shown to extend the positive effect seen in typ...
research
11/30/2021

Continuous Control With Ensemble Deep Deterministic Policy Gradients

The growth of deep reinforcement learning (RL) has brought multiple exci...
research
05/21/2014

Off-Policy Shaping Ensembles in Reinforcement Learning

Recent advances of gradient temporal-difference methods allow to learn o...
research
03/26/2023

Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning

Offline reinforcement learning agents seek optimal policies from fixed d...
research
01/15/2020

SEERL: Sample Efficient Ensemble Reinforcement Learning

Ensemble learning is a very prevalent method employed in machine learnin...

Please sign up or login with your details

Forgot password? Click here to reset