SEERL: Sample Efficient Ensemble Reinforcement Learning

01/15/2020
by Rohan Saphal, et al.

Ensemble learning is a widely used technique in machine learning. The relative success of ensemble methods is attributed to their ability to tackle a wide range of instances and complex problems that require different low-level approaches. However, ensemble methods are less popular in reinforcement learning owing to the high sample complexity and computational expense involved. We present a new training and evaluation framework for model-free algorithms that use ensembles of policies obtained from a single training instance. These policies are diverse and are learned through directed perturbation of the model parameters at regular intervals. We show that learning an adequately diverse set of policies is required for a good ensemble, while extreme diversity can prove detrimental to overall performance. We evaluate our approach on challenging discrete and continuous control tasks and also discuss various ensembling strategies. Our framework is substantially sample efficient, computationally inexpensive, and is seen to outperform state-of-the-art (SOTA) scores in Atari 2600 and MuJoCo. Video results can be found at https://www.youtube.com/channel/UC95Kctu9Mp8BlFmtGD2TGTA
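
The abstract mentions ensembling strategies for discrete and continuous control without naming them. As a rough illustration only, the sketch below shows two generic ways a set of policies could be combined at evaluation time: majority voting over discrete actions and averaging over continuous action vectors. The function names, the policy-as-callable interface, and the tie-breaking rule are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions for this sketch, not from the paper):
# a discrete policy maps an observation to an action index,
# a continuous policy maps an observation to an action vector.
DiscretePolicy = Callable[[Sequence[float]], int]
ContinuousPolicy = Callable[[Sequence[float]], Sequence[float]]


def majority_vote(policies: Sequence[DiscretePolicy],
                  observation: Sequence[float]) -> int:
    """Return the action picked by the most policies.

    Ties are broken in favour of the lowest action index (an arbitrary
    choice made for this sketch).
    """
    if not policies:
        raise ValueError("the ensemble must contain at least one policy")
    votes = Counter(policy(observation) for policy in policies)
    best_count = max(votes.values())
    return min(a for a, count in votes.items() if count == best_count)


def average_action(policies: Sequence[ContinuousPolicy],
                   observation: Sequence[float]) -> List[float]:
    """Return the element-wise mean of the policies' action vectors."""
    if not policies:
        raise ValueError("the ensemble must contain at least one policy")
    actions = [policy(observation) for policy in policies]
    n = len(actions)
    # zip(*actions) groups the values of each action dimension together.
    return [sum(dim_values) / n for dim_values in zip(*actions)]


if __name__ == "__main__":
    obs = [0.0, 0.0]
    discrete = [lambda o: 1, lambda o: 2, lambda o: 1]
    continuous = [lambda o: [1.0, 0.0], lambda o: [3.0, 2.0]]
    print(majority_vote(discrete, obs))
    print(average_action(continuous, obs))
```

In this sketch, each policy would correspond to one member of the ensemble obtained during the single training run; how those members are produced and weighted in the actual framework is described in the full text, not here.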


