Simple random search provides a competitive approach to reinforcement learning

03/19/2018
by Horia Mania, et al.

A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks. Our method also finds a nearly optimal controller for a challenging instance of the Linear Quadratic Regulator, a classical problem in control theory, when the dynamics are not known. Computationally, our random search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks. We take advantage of this computational efficiency to evaluate the performance of our method over hundreds of random seeds and many different hyperparameter configurations for each benchmark task. Our simulations highlight a high variability in performance in these benchmark tasks, suggesting that commonly used estimations of sample efficiency do not adequately evaluate the performance of RL algorithms.
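To make the idea of random search in the parameter space of a static, linear policy concrete, here is a minimal sketch. It is not the authors' algorithm, which the abstract does not spell out: it is a generic random search that perturbs the policy matrix in paired opposite directions and scales the step by the standard deviation of the collected rewards. The toy double-integrator system, the quadratic cost, the function names (`rollout`, `random_search`), and all hyperparameter values are assumptions chosen for illustration; the dynamics are used only to simulate rollouts, never to compute the update.

```python
import numpy as np


def rollout(M, A, B, x0, horizon=50):
    """Total reward (negative quadratic cost) of the linear policy u = M x."""
    x = x0.copy()
    total = 0.0
    for _ in range(horizon):
        u = M @ x
        total -= float(x @ x + u @ u)  # assumed cost: |x|^2 + |u|^2
        x = A @ x + B @ u              # simulator step; the search never inspects A or B
    return total


def random_search(A, B, x0, iters=200, n_dirs=8, step=0.02, noise=0.03, seed=0):
    """Basic random search over the entries of the policy matrix M."""
    rng = np.random.default_rng(seed)
    M = np.zeros((B.shape[1], A.shape[0]))  # start from the zero policy
    for _ in range(iters):
        update = np.zeros_like(M)
        rewards = []
        for _ in range(n_dirs):
            delta = rng.standard_normal(M.shape)
            # Evaluate the policy perturbed in two opposite directions.
            r_plus = rollout(M + noise * delta, A, B, x0)
            r_minus = rollout(M - noise * delta, A, B, x0)
            update += (r_plus - r_minus) * delta
            rewards += [r_plus, r_minus]
        # Scale by the reward spread so the step size is insensitive to reward scale.
        sigma = np.std(rewards) + 1e-8
        M += step / (n_dirs * sigma) * update
    return M


# Hypothetical toy problem: a discretized double integrator (position, velocity).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
x0 = np.array([1.0, 0.0])

M = random_search(A, B, x0)
print("reward of zero policy:   ", rollout(np.zeros((1, 2)), A, B, x0))
print("reward of learned policy:", rollout(M, A, B, x0))
```

The search only ever queries total rollout rewards, which is what makes it model-free; each iteration costs `2 * n_dirs` rollouts, so sample efficiency can be read off directly from the number of iterations.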


research
11/28/2019

Augmented Random Search for Quadcopter Control: An alternative to Reinforcement Learning

Model-based reinforcement learning strategies are believed to exhibit mo...
research
01/15/2020

SEERL: Sample Efficient Ensemble Reinforcement Learning

Ensemble learning is a very prevalent method employed in machine learnin...
research
12/26/2019

Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem

Model-free reinforcement learning attempts to find an optimal control ac...
research
03/02/2016

Continuous Deep Q-Learning with Model-based Acceleration

Model-free reinforcement learning has been successfully applied to a ran...
research
06/15/2023

Simplified Temporal Consistency Reinforcement Learning

Reinforcement learning is able to solve complex sequential decision-maki...
research
02/23/2023

Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs

Calibrating agent-based models (ABMs) in economics and finance typically...
research
07/27/2022

PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations

Evolution Strategy (ES) algorithms have shown promising results in train...
