Hyperparameters in Reinforcement Learning and How To Tune Them

06/02/2023
by Theresa Eimer, et al.

To improve reproducibility, deep reinforcement learning (RL) has been adopting better scientific practices such as standardized evaluation metrics and reporting. However, the process of hyperparameter optimization still varies widely across papers, which makes it challenging to compare RL algorithms fairly. In this paper, we show that hyperparameter choices in RL can significantly affect the agent's final performance and sample efficiency, and that the hyperparameter landscape can strongly depend on the tuning seed, which may lead to overfitting. We therefore propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds, as well as principled hyperparameter optimization (HPO) across a broad search space. We support this by comparing multiple state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts, demonstrating that HPO approaches often have higher performance and lower compute overhead. As a result of our findings, we recommend a set of best practices for the RL community, which should result in stronger empirical results at lower computational cost, better reproducibility, and thus faster progress. To encourage the adoption of these practices, we provide plug-and-play implementations of the tuning algorithms used in this paper at https://github.com/facebookresearch/how-to-autorl.

research
04/05/2023

AutoRL Hyperparameter Landscapes

Although Reinforcement Learning (RL) has shown to be capable of producin...
research
01/26/2022

Hyperparameter Tuning for Deep Reinforcement Learning Applications

Reinforcement learning (RL) applications, where an agent can simply lear...
research
09/03/2020

Sample-Efficient Automated Deep Reinforcement Learning

Despite significant progress in challenging problems across various doma...
research
07/22/2021

Accelerating Quadratic Optimization with Reinforcement Learning

First-order methods for quadratic optimization such as OSQP are widely u...
research
03/09/2023

A Framework for History-Aware Hyperparameter Optimisation in Reinforcement Learning

A Reinforcement Learning (RL) system depends on a set of initial conditi...
research
07/08/2022

Ablation Study of How Run Time Assurance Impacts the Training and Performance of Reinforcement Learning Agents

Reinforcement Learning (RL) has become an increasingly important researc...
research
06/19/2023

AdaStop: sequential testing for efficient and reliable comparisons of Deep RL Agents

The reproducibility of many experimental results in Deep Reinforcement L...
