Bayesian Generational Population-Based Training

07/19/2022
by Xingchen Wan, et al.

Reinforcement learning (RL) offers the potential for training generally capable agents that can interact autonomously in the real world. However, one key limitation is the brittleness of RL algorithms to core hyperparameters and network architecture choice. Furthermore, non-stationarities such as evolving training data and increased agent complexity mean that different hyperparameters and architectures may be optimal at different points of training. This motivates AutoRL, a class of methods seeking to automate these design choices. One prominent class of AutoRL methods is Population-Based Training (PBT), which has led to impressive performance in several large-scale settings. In this paper, we introduce two innovations in PBT-style methods. First, we employ trust-region-based Bayesian Optimization, enabling full coverage of the high-dimensional mixed hyperparameter search space. Second, we show that using a generational approach, we can also learn both architectures and hyperparameters jointly on the fly in a single training run. Leveraging the new highly parallelizable Brax physics engine, we show that these innovations lead to large performance gains, significantly outperforming the tuned baseline while learning entire configurations on the fly. Code is available at https://github.com/xingchenwan/bgpbt.
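For readers unfamiliar with PBT, the following is a minimal sketch of the standard exploit/explore loop that PBT-style methods build on. It is not the method proposed in the paper: the explore step here uses random multiplicative perturbation rather than trust-region Bayesian Optimization, and there is no architecture search or generational distillation. The toy objective and all names (`pbt_step`, `train_interval`, `run_pbt`, the `lr` hyperparameter) are assumptions chosen for illustration and do not come from the linked repository.

```python
import copy
import random


def pbt_step(population, rng, quantile=0.25, factors=(0.8, 1.25)):
    """One exploit/explore step of vanilla PBT (illustrative only).

    Each member is a dict with keys "weights", "hparams", and "score".
    Members in the bottom quantile copy a randomly chosen top-quantile
    member (exploit), then perturb the copied hyperparameters (explore).
    """
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    k = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:k], ranked[-k:]
    for member in bottom:
        parent = rng.choice(top)
        # Exploit: inherit the parent's weights and its score.
        member["weights"] = copy.deepcopy(parent["weights"])
        member["score"] = parent["score"]
        # Explore: random perturbation (the paper uses Bayesian Optimization here).
        member["hparams"] = {
            name: value * rng.choice(factors)
            for name, value in parent["hparams"].items()
        }
    return population


def train_interval(member, steps=10):
    """Toy stand-in for RL training: gradient descent on f(w) = w**2."""
    w = member["weights"]["w"]
    lr = member["hparams"]["lr"]
    for _ in range(steps):
        w -= lr * 2.0 * w  # gradient of w**2 is 2w
    member["weights"]["w"] = w
    member["score"] = -w * w  # higher is better


def run_pbt(pop_size=8, intervals=5, seed=0):
    """Train a population, applying a PBT step after every interval."""
    rng = random.Random(seed)
    population = [
        {
            "weights": {"w": 5.0},
            "hparams": {"lr": 10 ** rng.uniform(-4, -1)},
            "score": float("-inf"),
        }
        for _ in range(pop_size)
    ]
    for _ in range(intervals):
        for member in population:
            train_interval(member)
        pbt_step(population, rng)
    return population
```

Calling `run_pbt()` returns the final population; the member with the highest `score` holds the weights and the learning rate that performed best on the toy objective. Because weak members repeatedly inherit weights from strong ones, the effective learning rate of any lineage changes over the course of the run, which is the sense in which PBT produces a hyperparameter schedule rather than a single fixed setting.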


research
06/30/2021

Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Despite a series of recent successes in reinforcement learning (RL), man...
research
12/21/2022

Hyperparameters in Contextual RL are Highly Situational

Although Reinforcement Learning (RL) has shown impressive results in gam...
research
02/06/2020

One-Shot Bayes Opt with Probabilistic Population Based Training

Selecting optimal hyperparameters is a key challenge in machine learning...
research
04/05/2023

AutoRL Hyperparameter Landscapes

Although Reinforcement Learning (RL) has shown to be capable of producin...
research
01/26/2022

Hyperparameter Tuning for Deep Reinforcement Learning Applications

Reinforcement learning (RL) applications, where an agent can simply lear...
research
06/18/2019

Towards White-box Benchmarks for Algorithm Control

The performance of many algorithms in the fields of hard combinatorial p...
research
07/25/2019

Optuna: A Next-generation Hyperparameter Optimization Framework

The purpose of this study is to introduce new design-criteria for next-g...
