Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing

02/12/2020
by   Ge Liu, et al.

Deep Reinforcement Learning (RL) has proven powerful for decision making in simulated environments. However, training a deep RL model is challenging in real-world applications such as production-scale health-care or recommender systems, because interaction is expensive and the deployment budget is limited. One source of this data inefficiency is the expensive hyper-parameter tuning required when optimizing deep neural networks. We propose Adaptive Behavior Policy Sharing (ABPS), a data-efficient training algorithm that allows sharing of experience collected by a behavior policy that is adaptively selected from a pool of agents trained with an ensemble of hyper-parameters. We further extend ABPS to evolve hyper-parameters during training by hybridizing ABPS with an adapted version of Population Based Training (ABPS-PBT). We conduct experiments on multiple Atari games with up to 16 hyper-parameter/architecture setups. ABPS achieves superior overall performance, reduced variance on the top 25% of agents, and equivalent performance on the best agent compared to conventional hyper-parameter tuning with independent training, even though ABPS requires only the same number of environmental interactions as training a single agent. We also show that ABPS-PBT further improves convergence speed and reduces variance.
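The core idea of sharing experience from one adaptively chosen behavior policy can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the selection rule, so the epsilon-greedy choice over a running-average reward score is an assumption, as are the toy three-armed bandit environment (`ARM_MEANS`), the tabular `ToyAgent`, and every function and parameter name below. The point it shows is only that a pool of agents with different hyper-parameters can all learn from one shared buffer while the environment is stepped once per iteration, as it would be for a single agent.

```python
import random
from collections import deque

# Hypothetical toy environment: a 3-armed Bernoulli bandit (an assumption,
# standing in for the Atari games used in the paper).
ARM_MEANS = [0.2, 0.5, 0.8]


class ToyAgent:
    """Tabular value learner; the learning rate is its only hyper-parameter."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.values = [0.0] * len(ARM_MEANS)

    def act(self, rng, explore=0.1):
        # Mostly greedy with respect to this agent's own value estimates.
        if rng.random() < explore:
            return rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, batch):
        # Off-policy update: the data may come from any agent in the pool.
        for action, reward in batch:
            self.values[action] += self.learning_rate * (reward - self.values[action])


def select_behavior_agent(scores, epsilon, rng):
    """Assumed selection rule: epsilon-greedy over per-agent scores."""
    if rng.random() < epsilon:
        return rng.randrange(len(scores))
    return max(range(len(scores)), key=lambda i: scores[i])


def train_shared(learning_rates, steps=2000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    agents = [ToyAgent(lr) for lr in learning_rates]
    scores = [0.0] * len(agents)   # running average reward while acting
    buffer = deque(maxlen=500)     # one replay buffer shared by the whole pool
    env_steps = 0
    for _ in range(steps):
        idx = select_behavior_agent(scores, epsilon, rng)
        action = agents[idx].act(rng)
        reward = 1.0 if rng.random() < ARM_MEANS[action] else 0.0
        env_steps += 1             # only the behavior agent touches the environment
        buffer.append((action, reward))
        scores[idx] += 0.05 * (reward - scores[idx])
        batch = rng.sample(list(buffer), min(8, len(buffer)))
        for agent in agents:       # every agent trains on the shared experience
            agent.update(batch)
    return agents, env_steps
```

Calling `train_shared([0.01, 0.05, 0.1, 0.5])` trains four agents while `env_steps` stays equal to `steps`, which mirrors the abstract's claim about interaction cost. Conventional independent tuning would instead step the environment separately for each configuration.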


