Ablation Study of How Run Time Assurance Impacts the Training and Performance of Reinforcement Learning Agents

07/08/2022
by Nathaniel Hamilton, et al.

Reinforcement Learning (RL) has become an increasingly important research area as the success of machine learning algorithms and methods grows. To address the safety concerns raised by the freedom given to RL agents while training, there has been an increase in work on Safe Reinforcement Learning (SRL). However, these new safe methods have received less scrutiny than their unsafe counterparts. For instance, comparisons among safe methods often lack fair evaluation across similar initial condition bounds and hyperparameter settings, use poor evaluation metrics, and cherry-pick the best training runs rather than averaging over multiple random seeds. In this work, we conduct an ablation study using evaluation best practices to investigate the impact of run time assurance (RTA), which monitors the system state and intervenes to assure safety, on effective learning. By studying multiple RTA approaches in both on-policy and off-policy RL algorithms, we seek to understand which RTA methods are most effective, whether the agents become dependent on the RTA, and the importance of reward shaping versus safe exploration in RL agent training. Our conclusions shed light on the most promising directions of SRL, and our evaluation methodology lays the groundwork for better comparisons in future SRL work.
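To make the idea of run time assurance concrete, the following is a minimal Python sketch of a filter that monitors the state and intervenes only when the agent's proposed action would lead outside a safe set. It is an illustration under stated assumptions, not the implementation studied in the paper: the class `SimplexRTA`, its `is_safe`, `predict`, and `backup` callables, the 1-D toy dynamics in `step`, and the bound of 1.0 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SimplexRTA:
    """Toy RTA filter: pass the agent's action through unless the
    predicted next state would leave the safe set."""

    is_safe: Callable[[float], bool]          # hypothetical safety predicate
    predict: Callable[[float, float], float]  # hypothetical one-step model
    backup: Callable[[float], float]          # hypothetical backup controller

    def filter(self, state: float, action: float) -> Tuple[float, bool]:
        """Return (action to apply, whether the RTA intervened)."""
        if self.is_safe(self.predict(state, action)):
            return action, False
        return self.backup(state), True


def step(x: float, u: float) -> float:
    """Assumed toy dynamics: 1-D position x, velocity command u, unit time step."""
    return x + u


rta = SimplexRTA(
    is_safe=lambda x: abs(x) <= 1.0,  # assumed safe set: |x| <= 1
    predict=step,
    backup=lambda x: -0.5 * x,        # assumed backup: move back toward the origin
)

# An action that keeps the state inside the safe set passes through unchanged.
safe_action, intervened_safe = rta.filter(0.0, 0.5)

# An action that would leave the safe set is replaced by the backup action.
unsafe_action, intervened_unsafe = rta.filter(0.8, 0.5)
```

Logging the intervention flag over training episodes is one simple way to examine whether an agent comes to rely on the RTA, which is one of the questions the study raises.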

research
05/29/2022

On the Robustness of Safe Reinforcement Learning under Observational Perturbations

Safe reinforcement learning (RL) trains a policy to maximize the task re...
research
02/26/2021

Safe Distributional Reinforcement Learning

Safety in reinforcement learning (RL) is a key property in both training...
research
03/24/2023

Safe and Sample-efficient Reinforcement Learning for Clustered Dynamic Environments

This study proposes a safe and sample-efficient reinforcement learning (...
research
09/10/2022

Safe Reinforcement Learning with Contrastive Risk Prediction

As safety violations can lead to severe consequences in real-world robot...
research
08/04/2021

Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations

Training-time safety violations have been a major concern when we deploy...
research
06/03/2021

Towards Learning to Play Piano with Dexterous Hands and Touch

The virtuoso plays the piano with passion, poetry and extraordinary tech...
research
06/02/2023

Hyperparameters in Reinforcement Learning and How To Tune Them

In order to improve reproducibility, deep reinforcement learning (RL) ha...
