Online Safety Assurance for Deep Reinforcement Learning

10/07/2020
by Noga H. Rotman, et al.

Recently, deep learning has been successfully applied to a variety of networking problems. A fundamental challenge is that when the operational environment of a learning-augmented system differs from its training environment, the system often makes poorly informed decisions, leading to poor performance. We argue that safely deploying learning-driven systems requires the ability to determine, in real time, whether system behavior is coherent, so that the system can default to a reasonable heuristic when it is not. We term this the online safety assurance problem (OSAP). We present three approaches to quantifying decision uncertainty, which differ in the signal used to infer uncertainty. We illustrate the usefulness of online safety assurance in the context of the proposed deep reinforcement learning (RL) approach to video streaming. While deep RL for video streaming outperforms other approaches when the operational and training environments match, it is dominated by simple heuristics when the two differ. Our preliminary findings suggest that transitioning to a default policy when decision uncertainty is detected is key to enjoying the performance benefits of ML without compromising on safety.
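
The fallback idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the three uncertainty signals, so the ensemble-disagreement signal used here is an assumption chosen purely for illustration, and every name (`ensemble_disagreement`, `make_safe_policy`, the toy policies and the threshold value) is hypothetical.

```python
from statistics import pstdev
from typing import Callable, Sequence

Observation = Sequence[float]
Policy = Callable[[Observation], int]


def ensemble_disagreement(
    members: Sequence[Callable[[Observation], float]], obs: Observation
) -> float:
    """Assumed uncertainty signal: spread of the ensemble's value estimates."""
    return pstdev([member(obs) for member in members])


def make_safe_policy(
    learned: Policy,
    default: Policy,
    uncertainty: Callable[[Observation], float],
    threshold: float,
) -> Policy:
    """Wrap a learned policy so it defers to a default heuristic when uncertain."""

    def act(obs: Observation) -> int:
        # Fall back to the heuristic whenever the signal exceeds the threshold.
        if uncertainty(obs) > threshold:
            return default(obs)
        return learned(obs)

    return act


# Toy stand-ins (hypothetical): actions are bitrate indices.
learned_policy: Policy = lambda obs: 3   # pretend RL agent picks a high bitrate
default_policy: Policy = lambda obs: 0   # conservative heuristic picks the lowest
# Ensemble members that agree on small inputs and diverge on large ones.
members = [lambda obs, k=k: k * obs[0] for k in (0.9, 1.0, 1.1)]

safe_policy = make_safe_policy(
    learned_policy,
    default_policy,
    lambda obs: ensemble_disagreement(members, obs),
    threshold=0.5,  # assumed value; in practice this would need calibration
)

in_distribution_action = safe_policy([1.0])       # low disagreement
out_of_distribution_action = safe_policy([10.0])  # high disagreement
```

In this toy setup the wrapper keeps the learned action while the ensemble members agree and switches to the heuristic once their estimates diverge. Any other uncertainty signal could be passed in as the `uncertainty` argument without changing the wrapper.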

