Reinforcement Learning under Threats

09/05/2018
by Víctor Gallego, et al.

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward-generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), a framework that supports a decision maker against a potential adversary in RL. Furthermore, we propose a level-k thinking scheme, resulting in a new learning framework for dealing with TMDPs. After introducing our framework and deriving theoretical results, we provide empirical evidence through extensive experiments, showing the benefits of accounting for adversaries while the agent learns.
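
As a rough illustration of what accounting for an adversary can look like in practice, below is a minimal sketch of an opponent-aware Q-learner in the spirit of level-k thinking: a level-1 agent best-responds to a modelled level-0 opponent. All class, method, and parameter names are assumptions for illustration, not the authors' actual algorithm.

```python
import numpy as np

# Hypothetical sketch of a level-1 agent in a threatened MDP.
# The Q-table is indexed by (state, own action, opponent action),
# and the opponent is modelled with Dirichlet-style counts per state.
class Level1TMDPAgent:
    def __init__(self, n_states, n_actions, n_opp_actions,
                 alpha=0.1, gamma=0.95, epsilon=0.1):
        self.Q = np.zeros((n_states, n_actions, n_opp_actions))
        # Count-based model of a (presumed stationary) level-0 opponent.
        self.opp_counts = np.ones((n_states, n_opp_actions))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions

    def opponent_probs(self, s):
        # Posterior-mean estimate of the opponent's action distribution.
        c = self.opp_counts[s]
        return c / c.sum()

    def act(self, s):
        # Epsilon-greedy over the expected Q-value under the opponent model.
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        expected_q = self.Q[s] @ self.opponent_probs(s)
        return int(np.argmax(expected_q))

    def update(self, s, a, b, r, s_next):
        # b is the opponent action observed in this transition.
        self.opp_counts[s, b] += 1
        next_expected_q = self.Q[s_next] @ self.opponent_probs(s_next)
        target = r + self.gamma * next_expected_q.max()
        self.Q[s, a, b] += self.alpha * (target - self.Q[s, a, b])
```

Higher levels of the thinking scheme would nest this reasoning: a level-2 agent would model its opponent as the level-1 learner above, and so on.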
