Reward Delay Attacks on Deep Reinforcement Learning

09/08/2022
by Anindya Sarkar, et al.

Most reinforcement learning algorithms implicitly assume strong synchrony. We present novel attacks targeting Q-learning that exploit a vulnerability entailed by this assumption by delaying the reward signal for a limited time period. We consider two types of attack goals: targeted attacks, which aim to cause a target policy to be learned, and untargeted attacks, which simply aim to induce a policy with a low reward. We evaluate the efficacy of the proposed attacks through a series of experiments. Our first observation is that reward-delay attacks are extremely effective when the goal is simply to minimize reward. Indeed, we find that even naive baseline reward-delay attacks are highly successful in minimizing the reward. Targeted attacks, on the other hand, are more challenging, although we nevertheless demonstrate that the proposed approaches remain highly effective at achieving the attacker's targets. In addition, we introduce a second threat model that captures a minimal mitigation ensuring that rewards cannot be used out of sequence. We find that this mitigation remains insufficient to ensure robustness to attacks that delay rewards but preserve their order.
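To make the synchrony assumption concrete, the following is a minimal sketch of how a delayed reward can be credited to the wrong action in a tabular Q-learning update. It is not the attack proposed in the paper: it assumes a naive attacker that holds every reward back by a fixed number of steps, releases rewards in their original order, and substitutes a filler value of 0.0 while its buffer fills. The names `FixedRewardDelay`, `intercept`, and `run` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque


class FixedRewardDelay:
    """Hypothetical attacker: releases each reward `delay` steps late, in order."""

    def __init__(self, delay, filler=0.0):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.delay = delay
        self.filler = filler  # assumed value shown to the learner while rewards are withheld
        self.buffer = deque()

    def intercept(self, reward):
        # Queue the true reward; release the oldest one once `delay` rewards are held back.
        self.buffer.append(reward)
        if len(self.buffer) > self.delay:
            return self.buffer.popleft()
        return self.filler


def run(transitions, delay, alpha=0.5, gamma=0.0):
    """Tabular Q-learning over a fixed list of (state, action, reward, next_state)."""
    attacker = FixedRewardDelay(delay)
    q = defaultdict(lambda: defaultdict(float))
    for state, action, reward, next_state in transitions:
        observed = attacker.intercept(reward)  # the learner only sees the delayed reward
        best_next = max(q[next_state].values(), default=0.0)
        q[state][action] += alpha * (observed + gamma * best_next - q[state][action])
    return q


# Only the "right" action is truly rewarded.
transitions = [
    ("s0", "left", 0.0, "s0"),
    ("s0", "right", 1.0, "s0"),
    ("s0", "left", 0.0, "s0"),
]
clean = run(transitions, delay=0)
attacked = run(transitions, delay=1)
print(dict(clean["s0"]), dict(attacked["s0"]))
```

With no delay, the update credits the reward to "right"; with a one-step delay, the same reward arrives alongside the following "left" action and is credited to it instead. The order of rewards is never changed, which is the property the second threat model in the abstract preserves.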


