Optimal Attacks on Reinforcement Learning Policies

07/31/2019
by Alessio Russo, et al.

Control policies trained using Deep Reinforcement Learning have recently been shown to be vulnerable to adversarial attacks that introduce even very small perturbations to the policy input. The attacks proposed so far have been designed using heuristics and build on existing adversarial-example crafting techniques used to dupe classifiers in supervised learning. In contrast, this paper investigates the problem of devising optimal attacks with respect to a well-defined attacker objective, e.g., minimizing the main agent's average reward. When the policy, the system dynamics, and the rewards are known to the attacker, a scenario referred to as a white-box attack, designing an optimal attack amounts to solving a Markov Decision Process. For what we call black-box attacks, where neither the policy nor the system is known, optimal attacks can be trained using Reinforcement Learning techniques. Through numerical experiments, we demonstrate the efficiency of our attacks compared to existing attacks (usually based on gradient methods). We further quantify the potential impact of attacks and establish its connection to the smoothness of the policy under attack. Smooth policies are naturally less prone to attacks; this explains why policies that are Lipschitz with respect to the state are more resilient. Finally, we show that, from the main agent's perspective, the system uncertainties and the attacker can together be modeled as a Partially Observable Markov Decision Process. We demonstrate that using Reinforcement Learning techniques tailored to POMDPs (e.g., using Recurrent Neural Networks) leads to more resilient policies.
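To make the black-box formulation concrete, the attack-design problem can itself be cast as an RL environment: the attacker's action is a bounded perturbation of the victim's observation, and the attacker's reward is the negative of the victim's reward, matching the objective of minimizing the main agent's average reward. The sketch below is an illustrative reconstruction, not the paper's code; the gymnasium-style interface, the continuous (Box) observation space, the victim_policy callable, and the epsilon perturbation budget are all assumptions.

import gymnasium as gym
import numpy as np

class ObservationAttackEnv(gym.Env):
    """Casts the attacker's problem as an RL environment (black-box setting):
    the attacker perturbs the observation fed to a fixed victim policy and
    collects the negative of the victim's reward. Illustrative sketch only."""

    def __init__(self, env, victim_policy, epsilon=0.1):
        self.env = env                       # victim's environment (assumed Box observations)
        self.victim_policy = victim_policy   # callable: observation -> action (hypothetical)
        self.observation_space = env.observation_space
        # Attacker chooses a perturbation delta with ||delta||_inf <= epsilon.
        self.action_space = gym.spaces.Box(
            low=-epsilon, high=epsilon,
            shape=env.observation_space.shape, dtype=np.float32)
        self._obs = None

    def reset(self, **kwargs):
        self._obs, info = self.env.reset(**kwargs)
        return self._obs, info

    def step(self, delta):
        # The victim acts on the perturbed observation, clipped to the valid range.
        perturbed = np.clip(self._obs + delta,
                            self.observation_space.low,
                            self.observation_space.high)
        victim_action = self.victim_policy(perturbed)
        self._obs, reward, terminated, truncated, info = self.env.step(victim_action)
        # Attacker's objective: minimize the victim's cumulative reward.
        return self._obs, -reward, terminated, truncated, info

Any standard RL algorithm can then train an attacker policy against this wrapper; in the white-box case, where the dynamics and rewards are known, the same formulation becomes an explicit MDP that can be solved directly.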


