Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information

09/06/2019
by Yiren Zhao, et al.

Recent research on reinforcement learning has shown that trained agents are vulnerable to maliciously crafted adversarial samples. In this work, we show how adversarial samples against RL agents can be generalised from White-box and Grey-box attacks to a strong Black-box case, namely where the attacker has no knowledge of the agents and their training methods. First, we use sequence-to-sequence models to predict a single action or a sequence of future actions that a trained agent will make. Our approximation model, based on time-series information from the agent, successfully predicts agents' future actions with consistently above 80% accuracy. Second, we find that although such adversarial samples are transferable, they do not outperform random Gaussian noise as a means of reducing the game scores of trained RL agents. This highlights a serious methodological deficiency in previous work on such agents; random jamming should have been taken as the baseline for evaluation. Third, we do find a novel use for adversarial samples in this context: they can be used to trigger a trained agent to misbehave after a specific delay. This appears to be a genuinely new type of attack; it potentially enables an attacker to use devices controlled by RL agents as time bombs.
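To make the first contribution concrete, here is a minimal sketch of the action-approximation idea: a recurrent model consumes a window of the target agent's past observation/action pairs and predicts its next action, using only data an observer could collect by watching the agent play. The architecture, class names, and hyperparameters below are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: approximate a black-box RL agent's next action from its
# recent observation/action history (hypothetical architecture).
import torch
import torch.nn as nn

class ActionPredictor(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        # Each timestep's input: observation features + one-hot action.
        self.lstm = nn.LSTM(obs_dim + n_actions, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, seq):
        # seq: (batch, time, obs_dim + n_actions)
        out, _ = self.lstm(seq)
        # Predict the action that follows the last observed timestep.
        return self.head(out[:, -1])

def train_step(model, opt, histories, next_actions):
    # (history, next_action) pairs are harvested by observing the
    # target agent; no access to its weights or training method is needed.
    logits = model(histories)
    loss = nn.functional.cross_entropy(logits, next_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The same model can be unrolled autoregressively to predict a sequence of future actions rather than a single one, matching the seq2seq framing above.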
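The second finding implies a baseline that any attack evaluation should include: simply jamming the agent's observations with random Gaussian noise and measuring the score drop. A hypothetical measurement loop is sketched below, assuming the classic Gym step API (four-value return); the noise scale and function names are assumptions.

```python
# Baseline "attack": perturb the agent's observations with random
# Gaussian noise and record the episode score. Transferred adversarial
# samples should be compared against this, not against clean play alone.
import numpy as np

def noisy_episode(env, agent_policy, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        # Jam the observation channel; the agent itself is untouched.
        perturbed = obs + rng.normal(0.0, sigma, size=np.shape(obs))
        action = agent_policy(perturbed)
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward
```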


Related research

10/09/2021 · Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning
Due to the broad range of applications of reinforcement learning (RL), u...

09/05/2022 · White-Box Adversarial Policies in Deep Reinforcement Learning
Adversarial examples against AI systems pose both risks via malicious at...

10/18/2021 · Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training
Due to the proliferation of renewable energy and its intrinsic intermitt...

03/08/2017 · Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
We introduce two tactics to attack agents trained by deep reinforcement ...

04/01/2023 · Recover Triggered States: Protect Model Against Backdoor Attack in Reinforcement Learning
A backdoor attack allows a malicious user to manipulate the environment ...

02/25/2019 · Adversarial Reinforcement Learning under Partial Observability in Software-Defined Networking
Recent studies have demonstrated that reinforcement learning (RL) agents...

12/29/2019 · Learning Generalized Models by Interrogating Black-Box Autonomous Agents
This paper develops a new approach for estimating the internal model of ...
