Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments

02/16/2021
by Amin Rakhsha, et al.

We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary manipulates the rewards to mislead a sequence of RL agents, each running an unknown algorithm, into learning a nefarious policy in an environment that is unknown to the adversary a priori. Our attack therefore makes minimal assumptions about the adversary's prior knowledge: the adversary has no initial knowledge of the environment or the learner, and it observes nothing of the learner's internal mechanism except the actions the learner performs. We design a novel black-box attack, U2, that provably achieves performance nearly matching that of the state-of-the-art white-box attack, demonstrating the feasibility of reward poisoning even in the most challenging black-box setting.
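
To make the threat model concrete, the following is a minimal sketch of reward poisoning in general, not the U2 attack from the paper. It assumes a hypothetical three-state toy environment, a tabular Q-learning learner, and a fixed penalty `margin`. The names `step`, `poison`, and `train_q_learning` are illustrative only. The attacker sees only the state, the chosen action, and the reward, and lowers the reward whenever the learner deviates from the target policy.

```python
import random

N_STATES, N_ACTIONS = 3, 2


def step(state, action):
    """Hypothetical toy environment: action 0 pays 1.0, action 1 pays 0.0."""
    reward = 1.0 if action == 0 else 0.0
    next_state = (state + 1 + action) % N_STATES
    return next_state, reward


def poison(state, action, reward, target_policy, margin=2.0):
    """Attacker sees only (state, action, reward); penalizes off-target actions.

    The fixed margin is an assumption chosen to exceed the reward gap
    of this toy environment; it is not a quantity taken from the paper.
    """
    if action == target_policy[state]:
        return reward
    return reward - margin


def greedy(q_row):
    """Index of the largest Q-value (first index wins ties)."""
    return max(range(N_ACTIONS), key=lambda a: q_row[a])


def train_q_learning(target_policy=None, steps=20000, alpha=0.1,
                     gamma=0.9, epsilon=0.1, seed=0):
    """Run epsilon-greedy tabular Q-learning; return the learned greedy policy.

    If target_policy is given, every reward passes through poison() first.
    """
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(N_ACTIONS)
        else:
            action = greedy(q[state])
        next_state, reward = step(state, action)
        if target_policy is not None:
            reward = poison(state, action, reward, target_policy)
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return [greedy(q[s]) for s in range(N_STATES)]


clean_policy = train_q_learning()
poisoned_policy = train_q_learning(target_policy=[1, 1, 1])
print("clean:   ", clean_policy)
print("poisoned:", poisoned_policy)
```

In this toy setting the clean learner settles on action 0 in every state, while the poisoned learner settles on the attacker's target action 1. A realistic black-box attack would also have to estimate the environment and limit the cost of its reward perturbations, both of which this sketch ignores.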

research
10/09/2021

Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning

Due to the broad range of applications of reinforcement learning (RL), u...
research
05/18/2023

Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning

We propose the first black-box targeted attack against online deep reinf...
research
09/05/2022

White-Box Adversarial Policies in Deep Reinforcement Learning

Adversarial examples against AI systems pose both risks via malicious at...
research
11/20/2022

Adversarial Cheap Talk

Adversarial attacks in reinforcement learning (RL) often assume highly-p...
research
05/21/2023

BertRLFuzzer: A BERT and Reinforcement Learning based Fuzzer

We present a novel tool BertRLFuzzer, a BERT and Reinforcement Learning ...
research
12/30/2021

Self Reward Design with Fine-grained Interpretability

Transparency and fairness issues in Deep Reinforcement Learning may stem...
research
11/10/2019

Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy

Recent studies have revealed that neural network-based policies can be e...