IMAP: Intrinsically Motivated Adversarial Policy

05/04/2023
by   Xiang Zheng, et al.
0

Reinforcement learning (RL) agents are known to be vulnerable to evasion attacks during deployment. In single-agent environments, attackers can inject imperceptible perturbations on the policy or value network's inputs or outputs; in multi-agent environments, attackers can control an adversarial opponent to indirectly influence the victim's observation. Adversarial policies offer a promising solution to craft such attacks. Still, current approaches either require perfect or partial knowledge of the victim policy or suffer from sample inefficiency due to the sparsity of task-related rewards. To overcome these limitations, we propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box evasion attacks in single- and multi-agent environments without any knowledge of the victim policy. IMAP uses four intrinsic objectives based on state coverage, policy coverage, risk, and policy divergence to encourage exploration and discover stronger attacking skills. We also design a novel Bias-Reduction (BR) method to boost IMAP further. Our experiments demonstrate the effectiveness of these intrinsic objectives and BR in improving adversarial policy learning in the black-box setting against multiple types of victim agents in various single- and multi-agent MuJoCo environments. Notably, our IMAP reduces the performance of the state-of-the-art robust WocaR-PPO agents by 34%-54% and achieves a SOTA attacking success rate of 83.91% in the two-player zero-sum game YouShallNotPass.

READ FULL TEXT

page 2

page 8

research
05/19/2022

Sparse Adversarial Attack in Multi-agent Reinforcement Learning

Cooperative multi-agent reinforcement learning (cMARL) has many real app...
research
07/12/2021

Explore and Control with Adversarial Surprise

Reinforcement learning (RL) provides a framework for learning goal-direc...
research
03/04/2022

AutoDIME: Automatic Design of Interesting Multi-Agent Environments

Designing a distribution of environments in which RL agents can learn in...
research
08/09/2021

Mis-spoke or mis-lead: Achieving Robustness in Multi-Agent Communicative Reinforcement Learning

Recent studies in multi-agent communicative reinforcement learning (MACR...
research
11/11/2022

Efficient Domain Coverage for Vehicles with Second Order Dynamics via Multi-Agent Reinforcement Learning

Collaborative autonomous multi-agent systems covering a specified area h...
research
04/04/2022

Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization

We present Reward-Switching Policy Optimization (RSPO), a paradigm to di...
research
05/10/2023

Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers

Cooperative multi-agent reinforcement learning (CMARL) has shown to be p...

Please sign up or login with your details

Forgot password? Click here to reset