Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games

10/30/2022
by The Viet Bui, et al.

Recent research on the vulnerabilities of deep reinforcement learning (RL) has shown that adversarial policies adopted by an adversary agent can cause a target RL agent (the victim agent) to perform poorly in a multi-agent environment. In existing studies, adversarial policies are trained directly on experience gathered from interacting with the victim agent. This approach has a key shortcoming: knowledge derived from historical interactions may not generalize to unexplored policy regions of the victim agent, making the trained adversarial policy significantly less effective. In this work, we design a new adversarial policy learning algorithm that overcomes this shortcoming. The core idea is to create an imitator that imitates the victim agent's policy, while the adversarial policy is trained not only on interactions with the victim agent but also on feedback from the imitator, which forecasts the victim's intentions. In this way, we leverage the ability of imitation learning to capture the underlying characteristics of the victim policy from sample trajectories of the victim alone. Our victim imitation learning model differs from prior models in that the environment's dynamics are driven by the adversary's policy and keep changing during adversarial policy training. We provide a provable bound that guarantees a desired imitating policy once the adversary's policy becomes stable. We further strengthen our adversarial policy learning by making the imitator a stronger version of the victim. Finally, extensive experiments on four competitive MuJoCo game environments show that our proposed algorithm outperforms state-of-the-art algorithms.
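
To make the core idea concrete, the following is a minimal toy sketch, not the algorithm from the paper. It assumes a hypothetical rock-paper-scissors-style game with discrete states, and it swaps in a simple count-based behavioral-cloning imitator in place of the paper's imitation learning model. All names (`fit_imitator`, `adversary_action`, `BEATS`, the state labels) are illustrative assumptions. The imitator is fitted only from sample trajectories of the victim, and the adversary then best-responds to the imitator's forecast of the victim's next action.

```python
from collections import Counter, defaultdict

# Hypothetical toy game (an assumption for illustration, not from the paper).
ACTIONS = ["rock", "paper", "scissors"]
# Maps each action to the action that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def fit_imitator(trajectories):
    """Estimate the victim's policy by counting (state, action) pairs.

    This is plain tabular behavioral cloning, used here as a stand-in
    for a learned imitation model.
    """
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    policy = {}
    for state, counter in counts.items():
        total = sum(counter.values())
        policy[state] = {a: counter[a] / total for a in ACTIONS}
    return policy


def adversary_action(imitator, state):
    """Pick the adversary action with the best expected payoff
    against the imitator's forecast of the victim's action."""
    forecast = imitator.get(state)
    if forecast is None:
        return ACTIONS[0]  # arbitrary fallback for states never observed

    def expected_payoff(action):
        payoff = 0.0
        for victim_action, prob in forecast.items():
            if BEATS[victim_action] == action:
                payoff += prob  # adversary beats the victim
            elif BEATS[action] == victim_action:
                payoff -= prob  # victim beats the adversary
        return payoff

    return max(ACTIONS, key=expected_payoff)


# Sample victim trajectories: lists of (state, victim_action) pairs.
victim_trajectories = [
    [("s0", "rock"), ("s1", "paper")],
    [("s0", "rock"), ("s1", "scissors")],
    [("s0", "paper"), ("s1", "paper")],
]
imitator = fit_imitator(victim_trajectories)
```

In this sketch, calling `adversary_action(imitator, "s0")` queries the imitator rather than the victim itself, which mirrors the role the abstract assigns to the imitator. Unlike the setting described above, the toy victim does not react to the adversary, so the changing-dynamics aspect is not modeled here.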

research
08/20/2023

Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games

Training agents in multi-agent competitive games presents significant ch...
research
01/04/2020

Multi-Agent Interactions Modeling with Correlated Policies

In multi-agent systems, complex interacting behaviors arise due to the h...
research
06/17/2018

Learning Policy Representations in Multiagent Systems

Modeling agent behavior is central to understanding the emergence of com...
research
11/28/2020

Human-Agent Cooperation in Bridge Bidding

We introduce a human-compatible reinforcement-learning approach to a coo...
research
11/18/2022

Provable Defense against Backdoor Policies in Reinforcement Learning

We propose a provable defense mechanism against backdoor policies in rei...
research
11/01/2020

Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication

We describe our solution approach for Pommerman TeamRadio, a competition...
research
09/17/2020

Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator

We propose a new method for training an agent via an evolutionary strate...
