Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems

by Trong Nghia Hoang, et al.
Northeastern University
Princeton University

A key challenge in multi-robot and multi-agent systems is generating solutions that are robust to other self-interested or even adversarial parties who actively try to prevent the agents from achieving their goals. Existing work addressing this challenge is practical only for small-scale synchronous decision-making scenarios, or for a single agent planning its best response against a single adversary with fixed, procedurally characterized strategies. In contrast, this paper considers a more realistic class of problems in which a team of asynchronous agents with limited observation and communication capabilities needs to compete against multiple strategic adversaries with changing strategies. This problem requires agents that can coordinate to detect changes in adversary strategies and plan the best response accordingly. Our approach first optimizes a set of stratagems that represent these best responses. These optimized stratagems are then integrated into a unified policy that can detect and respond when the adversaries change their strategies. The near-optimality of the proposed framework is established theoretically and demonstrated empirically in simulation and on hardware.




