Log In Sign Up

ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations

by   Eric Zhao, et al.

Multi-agent simulations provide a scalable environment for learning policies that interact with rational agents. However, such policies may fail to generalize to the real-world where agents may differ from simulated counterparts due to unmodeled irrationality and misspecified reward functions. We introduce Epsilon-Robust Multi-Agent Simulation (ERMAS), a robust optimization framework for learning AI policies that are robust to such multiagent sim-to-real gaps. While existing notions of multi-agent robustness concern perturbations in the actions of agents, we address a novel robustness objective concerning perturbations in the reward functions of agents. ERMAS provides this robustness by anticipating suboptimal behaviors from other agents, formalized as the worst-case epsilon-equilibrium. We show empirically that ERMAS yields robust policies for repeated bimatrix games and optimal taxation problems in economic simulations. In particular, in the two-level RL problem posed by the AI Economist (Zheng et al., 2020) ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15


page 1

page 2

page 3

page 4


RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning

Despite the recent advancement in multi-agent reinforcement learning (MA...

Learning to Incentivize Other Learning Agents

The challenge of developing powerful and general Reinforcement Learning ...

A Fast Algorithm for Robust Action Selection in Multi-Agent Systems

In this paper, we consider a robust action selection problem in multi-ag...

A Formal Solution to the Grain of Truth Problem

A Bayesian agent acting in a multi-agent environment learns to predict t...

Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets

Trading markets represent a real-world financial application to deploy r...

The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies

Tackling real-world socio-economic challenges requires designing and tes...

Evaluating the Robustness of Collaborative Agents

In order for agents trained by deep reinforcement learning to work along...