Is Learning in Games Good for the Learners?

05/31/2023
by William Brown, et al.

We consider a number of questions related to tradeoffs between reward and regret in repeated gameplay between two agents. To facilitate this, we introduce a notion of generalized equilibrium which allows for asymmetric regret constraints and yields a polytope of feasible values for each agent and each pair of regret constraints. We show that any such equilibrium is reachable by a pair of algorithms which maintain their regret guarantees against arbitrary opponents. As a central example, we highlight the case where one agent is a no-swap-regret learner and the other's regret is unconstrained. We show that this captures an extension of Stackelberg equilibria with a matching optimal value, and that there exists a wide class of games where a player can significantly increase their utility by deviating from a no-swap-regret algorithm against a no-swap-regret learner (in fact, almost any game without pure Nash equilibria is of this form). Additionally, we make use of generalized equilibria to consider tradeoffs in terms of the opponent's algorithm choice. We give a tight characterization of the maximal reward obtainable against some no-regret learner, yet we also show a class of games in which this is bounded away from the value obtainable against the class of common “mean-based” no-regret algorithms. Finally, we consider the question of learning reward-optimal strategies via repeated play with a no-regret agent when the game is initially unknown. Again we show tradeoffs depending on the opponent's learning algorithm. For any game where the Stackelberg strategy is learnable via queries, it is learnable in exponential time with any no-regret agent, and in polynomial time with any no-adaptive-regret agent. There are also games where it is learnable in polynomial time against any no-swap-regret agent but requires exponential time against a mean-based no-regret agent.
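To make the "mean-based no-regret" notion above concrete, here is a minimal sketch of Hedge (multiplicative weights), which is commonly given as an example of a mean-based no-regret algorithm. This is an illustration only and is not taken from the paper: the function name `hedge_play`, the payoff matrix, the opponent sequence, and the step size are all assumptions chosen for the example. It computes the learner's expected reward and its external regret against a fixed sequence of opponent actions.

```python
import math

def hedge_play(payoff, opponent_actions, eta):
    """Run Hedge for the row player against a fixed opponent sequence.

    payoff[i][j] in [0, 1] is the row player's reward for playing i
    when the opponent plays j (hypothetical input format).
    Returns (expected total reward, external regret).
    """
    n = len(payoff)
    cum = [0.0] * n   # cumulative reward of each fixed action so far
    total = 0.0       # learner's expected cumulative reward
    for j in opponent_actions:
        # Play each action with probability proportional to
        # exp(eta * cumulative reward); subtract the max for stability.
        m = max(cum)
        weights = [math.exp(eta * (c - m)) for c in cum]
        s = sum(weights)
        probs = [w / s for w in weights]
        total += sum(p * payoff[i][j] for i, p in enumerate(probs))
        for i in range(n):
            cum[i] += payoff[i][j]
    # External regret: best fixed action in hindsight minus what we earned.
    return total, max(cum) - total

# Assumed example: a 2x2 matching game against an alternating opponent.
T = 1000
payoff = [[1.0, 0.0], [0.0, 1.0]]
opponent = [t % 2 for t in range(T)]
eta = math.sqrt(8 * math.log(2) / T)  # standard step size for rewards in [0, 1]
reward, regret = hedge_play(payoff, opponent, eta)
```

The distribution depends only on each action's cumulative reward, which is the property the "mean-based" label refers to. With this step size, the standard analysis bounds the regret by the square root of T ln(n) / 2, so the average regret vanishes as T grows.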


