Exploiting No-Regret Algorithms in System Design

07/22/2020
by Le Cong Dinh, et al.

We investigate a repeated two-player zero-sum game setting in which the column player is also the designer of the system and has full control over the design of the payoff matrix. The row player uses a no-regret algorithm to efficiently learn how to adapt their strategy to the column player's behaviour over time and so achieve a good total payoff. The goal of the column player is to guide her opponent to pick a mixed strategy that is favourable to the system designer. To do so, she needs to: (i) design an appropriate payoff matrix A whose unique minimax solution contains the desired mixed strategy of the row player; and (ii) interact strategically with the row player over a sequence of plays in order to guide her opponent to converge to that desired behaviour. For the design of the payoff matrix, we propose a novel construction that provably has a unique minimax solution with the desired behaviour. We also investigate a relaxation of this problem in which uniqueness is not required, but all minimax solutions share the same mixed strategy for the row player. Finally, we propose a new game-playing algorithm for the system designer and prove that it can guide the row player, who may play a stable no-regret algorithm, to converge to a minimax solution.
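
To make the setting concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the row player runs Hedge (multiplicative weights), a standard no-regret algorithm, and that the entries of A are the row player's losses, so the row player minimises and the column player maximises. The column player here simply best-responds each round. The function name `hedge_vs_best_response`, the step size, and the example matrix are illustrative choices. For this 2x2 matrix the unique minimax strategy of the row player is (0.4, 0.6), and the sketch only illustrates the standard fact that the row player's time-averaged strategy approaches it.

```python
import math


def hedge_vs_best_response(A, T):
    """Simulate T rounds of a zero-sum game with loss matrix A.

    Assumption: A[i][j] is the ROW player's loss when row i meets column j.
    The row player runs Hedge; the column player best-responds to the row
    player's current mixed strategy. Returns the row player's time-averaged
    mixed strategy.
    """
    n, m = len(A), len(A[0])
    eta = math.sqrt(math.log(n) / T)  # illustrative fixed step size
    weights = [1.0 / n] * n           # start from the uniform strategy
    average = [0.0] * n

    for _ in range(T):
        total = sum(weights)
        x = [w / total for w in weights]  # current mixed strategy

        # Column player picks the column that maximises the row player's
        # expected loss under x (ties broken by lowest index).
        expected = [sum(x[i] * A[i][j] for i in range(n)) for j in range(m)]
        j_star = max(range(m), key=lambda j: expected[j])

        # Hedge update: down-weight rows in proportion to their loss.
        weights = [x[i] * math.exp(-eta * A[i][j_star]) for i in range(n)]

        for i in range(n):
            average[i] += x[i] / T

    return average


# Example game (hypothetical): unique row minimax strategy is (0.4, 0.6).
A = [[2.0, -1.0],
     [-1.0, 1.0]]
avg = hedge_vs_best_response(A, 5000)
print(avg)
```

Note that this shows convergence of the time average only. The strategy played in any individual round may keep oscillating, which is why the way the column player interacts with the learner matters for the kind of convergence the abstract describes.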
