Achieving Better Regret against Strategic Adversaries

02/13/2023
by Le Cong Dinh, et al.

We study online learning problems in which the learner has extra knowledge about the adversary's behaviour, i.e., game-theoretic settings where opponents typically follow a no-external-regret learning algorithm. Under this assumption, we propose two new online learning algorithms, Accurate Follow the Regularized Leader (AFTRL) and Prod-Best Response (Prod-BR), that exploit this extra knowledge while maintaining the no-regret property in the worst case, where the extra information is inaccurate. Specifically, against a no-external-regret adversary, AFTRL achieves O(1) external regret or O(1) forward regret, whereas Prod-BR achieves O(√T) dynamic regret. To the best of our knowledge, our algorithm is the first to consider forward regret and to achieve O(1) regret against strategic adversaries. When playing zero-sum games with Accurate Multiplicative Weights Update (AMWU), a special case of AFTRL, we achieve last-round convergence to the Nash equilibrium. We also provide numerical experiments to further support our theoretical results. In particular, we demonstrate that our methods achieve significantly better regret bounds and a faster rate of last-round convergence than the state of the art (e.g., Multiplicative Weights Update (MWU) and its optimistic counterpart, OMWU).
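For context, the MWU baseline named in the abstract can be sketched as below. This is the textbook multiplicative-weights update run in self-play on a small zero-sum game, not the paper's AFTRL, AMWU, or Prod-BR, whose update rules are not given on this page. The payoff matrix, the step size `eta`, and the function names `mwu_update`, `mwu_self_play`, and `duality_gap` are illustrative assumptions.

```python
import math


def mwu_update(weights, losses, eta):
    """One MWU step: scale each weight by exp(-eta * loss), then renormalise."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def mwu_self_play(A, T, eta):
    """Both players run MWU for T rounds on the zero-sum game x^T A y.

    The row player minimises x^T A y and the column player maximises it.
    Returns (row average strategy, column average strategy,
             row external regret, column external regret).
    """
    n, m = len(A), len(A[0])
    x = [1.0 / n] * n
    y = [1.0 / m] * m
    x_sum, y_sum = [0.0] * n, [0.0] * m
    row_action_loss, col_action_loss = [0.0] * n, [0.0] * m
    played_loss = 0.0  # cumulative loss of the row player's mixed play

    for _ in range(T):
        # Expected loss of each pure action against the opponent's current mix.
        row_losses = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_losses = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]

        played_loss += sum(x[i] * row_losses[i] for i in range(n))
        for i in range(n):
            row_action_loss[i] += row_losses[i]
            x_sum[i] += x[i]
        for j in range(m):
            col_action_loss[j] += col_losses[j]
            y_sum[j] += y[j]

        x = mwu_update(x, row_losses, eta)
        y = mwu_update(y, col_losses, eta)

    # External regret: realised loss minus the loss of the best fixed action.
    row_regret = played_loss - min(row_action_loss)
    col_regret = -played_loss - min(col_action_loss)
    return [s / T for s in x_sum], [s / T for s in y_sum], row_regret, col_regret


def duality_gap(A, x, y):
    """Total gain available to both players from best-responding to (x, y)."""
    n, m = len(A), len(A[0])
    best_col = max(sum(x[i] * A[i][j] for i in range(n)) for j in range(m))
    best_row = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_col - best_row


if __name__ == "__main__":
    A = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical 2x2 game, losses span a width of 3
    T = 2000
    eta = math.sqrt(8 * math.log(2) / (9 * T))
    x_avg, y_avg, row_regret, col_regret = mwu_self_play(A, T, eta)
    print(x_avg, y_avg, row_regret, col_regret, duality_gap(A, x_avg, y_avg))
```

With a fixed step size, each player's external regret under this update grows on the order of √T in the worst case, and only the time-averaged strategies are guaranteed to approach equilibrium. That is the baseline behaviour the abstract contrasts with the O(1) regret and last-round convergence it claims for AFTRL and AMWU.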


research
03/13/2021

Online Double Oracle

Solving strategic games with huge action space is a critical yet under-e...
research
09/15/2019

Online k-means Clustering

We study the problem of online clustering where a clustering algorithm h...
research
02/19/2019

Online Learning with Continuous Variations: Dynamic Regret and Reductions

We study the dynamic regret of a new class of online learning problems, ...
research
11/09/2018

Policy Regret in Repeated Games

The notion of policy regret in online learning is a well defined perfor...
research
07/06/2023

Multiplicative Updates for Online Convex Optimization over Symmetric Cones

We study online convex optimization where the possible actions are trace...
research
01/23/2021

Optimistic and Adaptive Lagrangian Hedging

In online learning an algorithm plays against an environment with losses...
research
07/08/2022

Online Learning in Supply-Chain Games

We study a repeated game between a supplier and a retailer who want to m...
