Hedging in games: Faster convergence of external and swap regrets

06/08/2020 ∙ by Xi Chen, et al.

We consider the setting where players run the Hedge algorithm or its optimistic variant <cit.> to play an n-action game repeatedly for T rounds. 1) For two-player games, we show that the regret of optimistic Hedge decays at rate Õ(1/T^{5/6}), improving the previous bound of O(1/T^{3/4}) by <cit.>. 2) In contrast, we show that the convergence rate of vanilla Hedge is no better than Ω̃(1/√T), addressing an open question posed in <cit.>. For general m-player games, we show that the swap regret of each player decays at rate Õ(m^{1/2} (n/T)^{3/4}) when the players combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour <cit.>. The algorithm can also be modified to achieve the same rate against itself and a rate of Õ(√(n/T)) against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.
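To make the two update rules concrete, here is a minimal Python sketch of vanilla and optimistic Hedge in two-player self-play. It assumes the standard textbook forms: vanilla Hedge plays weights proportional to exp(-η · cumulative loss), and optimistic Hedge counts the most recent loss vector twice as a prediction of the next one. The function names (`hedge_play`, `run_self_play`), the loss-matrix convention, and the step size are illustrative assumptions, not taken from the paper, and the sketch is not expected to reproduce the paper's rates.

```python
import math


def softmax_neg(scores, eta):
    """Return weights proportional to exp(-eta * score), shifted for stability."""
    m = min(scores)
    w = [math.exp(-eta * (s - m)) for s in scores]
    z = sum(w)
    return [v / z for v in w]


def hedge_play(cum_loss, last_loss, eta, optimistic):
    """Mixed strategy for the next round.

    Vanilla Hedge uses only the cumulative loss; the optimistic variant
    (assumed form) adds the last loss vector once more as a guess for the
    upcoming loss.
    """
    if optimistic:
        scores = [c + l for c, l in zip(cum_loss, last_loss)]
    else:
        scores = list(cum_loss)
    return softmax_neg(scores, eta)


def run_self_play(A, B, T, eta, optimistic):
    """Both players run (optimistic) Hedge for T rounds.

    A[i][j] and B[i][j] are the losses, assumed to lie in [0, 1], of
    player 1 and player 2 when they play actions i and j.
    Returns player 1's average external regret.
    """
    n1, n2 = len(A), len(A[0])
    cum1, cum2 = [0.0] * n1, [0.0] * n2
    last1, last2 = [0.0] * n1, [0.0] * n2
    total1 = 0.0
    for _ in range(T):
        x = hedge_play(cum1, last1, eta, optimistic)
        y = hedge_play(cum2, last2, eta, optimistic)
        # Expected loss of each pure action against the opponent's mix.
        last1 = [sum(A[i][j] * y[j] for j in range(n2)) for i in range(n1)]
        last2 = [sum(B[i][j] * x[i] for i in range(n1)) for j in range(n2)]
        total1 += sum(x[i] * last1[i] for i in range(n1))
        cum1 = [c + l for c, l in zip(cum1, last1)]
        cum2 = [c + l for c, l in zip(cum2, last2)]
    # External regret: realized loss minus the best fixed action in hindsight.
    return (total1 - min(cum1)) / T


if __name__ == "__main__":
    # Matching pennies with losses in {0, 1}; purely illustrative.
    A = [[0.0, 1.0], [1.0, 0.0]]
    B = [[1.0, 0.0], [0.0, 1.0]]
    for optimistic in (False, True):
        print(optimistic, run_self_play(A, B, T=1000, eta=0.1, optimistic=optimistic))
```

Running the script prints player 1's average regret under each variant for one fixed step size. A comparison of rates in T would require repeating the run over several horizons with a step size tuned per horizon, which this sketch leaves out.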


