Efficient Φ-Regret Minimization in Extensive-Form Games via Online Mirror Descent

by   Yu Bai, et al.

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs). This approach enables us to directly translate state-of-the-art techniques and analyses in NFGs to learning EFGs, but typically suffers from computational intractability due to the exponential blow-up of the game size introduced by the conversion. In this paper, we address this problem in natural and important setups for the Φ-Hedge algorithm – A generic algorithm capable of learning a large class of equilibria for NFGs. We show that Φ-Hedge can be directly used to learn Nash Equilibria (zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs. We prove that, in those settings, the Φ-Hedge algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for EFGs with suitable dilated regularizers, and run in polynomial time. This new connection further allows us to design and analyze a new class of OMD algorithms based on modifying its log-partition function. In particular, we design an improved algorithm with balancing techniques that achieves a sharp 𝒪(√(XAT)) EFCE-regret under bandit-feedback in an EFG with X information sets, A actions, and T episodes. To our best knowledge, this is the first such rate and matches the information-theoretic lower bound.


page 1

page 2

page 3

page 4


Near-Optimal Learning of Extensive-Form Games with Imperfect Information

This paper resolves the open question of designing near-optimal algorith...

Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games

No-regret learners seek to minimize the difference between the loss they...

V-Learning – A Simple, Efficient, Decentralized Algorithm for Multiagent RL

A major challenge of multiagent reinforcement learning (MARL) is the cur...

Faster No-Regret Learning Dynamics for Extensive-Form Correlated and Coarse Correlated Equilibria

A recent emerging trend in the literature on learning in games has been ...

Steering No-Regret Learners to Optimal Equilibria

We consider the problem of steering no-regret-learning agents to play de...

Equilibrium Finding in Normal-Form Games Via Greedy Regret Minimization

We extend the classic regret minimization framework for approximating eq...

No-Regret Learning in Games is Turing Complete

Games are natural models for multi-agent machine learning settings, such...

Please sign up or login with your details

Forgot password? Click here to reset