Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

09/10/2018
by Gabriele Farina, et al.

Regret minimization is a powerful tool for solving large-scale extensive-form games. State-of-the-art methods rely on minimizing regret locally at each decision point. In this work, we derive a new framework for regret minimization on sequential decision problems and extensive-form games with general compact convex sets at each decision point and general convex losses, whereas prior work was restricted to simplex decision points and linear losses. We call our framework laminar regret decomposition. It generalizes the CFR algorithm to this broader setting. Furthermore, our framework enables a new proof of CFR even in the known setting, derived from the perspective of decomposing polytope regret, which leads to an arguably simpler interpretation of the algorithm. Our generalization to convex compact sets and convex losses allows us to develop new algorithms for several problems: regularized sequential decision making, regularized Nash equilibria in extensive-form games, and computing approximate extensive-form perfect equilibria. Our generalization also leads to the first regret-minimization algorithm for computing reduced-normal-form quantal response equilibria based on minimizing local regrets. Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games. Finally, we show that our framework enables a new kind of scalable opponent exploitation approach.
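
To illustrate what "minimizing regret locally at each decision point" means in the previously known setting (a simplex of actions with linear losses), here is a minimal sketch of regret matching, the local regret minimizer commonly used inside CFR. It is not the laminar regret decomposition described in the abstract, and the names `RegretMatching` and `run_regret_matching` are hypothetical, chosen only for this illustration.

```python
from typing import List, Sequence


class RegretMatching:
    """Illustrative regret matching over a simplex of n actions (linear losses)."""

    def __init__(self, n_actions: int) -> None:
        self.n = n_actions
        # Cumulative regret of each action relative to the strategies played so far.
        self.cum_regret = [0.0] * n_actions

    def next_strategy(self) -> List[float]:
        # Play proportionally to positive cumulative regret; uniform if none is positive.
        positive = [max(r, 0.0) for r in self.cum_regret]
        total = sum(positive)
        if total <= 0.0:
            return [1.0 / self.n] * self.n
        return [p / total for p in positive]

    def observe_loss(self, strategy: Sequence[float], loss: Sequence[float]) -> None:
        # Regret of action a = expected loss actually incurred minus the loss of a.
        expected = sum(s * l for s, l in zip(strategy, loss))
        for a in range(self.n):
            self.cum_regret[a] += expected - loss[a]


def run_regret_matching(losses: Sequence[Sequence[float]]) -> List[float]:
    """Feed a sequence of loss vectors and return the strategy for the next round."""
    minimizer = RegretMatching(len(losses[0]))
    for loss in losses:
        strategy = minimizer.next_strategy()
        minimizer.observe_loss(strategy, loss)
    return minimizer.next_strategy()


if __name__ == "__main__":
    # Action 1 always has lower loss, so the strategy concentrates on it.
    print(run_regret_matching([[1.0, 0.0]] * 100))
```

In CFR, one such minimizer is placed at every decision point and fed counterfactual losses. According to the abstract, the framework replaces the simplex with a general compact convex set and the linear loss with a general convex loss.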


research
11/06/2018

Regret Circuits: Composability of Regret Minimizers

Regret minimization is a powerful tool for solving large-scale problems;...
research
02/19/2020

Stochastic Regret Minimization in Extensive-Form Games

Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-t...
research
03/08/2021

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Regret minimization has proved to be a versatile tool for tree-form sequ...
research
11/06/2018

Composability of Regret Minimizers

Regret minimization is a powerful tool for solving large-scale problems;...
research
05/27/2021

Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving

We develop new parameter and scale-free algorithms for solving convex-co...
research
02/24/2022

Solving optimization problems with Blackwell approachability

We introduce the Conic Blackwell Algorithm^+ (CBA^+) regret minimizer, a...
research
10/11/2021

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

Counterfactual Regret Minimization (CFR) is a kind of regret minimizatio...
