Adapting to game trees in zero-sum imperfect information games

12/23/2022
by Côme Fiegel et al.

Imperfect information games (IIGs) are games in which each player only partially observes the current game state. We study how to learn ϵ-optimal strategies in a zero-sum IIG through self-play with trajectory feedback. We give a problem-independent lower bound 𝒪(H(A_𝒳+B_𝒴)/ϵ^2) on the number of realizations required to learn these strategies with high probability, where H is the length of the game and A_𝒳 and B_𝒴 are the total numbers of actions for the two players. We also propose two Follow the Regularized Leader (FTRL) algorithms for this setting: Balanced-FTRL, which matches this lower bound but requires knowledge of the information set structure beforehand to define the regularization, and Adaptive-FTRL, which needs 𝒪(H^2(A_𝒳+B_𝒴)/ϵ^2) plays without this requirement by progressively adapting the regularization to the observations.
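The paper's algorithms run on extensive-form game trees with trajectory (bandit) feedback, which is beyond a short snippet. As a simplified illustration of the core FTRL idea, the sketch below runs FTRL with an entropic regularizer in self-play on a small zero-sum matrix game with full feedback (the normal-form analogue); the payoff matrix, learning rate, and horizon are hypothetical choices, not values from the paper.

```python
import numpy as np

# Hypothetical 2x2 zero-sum game; entries are payoffs to the row player.
A = np.array([[2.0, -1.0],
              [-1.0, 1.0]])

def ftrl_softmax(cum_loss, eta):
    # With a negative-entropy regularizer, the FTRL update on the simplex
    # reduces to a softmax over the negated cumulative losses.
    z = -eta * cum_loss
    z -= z.max()              # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

T, eta = 10_000, 0.01         # illustrative horizon and learning rate
Lx = np.zeros(2)              # cumulative losses for the row player
Ly = np.zeros(2)              # cumulative losses for the column player
avg_x = np.zeros(2)
avg_y = np.zeros(2)

for t in range(T):
    x = ftrl_softmax(Lx, eta)
    y = ftrl_softmax(Ly, eta)
    Lx += -A @ y              # row player maximizes x^T A y, so its loss is -A y
    Ly += A.T @ x             # column player minimizes x^T A y
    avg_x += x
    avg_y += y

avg_x /= T
avg_y /= T
# In self-play, the time-averaged strategies approach a Nash equilibrium
# of the matrix game (here, (0.4, 0.6) for both players).
print(avg_x, avg_y)
```

In zero-sum games, the exploitability of the averaged FTRL strategies shrinks with the per-player regret, which is the same mechanism that Balanced-FTRL and Adaptive-FTRL exploit on the game tree, with the regularization adapted to the information set structure.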
