Episodic Logit-Q Dynamics for Efficient Learning in Stochastic Teams

09/06/2023
by   Onur Unlu, et al.

We present new learning dynamics that combine (independent) log-linear learning with value iteration for stochastic games within the auxiliary stage game framework. The dynamics provably attain the efficient equilibrium (also known as the optimal equilibrium) in identical-interest stochastic games, going beyond recent results that establish provable convergence only to some (possibly inefficient) equilibrium. The dynamics are also independent in the sense that agents take actions consistent with their local viewpoint to a reasonable extent rather than seeking equilibrium. These aspects can be of practical interest in control applications of intelligent and autonomous systems. The key challenges are possible convergence to an inefficient equilibrium and the non-stationarity of the environment from a single agent's viewpoint due to the adaptation of the others. The log-linear update plays an important role in addressing the former. We address the latter through a play-in-episodes scheme in which the agents update their Q-function estimates only at the end of each episode.
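
To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes a two-agent identical-interest stage game, and it uses the classical asynchronous form of log-linear learning, where one randomly chosen agent revises at a time, rather than the independent variant the abstract mentions. The names `logit_response`, `log_linear_episode`, `q_stage`, and `temperature` are hypothetical and chosen for illustration only. The stage-game estimate is held fixed for the whole episode, mirroring the idea that Q-function estimates change only at the end of an episode; the end-of-episode value-iteration update itself is not shown.

```python
import numpy as np


def logit_response(payoffs, temperature):
    """Softmax (logit) distribution over one agent's actions."""
    # Subtract the maximum before exponentiating for numerical stability.
    z = (payoffs - np.max(payoffs)) / temperature
    w = np.exp(z)
    return w / w.sum()


def log_linear_episode(q_stage, temperature, steps, rng):
    """Run log-linear learning on a frozen two-agent stage game.

    q_stage[a0, a1] is the common payoff estimate shared by both agents
    (identical interest). It is NOT updated inside this function.
    Returns the empirical visit frequency of each joint action.
    """
    n0, n1 = q_stage.shape
    a = [rng.integers(n0), rng.integers(n1)]  # random initial joint action
    counts = np.zeros_like(q_stage, dtype=float)
    for _ in range(steps):
        i = rng.integers(2)  # assumption: exactly one agent revises per step
        # The revising agent sees payoffs for its own actions, holding
        # the other agent's current action fixed.
        payoffs = q_stage[:, a[1]] if i == 0 else q_stage[a[0], :]
        a[i] = rng.choice(len(payoffs), p=logit_response(payoffs, temperature))
        counts[a[0], a[1]] += 1
    return counts / steps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two pure equilibria: (0, 0) with payoff 1 and (1, 1) with payoff 2.
    q = np.array([[1.0, 0.0], [0.0, 2.0]])
    print(log_linear_episode(q, temperature=0.5, steps=20000, rng=rng))
```

In this toy game both (0, 0) and (1, 1) are equilibria, but only (1, 1) is efficient. With a moderate temperature the logit noise lets play escape the inefficient equilibrium, so most of the visit frequency should concentrate on the higher-payoff joint action. Lowering the temperature sharpens that concentration but slows the escape.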
