Performance Analysis of Trial and Error Algorithms
Model-free decentralized optimization and learning are receiving increasing attention from both theoretical and practical perspectives. In particular, two fully decentralized learning algorithms, Trial and Error Learning (TEL) and Optimal Dynamical Learning (ODL), are appealing for a broad class of games. ODL has the property of spending a high proportion of time in an optimum state that maximizes the sum of the players' utilities. TEL has the same property when a Pure Nash Equilibrium (PNE) exists; otherwise, it spends a high proportion of time in a state that maximizes a trade-off between the sum of the players' utilities and a predefined stability function. However, estimating the mean fraction of time spent in the optimum state, as well as the mean time needed to reach it, is challenging because of the high dimension and complexity of the underlying Markov chains. In this paper, under a specific system model, we evaluate these performance metrics by proposing an approximation of the considered Markov chains that overcomes the problem of high dimensionality. We then compare the two algorithms, which yields a better understanding of their performance.
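To make the two performance metrics concrete, the sketch below computes them for a small Markov chain. The three states and all transition probabilities are purely illustrative (they are not taken from the paper); the point is only to show how the "mean fraction of time in the optimum state" follows from the stationary distribution and the "mean time to reach it" from a hitting-time linear system.

```python
import numpy as np

# Hypothetical 3-state chain abstracting the learning dynamics:
# state 0 = "searching", state 1 = "near-optimal", state 2 = "optimum".
# Transition probabilities are illustrative, not from the paper.
P = np.array([
    [0.70, 0.20, 0.10],
    [0.30, 0.50, 0.20],
    [0.05, 0.05, 0.90],
])

def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Power iteration on the row-stochastic matrix P.
    pi[s] = long-run fraction of time the chain spends in state s."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    return pi

def mean_hitting_times(P, target):
    """Expected number of steps to first reach `target` from each state,
    solving h_i = 1 + sum_j P_ij * h_j for all i != target."""
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    A = np.eye(len(others)) - P[np.ix_(others, others)]
    h = np.linalg.solve(A, np.ones(len(others)))
    full = np.zeros(n)
    full[others] = h
    return full

pi = stationary_distribution(P)
h = mean_hitting_times(P, target=2)
print("fraction of time in optimum state:", pi[2])
print("mean time to reach optimum from state 0:", h[0])
```

For the full-scale chains induced by TEL and ODL these objects are too large to handle directly, which is what motivates the approximation proposed in the paper; the exact computation above only works because the toy chain has three states.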