Newton-based Policy Optimization for Games

07/15/2020
by Giorgia Ramponi, et al.

Many learning problems involve multiple agents optimizing different, interacting objective functions. In these problems, standard policy gradient algorithms fail because the setting is non-stationary and the agents have different interests. Algorithms must therefore take into account the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the system dynamics into its irrotational (Potential) and solenoidal (Hamiltonian) components. This method ensures quadratic convergence in purely irrotational systems and in purely solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle points. Finally, we empirically compare NOHD's performance with that of state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.
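
The decomposition mentioned in the abstract can be illustrated with a small sketch. This is not the NOHD update rule, which the abstract does not spell out. It assumes the convention common in the differentiable-games literature, where the Jacobian of the simultaneous gradient is split into a symmetric part (the potential, irrotational component) and an antisymmetric part (the Hamiltonian, solenoidal component). The function names and the three example Jacobians are hypothetical and chosen only for illustration.

```python
import numpy as np


def decompose_jacobian(J):
    """Split a square game Jacobian into symmetric and antisymmetric parts.

    Assumed convention (not taken from the paper's text): S is the
    potential (irrotational) component, A is the Hamiltonian (solenoidal)
    component, and J = S + A.
    """
    J = np.asarray(J, dtype=float)
    S = 0.5 * (J + J.T)  # symmetric part
    A = 0.5 * (J - J.T)  # antisymmetric part
    return S, A


def classify_game(J, tol=1e-12):
    """Label a Jacobian as 'potential', 'hamiltonian', or 'general'."""
    S, A = decompose_jacobian(J)
    if np.allclose(A, 0.0, atol=tol):
        return "potential"    # no antisymmetric component
    if np.allclose(S, 0.0, atol=tol):
        return "hamiltonian"  # no symmetric component
    return "general"


# Hypothetical Jacobians of the simultaneous gradient for two-player games
# with one scalar parameter per player.
J_potential = np.array([[2.0, 1.0], [1.0, 3.0]])     # symmetric
J_hamiltonian = np.array([[0.0, 1.0], [-1.0, 0.0]])  # antisymmetric
J_general = np.array([[1.0, 2.0], [0.0, 1.0]])       # mixture of both

for name, J in [("potential", J_potential),
                ("hamiltonian", J_hamiltonian),
                ("general", J_general)]:
    S, A = decompose_jacobian(J)
    print(name, classify_game(J), np.allclose(S + A, J))
```

Under this assumed convention, the two special cases for which the abstract claims quadratic convergence correspond to Jacobians with a vanishing antisymmetric part (purely irrotational) or a vanishing symmetric part (purely solenoidal).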

research
07/08/2019

Policy-Gradient Algorithms Have No Guarantees of Convergence in Continuous Action and State Multi-Agent Settings

We show by counterexample that policy-gradient algorithms have no guaran...
research
09/14/2020

Multi-Agent Reinforcement Learning in Cournot Games

In this work, we study the interaction of strategic agents in continuous...
research
10/23/2022

Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence

Multi-agent interactions are increasingly important in the context of re...
research
06/03/2021

Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games

Potential games are arguably one of the most important and widely studie...
research
12/24/2021

Lyapunov Exponents for Diversity in Differentiable Games

Ridge Rider (RR) is an algorithm for finding diverse solutions to optimi...
research
05/30/2019

Convergence Analysis of Gradient-Based Learning with Non-Uniform Learning Rates in Non-Cooperative Multi-Agent Settings

Considering a class of gradient-based multi-agent learning algorithms in...
research
11/20/2018

Stable Opponent Shaping in Differentiable Games

A growing number of learning methods are actually games which optimise m...