Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinforcement Learning

02/01/2023
by Zun Li, et al.

Multiagent reinforcement learning (MARL) has benefited significantly from population-based and game-theoretic training regimes. One approach, Policy-Space Response Oracles (PSRO), uses standard reinforcement learning to compute response policies as approximate best responses and combines them via meta-strategy selection. We augment PSRO with a novel search procedure that uses generative sampling of world states, and we introduce two new meta-strategy solvers based on the Nash bargaining solution. We evaluate PSRO's ability to compute approximate Nash equilibria and its performance in two negotiation games: Colored Trails and Deal or No Deal. We conduct behavioral studies in which human participants negotiate with our agents (N = 346). We find that search with generative modeling finds stronger policies at both training time and test time, enables online Bayesian co-player prediction, and can produce agents that, when negotiating with humans, achieve social welfare comparable to that of humans trading among themselves.
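
To make the Nash bargaining concept concrete, here is a minimal sketch of the textbook Nash bargaining solution over a finite set of candidate deals. It is not the paper's meta-strategy solver. It assumes each deal is a tuple of per-player payoffs and that a disagreement point (the payoffs when no deal is reached) is given. The function name `nash_bargaining_outcome` and the example payoffs are hypothetical, chosen only for illustration.

```python
from math import prod
from typing import Optional, Sequence, Tuple

Payoffs = Tuple[float, ...]


def nash_bargaining_outcome(
    outcomes: Sequence[Payoffs], disagreement: Payoffs
) -> Optional[Payoffs]:
    """Return the outcome that maximizes the Nash product.

    The Nash product is the product over players of (payoff - disagreement
    payoff). Outcomes that leave any player worse off than the disagreement
    point are skipped. Returns None if no outcome is acceptable to everyone.
    Illustrative assumption: the feasible set is a finite list of pure deals.
    """
    best: Optional[Payoffs] = None
    best_product = float("-inf")
    for payoffs in outcomes:
        gains = [u - d for u, d in zip(payoffs, disagreement)]
        if any(g < 0 for g in gains):
            continue  # not individually rational for some player
        product = prod(gains)
        if product > best_product:
            best, best_product = payoffs, product
    return best


# Hypothetical two-player deals: (payoff to player 0, payoff to player 1).
deals = [(4.0, 1.0), (3.0, 3.0), (1.0, 4.0)]
chosen = nash_bargaining_outcome(deals, disagreement=(0.0, 0.0))
print(chosen)
```

With a disagreement point of zero for both players, the Nash products of the three deals are 4, 9, and 4, so the balanced deal `(3.0, 3.0)` is selected. This illustrates why the criterion tends to favor outcomes that are both efficient and evenly shared.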

research
11/02/2017

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

To achieve general intelligence, agents must learn how to interact with ...
research
06/06/2022

Specification-Guided Learning of Nash Equilibria with High Social Welfare

Reinforcement learning has been shown to be an effective strategy for au...
research
05/27/2023

Reinforcement Learning With Reward Machines in Stochastic Games

We investigate multi-agent reinforcement learning for stochastic games w...
research
04/21/2021

Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents

In this article we study the problem of training intelligent agents usin...
research
06/17/2020

Policy Evaluation and Seeking for Multi-Agent Reinforcement Learning via Best Response

This paper introduces two metrics (cycle-based and memory-based metrics)...
research
02/09/2023

Regularization for Strategy Exploration in Empirical Game-Theoretic Analysis

In iterative approaches to empirical game-theoretic analysis (EGTA), the...
