Steering control of payoff-maximizing players in adaptive learning dynamics

by Xingru Chen, et al.

Evolutionary game theory provides a mathematical foundation for cross-disciplinary fertilization, especially for integrating ideas from artificial intelligence and game theory. Such integration offers a transparent and rigorous approach to complex decision-making problems in a variety of important contexts, ranging from evolutionary computation to machine behavior. Despite the astronomically large space of individual behavioral strategies in the iterated Prisoner's Dilemma (IPD), the so-called Zero-Determinant (ZD) strategies are a set of rather simple memory-one strategies that can nevertheless unilaterally enforce a linear relationship between their own payoff and their opponent's. Although knowledge of ZD strategies gives players an upper hand in IPD games, we find and characterize unbending strategies that can force ZD players to be fair in their own interest. Moreover, our analysis reveals the ubiquity of unbending properties in common IPD strategies, which were previously overlooked. In this work, we demonstrate the important steering role of unbending strategies in fostering fairness and cooperation in pairwise interactions. Our results offer a new perspective that combines game theory and multi-agent learning systems to optimize winning strategies that are robust to noise, errors, and deception in non-zero-sum games.
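The linear payoff relationship that a ZD strategy enforces can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the conventional IPD payoffs (T, R, P, S) = (5, 3, 1, 0) and the standard extortionate ZD parameterization with extortion factor chi and scaling phi; the abstract specifies neither. The function names and the opponent strategy `q` are hypothetical, chosen only for illustration. A memory-one strategy is written as four cooperation probabilities, one for each previous-round outcome (CC, CD, DC, DD) seen from the player's own side.

```python
# Conventional IPD payoffs (an assumption; the abstract gives no values).
T, R, P, S = 5.0, 3.0, 1.0, 0.0


def extortionate_zd(chi, phi):
    """Memory-one strategy enforcing s_X - P = chi * (s_Y - P).

    Standard extortionate ZD construction; chi and phi must be chosen
    so that all four entries are valid probabilities.
    """
    payoff_x = (R, S, T, P)  # X's payoff in states CC, CD, DC, DD
    payoff_y = (R, T, S, P)  # Y's payoff in the same states
    tilde = [phi * ((a - P) - chi * (b - P))
             for a, b in zip(payoff_x, payoff_y)]
    # The first two entries of the strategy are offset by 1.
    return (1 + tilde[0], 1 + tilde[1], tilde[2], tilde[3])


def long_run_payoffs(p, q, iterations=20000):
    """Average payoffs of memory-one strategies p (player X) and q (player Y).

    Uses plain power iteration on the 4-state Markov chain over
    (CC, CD, DC, DD); this assumes the chain has a unique stationary
    distribution, which holds for the opponent used below.
    """
    # Y sees CD and DC swapped relative to X.
    q_seen = (q[0], q[2], q[1], q[3])
    rows = [(px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy))
            for px, qy in zip(p, q_seen)]
    v = [0.25] * 4
    for _ in range(iterations):
        v = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    s_x = sum(vi * a for vi, a in zip(v, (R, S, T, P)))
    s_y = sum(vi * b for vi, b in zip(v, (R, T, S, P)))
    return s_x, s_y


p = extortionate_zd(chi=3.0, phi=1.0 / 26.0)
q = (0.9, 0.3, 0.6, 0.2)  # hypothetical opponent, for illustration only
s_x, s_y = long_run_payoffs(p, q)
print(s_x - P, 3.0 * (s_y - P))  # the two values should agree
```

Changing `q` changes both payoffs but not the relation between them, which is the sense in which the ZD player sets the relationship unilaterally.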




