Satisficing Paths and Independent Multi-Agent Reinforcement Learning in Stochastic Games
In multi-agent reinforcement learning (MARL), independent learners are those that do not access the action selections of other agents in the system. Due to this decentralization of information, it is generally difficult to design independent learners that drive the system to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium policies in non-episodic, discounted stochastic games. Satisficing refers to halting search in an optimization problem upon finding a satisfactory but possibly suboptimal input; here, we take it to mean halting search upon finding an input that achieves a cost within ϵ of the minimum cost. In this paper, we define a useful structural concept for games, termed the ϵ-satisficing paths property, and we prove that this property holds for any ϵ ≥ 0 in general two-player stochastic games and in N-player stochastic games satisfying a symmetry condition. To illustrate the utility of this property, we present an independent learning algorithm that comes with high-probability guarantees of approximate equilibrium in N-player symmetric games. This guarantee is made assuming symmetry alone, without additional assumptions such as a zero-sum, team, or potential game structure.
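The satisficing notion used above can be illustrated in isolation. The following is a minimal sketch (not the paper's algorithm) of ϵ-satisficing search over a finite candidate set: the search halts at the first input whose cost is within ϵ of the minimum, rather than scanning for the exact minimizer. The function name, the grid, and the quadratic cost are all hypothetical choices for illustration.

```python
def satisficing_search(candidates, cost, epsilon):
    """Return the first candidate whose cost is within epsilon of the
    minimum cost over all candidates, together with the number of
    candidates evaluated before halting."""
    # Satisfaction threshold: within epsilon of the true minimum cost.
    c_min = min(cost(x) for x in candidates)
    for n, x in enumerate(candidates, start=1):
        if cost(x) <= c_min + epsilon:
            return x, n  # halt: satisfactory (possibly suboptimal) input
    # Unreachable: the minimizer itself always satisfies the threshold.

# Hypothetical example: quadratic cost over a grid.
# With epsilon = 0, satisficing reduces to exact minimization;
# with epsilon > 0, the search may halt earlier at a suboptimal input.
xs = [i / 10 for i in range(-20, 21)]
f = lambda x: (x - 0.37) ** 2
x_opt, n_opt = satisficing_search(xs, f, epsilon=0.0)
x_sat, n_sat = satisficing_search(xs, f, epsilon=0.25)
```

With ϵ = 0 the search must reach the grid minimizer, while with ϵ = 0.25 it halts at an earlier candidate whose cost is within 0.25 of the minimum, mirroring the trade-off between optimality and search effort that satisficing dynamics exploit.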