Learning in Cournot Games with Limited Information Feedback
In this work, we study the interaction of strategic players in continuous-action Cournot games with limited information feedback. The Cournot game is a fundamental market model for many socio-economic systems in which players learn and compete without full knowledge of the system or of each other. In this setting, it is important to understand the dynamics and limiting behavior of the players. We consider concave Cournot games and two widely used learning strategies: no-regret algorithms and policy gradient. We prove that if all players adopt one of the two algorithms, their joint action converges to the unique Nash equilibrium of the game. Notably, we show that myopic algorithms such as policy gradient can achieve an exponential convergence rate, whereas no-regret algorithms converge only (sub)linearly. Together, our results present significantly sharper convergence guarantees and show how exploiting the structure of the game can lead to much faster convergence rates.
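To make the setting concrete, the following is a minimal illustrative sketch (not the paper's algorithm or analysis) of gradient-based learning in a two-player Cournot game. It assumes a linear inverse demand P(Q) = a - bQ and constant marginal costs, with all parameter values chosen purely for illustration, and runs simple projected gradient ascent on each player's own profit, then compares the learned quantities to the closed-form Nash equilibrium of this linear duopoly.

```python
# Illustrative sketch only: two firms in a Cournot game with assumed linear
# inverse demand P(Q) = a - b*Q and constant marginal costs c_i, each updating
# its quantity by projected gradient ascent on its own profit.
import numpy as np

a, b = 10.0, 1.0          # assumed demand intercept and slope
c = np.array([1.0, 2.0])  # assumed constant marginal costs
eta, T = 0.05, 500        # step size and number of rounds

def profit_grad(q):
    """Gradient of pi_i = q_i*(a - b*sum(q)) - c_i*q_i w.r.t. each q_i."""
    Q = q.sum()
    return a - b * Q - b * q - c

q = np.array([0.5, 0.5])  # arbitrary initial quantities
for _ in range(T):
    q = np.maximum(q + eta * profit_grad(q), 0.0)  # project onto q_i >= 0

# Closed-form Nash equilibrium of the linear duopoly: q_i* = (a - 2c_i + c_j)/(3b)
q_star = np.array([(a - 2 * c[0] + c[1]) / (3 * b),
                   (a - 2 * c[1] + c[0]) / (3 * b)])
print("learned quantities:", q)
print("Nash equilibrium:  ", q_star)
```

Under these assumptions the iterates approach the Nash equilibrium geometrically, which is the kind of fast convergence behavior the abstract attributes to gradient-style (policy gradient) updates; the paper itself should be consulted for the exact algorithms, feedback model, and rates.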