Adaptive Incentive Design with Multi-Agent Meta-Gradient Reinforcement Learning
Critical sectors of human society are progressing toward the adoption of powerful artificial intelligence (AI) agents, which are trained individually on behalf of self-interested principals but deployed in a shared environment. Short of direct centralized regulation of AI, which is as difficult an issue as regulation of human actions, one must design institutional mechanisms that indirectly guide agents' behaviors to safeguard and improve social welfare in the shared environment. Our paper focuses on one important class of such mechanisms: the problem of adaptive incentive design, whereby a central planner intervenes on the payoffs of an agent population via incentives in order to optimize a system objective. To tackle this problem in high-dimensional environments whose dynamics may be unknown or too complex to model, we propose a model-free meta-gradient method to learn an adaptive incentive function in the context of multi-agent reinforcement learning. Via the principle of online cross-validation, the incentive designer explicitly accounts for its impact on agents' learning and, through them, the impact on future social welfare. Experiments on didactic benchmark problems show that the proposed method can induce selfish agents to learn near-optimal cooperative behavior and significantly outperform learning-oblivious baselines. When applied to a complex simulated economy, the proposed method finds tax policies that achieve better trade-off between economic productivity and equality than baselines, a result that we interpret via a detailed behavioral analysis.
READ FULL TEXT