Socially-Optimal Mechanism Design for Incentivized Online Learning

12/29/2021
by   Zhiyuan Wang, et al.

Multi-armed bandit (MAB) is a classic online learning framework that studies sequential decision-making in an uncertain environment. The MAB framework, however, overlooks the scenario where the decision-maker cannot take actions (e.g., pulling arms) directly. This scenario is practically important in many applications such as spectrum sharing, crowdsensing, and edge computing. In these applications, the decision-maker incentivizes other selfish agents to carry out the desired actions (i.e., pulling arms on the decision-maker's behalf). This paper establishes the incentivized online learning (IOL) framework for this scenario. The key challenge in designing the IOL framework lies in the tight coupling between learning the unknown environment and revealing asymmetric information. To address this, we construct a special Lagrangian function, based on which we propose a socially-optimal mechanism for the IOL framework. Our mechanism satisfies various desirable properties such as agent fairness, incentive compatibility, and voluntary participation. It achieves the same asymptotic performance as the state-of-the-art benchmark, which requires extra information. Our analysis also unveils the power of the crowd in the IOL framework: a larger agent crowd enables our mechanism to approach the theoretical upper bound of social performance more closely. Numerical results demonstrate the advantages of our mechanism in large-scale edge computing.
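For readers unfamiliar with the classic MAB setting that the abstract contrasts against, the following is a minimal sketch of a decision-maker who pulls arms directly, using the standard UCB1 index rule. It is background illustration only and is not the mechanism proposed in the paper. The function name `ucb1`, the Bernoulli reward model, and the arm means are assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return the pull count of each arm.

    arm_means: hypothetical success probabilities, unknown to the learner.
    horizon:   total number of pulls.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # The decision-maker observes a random reward from the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

In this sketch the learner chooses and pulls each arm itself. The IOL setting described in the abstract removes exactly this ability: the pulls are carried out by self-interested agents whom the decision-maker must incentivize.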
