Sample Complexity of Incentivized Exploration

02/03/2020
by Mark Sellke, et al.

We consider incentivized exploration: a version of multi-armed bandits in which the choice of actions is controlled by self-interested agents, and the algorithm can only issue recommendations. The algorithm controls the flow of information, and the resulting information asymmetry can incentivize the agents to explore. Prior work matches the optimal regret rates for bandits up to "constant" multiplicative factors determined by the Bayesian prior; however, this dependence on the prior can be arbitrarily large, and the dependence on the number of arms K can be exponential, so the optimal dependence on both remains unclear. We make progress on these questions. Our first result is that Thompson sampling is incentive-compatible if initialized with sufficiently many data points. This reduces the design of incentive-compatible algorithms to a question of sample complexity: (i) how many data points are needed to incentivize Thompson sampling, and (ii) how many rounds does it take to collect these samples? We answer both questions, providing upper bounds on the sample complexity that are typically polynomial in K, and lower bounds that match up to polynomial factors.
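As a concrete illustration of the object studied here, the sketch below implements Thompson sampling warm-started with initial data, in the simplest case of Bernoulli rewards and independent Beta priors. The `pull` interface, the Beta(1, 1) priors, and the warm-start size of 5 samples per arm are illustrative assumptions, not values from the paper; the paper's results concern how large this warm-start must be, and how long it takes to collect, for the recommendations to be incentive-compatible.

```python
import random

def thompson_sampling(n_arms, n_rounds, pull, init_data=None, seed=0):
    """Beta-Bernoulli Thompson sampling, warm-started with initial samples.

    init_data[k] is a list of 0/1 rewards already observed for arm k.
    In the incentivized-exploration setting, these play the role of the
    data collected before Thompson sampling's recommendations become
    incentive-compatible.
    """
    rng = random.Random(seed)
    alpha = [1.0] * n_arms  # Beta(1, 1) prior on each arm's mean reward
    beta = [1.0] * n_arms
    for k, rewards in enumerate(init_data or []):
        alpha[k] += sum(rewards)
        beta[k] += len(rewards) - sum(rewards)

    for _ in range(n_rounds):
        # Sample a plausible mean for each arm from its posterior,
        # then recommend the arm with the highest sample.
        samples = [rng.betavariate(alpha[k], beta[k]) for k in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = pull(arm)  # agent follows the recommendation; reward in {0, 1}
        alpha[arm] += reward
        beta[arm] += 1 - reward
    return alpha, beta

# Illustrative usage: 3 Bernoulli arms, warm-started with 5 samples per arm.
true_means = [0.3, 0.5, 0.7]
env = random.Random(1)
pull = lambda k: int(env.random() < true_means[k])
init = [[pull(k) for _ in range(5)] for k in range(3)]
thompson_sampling(3, 1000, pull, init_data=init)
```

The intuition, per the abstract: with enough warm-start data, the posterior is concentrated enough that following the algorithm's recommendation is in each self-interested agent's own interest, and the paper quantifies how much data is "enough" in terms of K and the prior.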

