Thompson Sampling for Bandit Learning in Matching Markets

04/26/2022
by Fang Kong, et al.

The problem of two-sided matching markets has a wide range of real-world applications and has been extensively studied in the literature. A line of recent work has focused on the setting where the preferences of participants on one side of the market are unknown a priori and are learned through repeated interaction with participants on the other side. All of these works are based on explore-then-commit (ETC) and upper confidence bound (UCB) algorithms, two common strategies in multi-armed bandits (MAB). Thompson sampling (TS) is another popular approach, which has attracted considerable attention because it is easier to implement and performs better empirically. In many problems, even when UCB and ETC-type algorithms have already been analyzed, researchers continue to study TS for these benefits. However, the convergence analysis of TS is much more challenging and remains open in many problem settings. In this paper, we provide the first regret analysis for TS in the new setting of iterative matching markets. Extensive experiments demonstrate the practical advantages of the TS-type algorithm over the ETC and UCB-type baselines.
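For readers unfamiliar with Thompson sampling, the following is a minimal sketch of the standard single-player Beta-Bernoulli version, not the matching-market algorithm analyzed in the paper. The function name `thompson_sampling`, the Bernoulli reward model, the uniform Beta(1, 1) prior, and the example means are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Single-player Beta-Bernoulli Thompson sampling (illustrative only).

    true_means: hypothetical success probability of each arm.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    pulls = [0] * k

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the assumed true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.9], horizon=2000))
```

In this sketch, exploration comes entirely from the randomness of the posterior samples, so no explicit confidence bonus (as in UCB) or fixed exploration phase (as in ETC) is needed. The matching-market setting adds multiple players whose choices can conflict, which this sketch does not model.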


research
07/20/2023

Player-optimal Stable Regret for Bandit Learning in Matching Markets

The problem of matching markets has been studied for a long time in the ...
research
06/12/2019

Competing Bandits in Matching Markets

Stable matching, a classical model for two-sided markets, has long been ...
research
01/24/2023

Double Matching Under Complementary Preferences

In this paper, we propose a new algorithm for addressing the problem of ...
research
02/13/2021

Multi-Stage Decentralized Matching Markets: Uncertain Preferences and Strategic Behaviors

Matching markets are often organized in a multi-stage and decentralized ...
research
05/07/2022

Rate-Optimal Contextual Online Matching Bandit

Two-sided online matching platforms have been employed in various market...
research
10/29/2020

Learning Strategies in Decentralized Matching Markets under Uncertain Preferences

We study two-sided decentralized matching markets in which participants ...
research
07/06/2018

Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

The design of personalized incentives or recommendations to improve user...
