Decentralized Competing Bandits in Non-Stationary Matching Markets

05/31/2022
by   Avishek Ghosh, et al.

Understanding the complex dynamics of two-sided online matching markets, where demand-side agents compete to match with the supply side (arms), has recently received substantial interest. To that end, we introduce a framework for decentralized two-sided matching markets in non-stationary (dynamic) environments. We adhere to the serial dictatorship setting, in which the demand-side agents have unknown and differing preferences over the arms, while the arms have fixed and known preferences over the agents. We propose and analyze a decentralized and asynchronous learning algorithm, Decentralized Non-stationary Competing Bandits, in which the agents run (restrictive) successive-elimination-type learning algorithms to learn their preferences over the arms. The difficulty in understanding such a system stems from two facts: the competing bandits choose their actions asynchronously, and a lower-ranked agent can only learn from the set of arms not dominated by the higher-ranked agents, which leads to forced exploration. With carefully defined complexity parameters, we characterize this forced exploration and obtain sub-linear (logarithmic) regret. Furthermore, we validate our theoretical findings via experiments.
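To make the setting concrete, the following is a minimal sketch of two ingredients the abstract names: matching under serial dictatorship, and a single successive-elimination step. It is not the paper's algorithm. It is centralized and stationary, the function names `serial_dictatorship` and `eliminate` are hypothetical, and the confidence radius is a common textbook choice assumed here for illustration.

```python
import math


def serial_dictatorship(preferences):
    """Match agents to arms when every arm ranks the agents identically.

    preferences[i] lists agent i's arms, best first. Agent 0 is assumed to be
    the top-ranked agent, agent 1 the next, and so on (illustrative convention).
    """
    taken = set()
    matching = {}
    for agent, ranking in enumerate(preferences):
        # A lower-ranked agent only sees arms not claimed by higher-ranked agents.
        for arm in ranking:
            if arm not in taken:
                matching[agent] = arm
                taken.add(arm)
                break
    return matching


def eliminate(active, means, counts, t):
    """One successive-elimination step over the currently active arms.

    Drops every arm whose upper confidence bound falls below the best lower
    confidence bound. The radius sqrt(2 log t / n) is an assumed standard
    choice, not the one analyzed in the paper.
    """
    radius = {a: math.sqrt(2 * math.log(max(t, 2)) / counts[a]) for a in active}
    best_lcb = max(means[a] - radius[a] for a in active)
    return {a for a in active if means[a] + radius[a] >= best_lcb}


if __name__ == "__main__":
    # All three agents like arm 0 best; only the top-ranked agent gets it.
    print(serial_dictatorship([[0, 1, 2], [0, 2, 1], [0, 2, 1]]))
    # With many samples, the clearly worse arm is eliminated.
    print(eliminate({0, 1}, {0: 0.9, 1: 0.1}, {0: 1000, 1: 1000}, t=1000))
```

In the decentralized, non-stationary problem the abstract describes, each agent would have to infer which arms are dominated from its own observations and cope with changing rewards; the sketch deliberately leaves both aspects out.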


