Fully Gap-Dependent Bounds for Multinomial Logit Bandit

11/19/2020
by Jiaqi Yang, et al.

We study the multinomial logit (MNL) bandit problem, where at each time step the seller offers an assortment of size at most K from a pool of N items, and the buyer purchases an item from the assortment according to an MNL choice model. The objective is to learn the model parameters and maximize the expected revenue. We present (i) an algorithm that identifies the optimal assortment S^* within O(∑_{i=1}^N Δ_i^{-2}) time steps with high probability, and (ii) an algorithm that incurs O(∑_{i ∉ S^*} K Δ_i^{-1} log T) regret in T time steps. To our knowledge, our algorithms are the first to achieve gap-dependent bounds that fully depend on the suboptimality gaps of all items. Our technical contributions include an algorithmic framework that relates the MNL bandit problem to a variant of the top-K arm identification problem in multi-armed bandits, a generalized epoch-based offering procedure, and a layer-based adaptive estimation procedure.
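For intuition, here is a minimal Python sketch of the MNL choice model and the epoch-based offering idea mentioned above. The preference weights v, revenues r, and all function names are illustrative assumptions, not code from the paper.

import random

def mnl_purchase(assortment, v, rng=random):
    """Sample the buyer's choice from an assortment S under the MNL model.

    Item i in S is purchased with probability v[i] / (1 + sum_{j in S} v[j]);
    with the remaining probability 1 / (1 + sum_{j in S} v[j]) the buyer
    leaves without purchasing (returned as None).
    """
    total = 1.0 + sum(v[i] for i in assortment)
    u = rng.random() * total
    acc = 0.0
    for i in assortment:
        acc += v[i]
        if u < acc:
            return i
    return None  # the residual mass 1/total is the no-purchase option

def expected_revenue(assortment, v, r):
    """Expected revenue R(S) = sum_{i in S} r[i] v[i] / (1 + sum_{j in S} v[j])."""
    total = 1.0 + sum(v[i] for i in assortment)
    return sum(r[i] * v[i] for i in assortment) / total

def run_epoch(assortment, v, rng=random):
    """Offer the same assortment repeatedly until a no-purchase occurs.

    Under the MNL model, the number of purchases of item i within one epoch
    has mean exactly v[i], so epoch counts yield unbiased estimates of the
    preference weights without knowing the no-purchase probability.
    """
    counts = {i: 0 for i in assortment}
    while True:
        choice = mnl_purchase(assortment, v, rng)
        if choice is None:
            return counts
        counts[choice] += 1

# Example (hypothetical numbers): N = 5 items, assortment size K = 3.
v = {1: 0.5, 2: 0.3, 3: 0.8, 4: 0.2, 5: 0.1}  # preference weights
r = {1: 1.0, 2: 0.9, 3: 0.4, 4: 0.7, 5: 0.5}  # per-item revenues
S = [1, 2, 3]
print(expected_revenue(S, v, r))  # revenue the seller expects from S
print(run_epoch(S, v))            # one epoch's purchase counts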


Related research:

11/28/2020 · Improved Optimistic Algorithm For The Multinomial Logit Contextual Bandit
We consider a dynamic assortment selection problem where the goal is to ...

06/14/2022 · Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
This paper considers the multi-armed bandit (MAB) problem and provides a...

10/31/2019 · Recovering Bandits
We study the recovering bandits problem, a variant of the stochastic mul...

03/06/2020 · Contextual Blocking Bandits
We study a novel variant of the multi-armed bandit problem, where at eac...

11/15/2022 · On Penalization in Stochastic Multi-armed Bandits
We study an important variant of the stochastic multi-armed bandit (MAB)...

10/02/2018 · Thompson Sampling for Cascading Bandits
We design and analyze TS-Cascade, a Thompson sampling algorithm for the ...

09/19/2021 · Generalized Translation and Scale Invariant Online Algorithm for Adversarial Multi-Armed Bandits
We study the adversarial multi-armed bandit problem and create a complet...
