
Near-Optimal MNL Bandits Under Risk Criteria

09/26/2020 · by Guangyu Xi, et al.

We study MNL bandits, a variant of the traditional multi-armed bandit problem, under risk criteria. Unlike the ordinary expected revenue, risk criteria are more general objectives widely used in industry and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio, and entropic risk, and prove that they achieve near-optimal regret. As a complement, we also conduct experiments on both synthetic and real data to demonstrate the empirical performance of the proposed algorithms.
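
The abstract names three standard risk criteria. As a rough illustration only, the following is a minimal Python sketch of common empirical estimators of these criteria on sampled revenues; the CVaR level alpha, the zero risk-free rate in the Sharpe ratio, and the risk-aversion parameter gamma are illustrative assumptions, and the paper's exact conventions (e.g., which tail CVaR averages over) may differ.

```python
import numpy as np

def cvar(revenues, alpha=0.05):
    # Empirical conditional value-at-risk: mean of the worst alpha-fraction
    # of revenue samples (lower-tail, risk-averse convention; assumed here).
    x = np.sort(np.asarray(revenues))
    k = max(1, int(np.ceil(alpha * len(x))))
    return x[:k].mean()

def sharpe_ratio(revenues, risk_free=0.0):
    # Empirical Sharpe ratio: excess mean revenue per unit of standard
    # deviation (the risk-free rate of 0 is an assumption for illustration).
    x = np.asarray(revenues)
    return (x.mean() - risk_free) / x.std()

def entropic_risk(revenues, gamma=1.0):
    # Empirical entropic risk: -(1/gamma) * log E[exp(-gamma * X)], the
    # certainty equivalent of revenue X under exponential utility.
    x = np.asarray(revenues)
    return -np.log(np.mean(np.exp(-gamma * x))) / gamma

# Stand-in revenue draws for demonstration, not data from the paper.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.5, size=10_000)
print(cvar(samples), sharpe_ratio(samples), entropic_risk(samples))
```

For Gaussian revenues the entropic risk reduces to the mean minus (gamma/2) times the variance, which is why it is often read as a variance-penalized objective.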

Related research

06/04/2018 · A General Approach to Multi-Armed Bandits Under Risk Criteria
Different risk-related criteria have received recent interest in learnin...

04/12/2019 · Distributed Bandit Learning: How Much Communication is Needed to Achieve (Near) Optimal Regret
We study the communication complexity of distributed multi-armed bandits...

08/15/2021 · Batched Thompson Sampling for Multi-Armed Bandits
We study Thompson Sampling algorithms for stochastic multi-armed bandits...

12/14/2015 · Fighting Bandits with a New Kind of Smoothness
We define a novel family of algorithms for the adversarial multi-armed b...

11/16/2020 · Risk-Constrained Thompson Sampling for CVaR Bandits
The multi-armed bandit (MAB) problem is a ubiquitous decision-making pro...

08/25/2021 · A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits
This paper unifies the design and simplifies the analysis of risk-averse...

08/09/2021 · Whittle Index for A Class of Restless Bandits with Imperfect Observations
We consider a class of restless bandit problems that finds a broad appli...