Regret vs. Bandwidth Trade-off for Recommendation Systems

10/15/2018
by Linqi Song, et al.

We consider recommendation systems that must operate under wireless bandwidth constraints, measured as the number of broadcast transmissions, and demonstrate a trade-off between regret and bandwidth (tight for some instances) in two scenarios: the multi-armed bandit with context, and the case where a latent structure in the message space can be exploited to shorten the learning phase.

research
05/26/2019

Phase Transitions and Cyclic Phenomena in Bandits with Switching Constraints

We consider the classical stochastic multi-armed bandit problem with a c...
research
01/26/2023

Collaborative Regret Minimization in Multi-Armed Bandits

In this paper, we study the collaborative learning model, which concerns...
research
01/31/2019

Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model

Contextual multi-armed bandit (MAB) algorithms have been shown promising...
research
02/17/2015

Regret bounds for Narendra-Shapiro bandit algorithms

Narendra-Shapiro (NS) algorithms are bandit-type algorithms that have be...
research
06/05/2019

Measurement-based Online Available Bandwidth Estimation employing Reinforcement Learning

An accurate and fast estimation of the available bandwidth in a network ...
research
07/10/2018

Bandits with Side Observations: Bounded vs. Logarithmic Regret

We consider the classical stochastic multi-armed bandit but where, from ...
research
07/31/2018

Graph-Based Recommendation System

In this work, we study recommendation systems modelled as contextual mul...
