Stochastic Rank-1 Bandits

08/10/2016
by Sumeet Katariya, et al.

We propose stochastic rank-1 bandits, a class of online learning problems where at each step a learning agent chooses a pair of row and column arms and receives the product of their values as a reward. The main challenge is that the individual values of the row and column are unobserved. We assume that these values are stochastic and drawn independently. We propose a computationally efficient algorithm for this problem, which we call Rank1Elim. We derive an O((K + L) (1 / Δ) log n) upper bound on its n-step regret, where K is the number of rows, L is the number of columns, and Δ is the minimum of the row and column gaps, under the assumption that the mean row and column rewards are bounded away from zero. To the best of our knowledge, this is the first bandit algorithm that finds the maximum entry of a rank-1 matrix whose regret is linear in K + L, 1 / Δ, and log n. We also derive a nearly matching lower bound. Finally, we evaluate Rank1Elim empirically on multiple problems. We observe that it leverages the structure of our problems and can learn near-optimal solutions even if our modeling assumptions are mildly violated.
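To make the problem setting concrete, the following is a minimal sketch of a rank-1 bandit environment. It is not Rank1Elim and is not taken from the paper. It assumes Bernoulli row and column values (the abstract only says the values are stochastic and independent), and the names `Rank1Env`, `pull`, and `uniform_policy_regret` are hypothetical. The uniformly random policy is only a baseline for illustrating how n-step regret is accumulated.

```python
import random


class Rank1Env:
    """Toy rank-1 bandit: the reward is the product of hidden row and column values.

    Assumption: values are Bernoulli with the given means (illustrative choice).
    """

    def __init__(self, row_means, col_means, seed=0):
        self.row_means = list(row_means)  # K mean row values
        self.col_means = list(col_means)  # L mean column values
        self.rng = random.Random(seed)

    def pull(self, i, j):
        # Draw the row and column values independently; only their product
        # is returned, so the individual values stay unobserved.
        u = 1 if self.rng.random() < self.row_means[i] else 0
        v = 1 if self.rng.random() < self.col_means[j] else 0
        return u * v

    def expected_reward(self, i, j):
        # Independence gives E[u * v] = E[u] * E[v], a rank-1 mean matrix.
        return self.row_means[i] * self.col_means[j]

    def best_pair(self):
        # With nonnegative means, the best entry pairs the best row
        # with the best column.
        i = max(range(len(self.row_means)), key=lambda k: self.row_means[k])
        j = max(range(len(self.col_means)), key=lambda k: self.col_means[k])
        return i, j


def uniform_policy_regret(env, n, seed=1):
    """Expected n-step regret of a uniformly random policy (baseline only)."""
    rng = random.Random(seed)
    best = env.expected_reward(*env.best_pair())
    regret = 0.0
    for _ in range(n):
        i = rng.randrange(len(env.row_means))
        j = rng.randrange(len(env.col_means))
        env.pull(i, j)  # the learner would observe only this product
        regret += best - env.expected_reward(i, j)
    return regret


env = Rank1Env(row_means=[0.9, 0.5], col_means=[0.8, 0.4])
print(env.best_pair(), uniform_policy_regret(env, n=1000))
```

The regret of this baseline grows linearly in n because it never stops choosing suboptimal pairs, whereas the bound stated in the abstract grows only logarithmically in n.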

