Sequential Experimental Design for Transductive Linear Bandits

06/20/2019
by Tanner Fiez, et al.

In this paper we introduce the transductive linear bandit problem: given a set of measurement vectors X⊂R^d, a set of items Z⊂R^d, a fixed confidence δ, and an unknown vector θ^∗∈R^d, the goal is to infer argmax_{z∈Z} z^⊤θ^∗ with probability at least 1-δ by making as few sequentially chosen noisy measurements of the form x^⊤θ^∗, with x∈X, as possible. When X=Z, this setting generalizes linear bandits, and when X is the set of standard basis vectors and Z⊂{0,1}^d, it generalizes combinatorial bandits. Such a transductive setting naturally arises when the set of measurement vectors is limited by factors such as availability or cost. As an example, in drug discovery the compounds and dosages X that a practitioner may be willing to evaluate in the lab in vitro, for cost or safety reasons, may differ vastly from the compounds and dosages Z that can be safely administered to patients in vivo. Alternatively, in recommender systems for books, the set of books X a user is queried about may be restricted to well-known best-sellers even though the goal might be to recommend more esoteric titles Z. In this paper, we provide instance-dependent lower bounds for the transductive setting, an algorithm that matches these up to logarithmic factors, and an empirical evaluation. In particular, we provide the first non-asymptotic algorithm for linear bandits that nearly achieves the information-theoretic lower bound.
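The problem setup in the abstract can be illustrated with a minimal sketch of the special case where X is the set of standard basis vectors and Z⊂{0,1}^d. This is not the algorithm proposed in the paper: it is a naive fixed-budget baseline that measures every basis vector equally often and then picks the best item under the estimated θ, with no 1-δ guarantee. The Gaussian noise model, the function names (`noisy_measurement`, `best_item`, `uniform_baseline`), and the example values of θ^∗ and Z are all assumptions made for illustration.

```python
import random


def noisy_measurement(x, theta_star, rng, noise_sd=1.0):
    """Return x^T theta_star plus Gaussian noise (the noise model is an assumption)."""
    signal = sum(xi * ti for xi, ti in zip(x, theta_star))
    return signal + rng.gauss(0.0, noise_sd)


def best_item(Z, theta):
    """Return the index of argmax over z in Z of z^T theta."""
    scores = [sum(zi * ti for zi, ti in zip(z, theta)) for z in Z]
    return max(range(len(Z)), key=scores.__getitem__)


def uniform_baseline(Z, theta_star, pulls_per_arm, seed=0):
    """Naive baseline: measure each standard basis vector equally often.

    With X equal to the standard basis, the least-squares estimate of
    theta_star is just the per-coordinate mean of the observations.
    """
    rng = random.Random(seed)
    d = len(theta_star)
    # Measurement set X: the d standard basis vectors.
    X = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    theta_hat = []
    for x in X:
        obs = [noisy_measurement(x, theta_star, rng) for _ in range(pulls_per_arm)]
        theta_hat.append(sum(obs) / pulls_per_arm)
    return best_item(Z, theta_hat), theta_hat


# Hypothetical instance: theta_star is unknown to the learner in practice.
theta_star = [1.0, 0.5, -0.2]
Z = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]  # items in {0,1}^d
chosen, theta_hat = uniform_baseline(Z, theta_star, pulls_per_arm=2000)
print("chosen item index:", chosen)
```

A uniform allocation like this ignores which directions actually matter for separating the items in Z. Choosing the measurements sequentially, so that effort goes where it best distinguishes the candidate items, is the design question the paper addresses.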


research
11/02/2021

Nearly Optimal Algorithms for Level Set Estimation

The level set estimation problem seeks to find all points in a domain X ...
research
06/13/2020

Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners

We study the problem of best arm identification in linearly parameterise...
research
06/15/2023

Logarithmic Bayes Regret Bounds

We derive the first finite-time logarithmic regret bounds for Bayesian b...
research
03/29/2022

Nearly Minimax Algorithms for Linear Bandits with Shared Representation

We give novel algorithms for multi-task and lifelong linear bandits with...
research
09/01/2023

Interactive and Concentrated Differential Privacy for Bandits

Bandits play a crucial role in interactive learning schemes and modern r...
research
05/21/2021

Parallelizing Contextual Linear Bandits

Standard approaches to decision-making under uncertainty focus on sequen...
research
03/18/2021

Top-m identification for linear bandits

Motivated by an application to drug repurposing, we propose the first al...
