Approximate Function Evaluation via Multi-Armed Bandits

03/18/2022
by Tavor Z. Baharav, et al.

We study the problem of estimating the value of a known smooth function f at an unknown point μ∈ℝ^n, where each component μ_i can be sampled via a noisy oracle. Sampling more frequently those components of μ that correspond to directions in which f has larger directional derivatives is more sample-efficient. However, as μ is unknown, the optimal sampling frequencies are also unknown. We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate and, with probability at least 1-δ, returns an ϵ-accurate estimate of f(μ). We generalize our algorithm to adapt to heteroskedastic noise, and we prove asymptotic optimality when f is linear. We corroborate our theoretical results with numerical experiments, demonstrating the dramatic gains afforded by adaptivity.
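To make the core idea concrete, here is a minimal Python sketch of gradient-weighted adaptive coordinate sampling: allocate more oracle queries to coordinates along which f changes fastest, as estimated from the samples gathered so far. This is an illustration, not the authors' algorithm; the oracle interface, the names estimate_f and grad_f, and all parameter choices are assumptions made for this example.

```python
import numpy as np

def estimate_f(f, grad_f, oracle, n, sigma, rounds=50, batch=20, rng=None):
    """Estimate f(mu) by adaptively sampling noisy coordinate oracles."""
    rng = rng if rng is not None else np.random.default_rng()
    sums = np.zeros(n)    # running sum of oracle samples per coordinate
    counts = np.zeros(n)  # number of samples drawn per coordinate

    # Warm start: query every coordinate once so all empirical means exist.
    for i in range(n):
        sums[i] += oracle(i)
        counts[i] += 1

    for _ in range(rounds):
        mu_hat = sums / counts
        # Weight coordinate i by sigma_i * |∂f/∂μ_i| at the current estimate:
        # directions with larger directional derivatives (and noisier
        # oracles) deserve more samples.
        weights = sigma * np.abs(grad_f(mu_hat)) + 1e-12
        probs = weights / weights.sum()
        # Spend the next batch of oracle calls according to these weights.
        for i in rng.choice(n, size=batch, p=probs):
            sums[i] += oracle(i)
            counts[i] += 1

    return f(sums / counts)

# Example: linear f with one dominant coordinate, homoskedastic noise.
rng = np.random.default_rng(0)
w = np.array([10.0, 0.1, 0.1, 0.1])
mu = np.array([1.0, 2.0, 3.0, 4.0])
sigma = np.ones(4)
oracle = lambda i: mu[i] + sigma[i] * rng.normal()
print(estimate_f(lambda x: x @ w, lambda x: w, oracle, n=4, sigma=sigma, rng=rng))
```

In the linear case f(μ) = w·μ the gradient is constant, so the allocation immediately concentrates in proportion to σ_i|w_i|; this matches the intuition behind the paper's asymptotic-optimality claim for linear f, where the heavily weighted first coordinate absorbs most of the sampling budget.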

