Nearest Neighbour with Bandit Feedback

06/23/2023

In this paper we adapt the nearest neighbour rule to the contextual bandit problem. Our algorithm handles the fully adversarial setting, in which no assumptions at all are made about the data-generation process. When combined with a sufficiently fast data structure for (perhaps approximate) adaptive nearest neighbour search, such as a navigating net, our algorithm is extremely efficient: its per-trial running time is polylogarithmic in both the number of trials and the number of actions, and it takes only quasi-linear space.
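To make the setting concrete, here is a minimal sketch of a naive nearest-neighbour contextual bandit. It is not the paper's algorithm and carries none of its adversarial guarantees or efficiency: it is an illustrative epsilon-greedy baseline that uses a linear scan in place of a navigating net, so each trial costs time linear in the number of past trials. The class name `NaiveNNBandit`, the `epsilon` parameter, the 0.5 reward threshold, and the assumption that rewards lie in [0, 1] are all hypothetical choices made for this example.

```python
import math
import random


class NaiveNNBandit:
    """Illustrative baseline only; NOT the algorithm from the paper."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.epsilon = epsilon          # exploration probability (assumed)
        self.rng = random.Random(seed)
        self.history = []               # (context, action, reward) per trial

    def nearest(self, context):
        """Linear-scan nearest neighbour; a navigating net would replace this."""
        if not self.history:
            return None
        return min(self.history, key=lambda rec: math.dist(rec[0], context))

    def select(self, context):
        """Choose an action for the current context."""
        rec = self.nearest(context)
        if rec is None or self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        _, action, reward = rec
        if reward > 0.5:                # assumed threshold, rewards in [0, 1]
            return action               # copy the neighbour's successful action
        # The neighbour's action did poorly: try one of the other actions.
        others = [a for a in range(self.num_actions) if a != action]
        return self.rng.choice(others) if others else action

    def update(self, context, action, reward):
        """Bandit feedback: only the chosen action's reward is observed."""
        self.history.append((tuple(context), action, reward))
```

In each trial the learner calls `select` on the context, plays the returned action, and passes the single observed reward to `update`; rewards of the unchosen actions are never seen, which is what distinguishes bandit feedback from full supervision.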
