Nearest Neighbour with Bandit Feedback

06/23/2023
by Stephen Pasteris, et al.

In this paper we adapt the nearest neighbour rule to the contextual bandit problem. Our algorithm handles the fully adversarial setting, in which no assumptions at all are made about the data-generation process. When combined with a sufficiently fast data structure for (perhaps approximate) adaptive nearest neighbour search, such as a navigating net, our algorithm is extremely efficient: its per-trial running time is polylogarithmic in both the number of trials and the number of actions, and it takes only quasi-linear space.
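
To make the setting concrete, the following is a minimal sketch of a nearest-neighbour rule under bandit feedback, where only the reward of the chosen action is observed on each trial. It is not the algorithm of the paper: the class name `NaiveNNBandit`, the epsilon-greedy exploration, the Euclidean metric, and the linear-scan search are all assumptions made for illustration, and the sketch carries no guarantee in the adversarial setting.

```python
import math
import random


class NaiveNNBandit:
    """Illustrative only (hypothetical, not the paper's method):
    1-nearest-neighbour reward estimates with epsilon-greedy exploration."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # One list of (context, reward) pairs per action.
        # Bandit feedback: only the action actually played gets a new entry.
        self.history = [[] for _ in range(num_actions)]

    def _estimate(self, action, context):
        """Reward seen at the stored context nearest to `context` for this action."""
        records = self.history[action]
        if not records:
            # Untried actions look optimistic, so each is tried at least once.
            return math.inf
        # Linear scan; a fast nearest neighbour structure would replace this step.
        nearest = min(records, key=lambda rec: math.dist(rec[0], context))
        return nearest[1]

    def select(self, context):
        """Pick an action for `context`: explore with probability epsilon."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        estimates = [self._estimate(a, context) for a in range(self.num_actions)]
        return max(range(self.num_actions), key=lambda a: estimates[a])

    def update(self, context, action, reward):
        """Record the reward observed for the action that was played."""
        self.history[action].append((tuple(context), reward))


if __name__ == "__main__":
    bandit = NaiveNNBandit(num_actions=2, epsilon=0.0)
    bandit.update((0.0, 0.0), 0, 1.0)
    bandit.update((0.0, 0.0), 1, 0.0)
    bandit.update((10.0, 10.0), 0, 0.0)
    bandit.update((10.0, 10.0), 1, 1.0)
    print(bandit.select((1.0, 1.0)), bandit.select((9.0, 9.0)))
```

The linear scan in `_estimate` costs time proportional to the number of past trials, which is the cost a structure such as a navigating net is meant to avoid.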
