Active Search for High Recall: a Non-Stationary Extension of Thompson Sampling

12/27/2017
by Jean-Michel Renders, et al.

We consider the problem of Active Search, where as many relevant objects as possible (ideally all of them) should be retrieved with minimum effort or in minimum time. Two main challenges typically arise when tackling this problem: first, the class of relevant objects often has low prevalence; second, this class can be multi-faceted or multi-modal, meaning that objects can be relevant for completely different reasons. To address these issues, we propose an approach based on a non-stationary (aka restless) extension of Thompson Sampling, a well-known strategy for Multi-Armed Bandit problems. The collection is first soft-clustered into a finite set of components, and the posterior distribution of the probability of getting a relevant object inside each cluster is updated after receiving user feedback on the proposed instances. The "next instance" selection strategy is a mixed, two-level decision process in which both the soft clusters and their instances are considered. This method can be viewed as a form of insurance: the premium is an extra exploration effort in the short run, paid in exchange for reaching nearly "total" recall with less effort in the long run.
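The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Beta-Bernoulli posterior per soft cluster and uses exponential discounting of past feedback as one common way to make Thompson Sampling non-stationary; the paper's actual restless extension may differ. All names (`discounted_thompson_search`, `memberships`, `is_relevant`, `gamma`) are hypothetical.

```python
import random


def discounted_thompson_search(memberships, is_relevant, budget, gamma=0.95, seed=0):
    """Two-level selection sketch: sample a cluster, then pick one of its instances.

    memberships[i][k] is the (assumed) soft membership of item i in cluster k.
    is_relevant(i) stands in for the user's feedback on item i.
    gamma < 1 discounts old feedback (an assumption, not taken from the paper).
    """
    rng = random.Random(seed)
    n_clusters = len(memberships[0])
    # Beta(1, 1) prior on the relevance rate of each cluster.
    alpha = [1.0] * n_clusters
    beta = [1.0] * n_clusters
    unseen = set(range(len(memberships)))
    found = []

    for _ in range(min(budget, len(memberships))):
        # Level 1: draw one relevance rate per cluster from its posterior
        # and select the cluster with the largest draw.
        draws = [rng.betavariate(alpha[k], beta[k]) for k in range(n_clusters)]
        k_star = max(range(n_clusters), key=draws.__getitem__)

        # Level 2: within that cluster, propose the unseen item with the
        # highest membership (ties broken by the lowest index).
        item = max(unseen, key=lambda i: (memberships[i][k_star], -i))
        unseen.remove(item)

        y = 1.0 if is_relevant(item) else 0.0
        if y:
            found.append(item)

        # Shrink accumulated evidence toward the prior, then credit the new
        # feedback to every cluster in proportion to the item's membership.
        for k in range(n_clusters):
            w = memberships[item][k]
            alpha[k] = 1.0 + gamma * (alpha[k] - 1.0) + w * y
            beta[k] = 1.0 + gamma * (beta[k] - 1.0) + w * (1.0 - y)

    return found
```

Calling `discounted_thompson_search(memberships, lambda i: i in relevant_ids, budget=50)` would return the relevant items found within 50 proposals, where `relevant_ids` is a hypothetical set of ground-truth labels. Discounting lets a cluster's estimated relevance rate fall once its relevant items have been exhausted, which is the kind of drift a stationary posterior would track only slowly.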


research
09/28/2021

Risk averse non-stationary multi-armed bandits

This paper tackles the risk averse multi-armed bandits problem when incu...
research
07/09/2020

Recurrent Neural-Linear Posterior Sampling for Non-Stationary Contextual Bandits

An agent in a non-stationary contextual bandit problem should balance be...
research
02/22/2019

Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes

A survey is performed of various Multi-Armed Bandit (MAB) strategies in ...
research
09/18/2023

Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits

Multi-task learning (MTL) aims to improve the performance of a primary t...
research
06/30/2023

Thompson sampling for improved exploration in GFlowNets

Generative flow networks (GFlowNets) are amortized variational inference...
research
03/29/2017

Bandit-Based Model Selection for Deformable Object Manipulation

We present a novel approach to deformable object manipulation that does ...
research
01/02/2020

Multi-Armed Bandits for Decentralized AP selection in Enterprise WLANs

WiFi densification leads to the existence of multiple overlapping covera...
