Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

We consider the query recommendation problem in closed-loop interactive learning settings such as online information gathering and exploratory analytics. The problem can be naturally modelled using the Multi-Armed Bandits (MAB) framework with countably many arms. Standard MAB algorithms for countably many arms begin by selecting a random set of candidate arms and then applying a standard MAB algorithm, e.g., UCB, on this candidate set downstream. We show that such a selection strategy often results in higher cumulative regret, and to address this we propose a selection strategy based on the maximum utility of the arms. We show that in tasks like online information gathering, where sequential query recommendations are employed, the sequences of queries are correlated, and the number of potentially optimal queries can be reduced to a manageable size by selecting the queries with maximum utility with respect to the currently executing query. Our experimental results on a log file from a recent real online literature discovery service demonstrate that the proposed arm selection strategy substantially improves the cumulative regret, with respect to both the state-of-the-art baseline algorithms and the standard random selection strategy, for a variety of contextual multi-armed bandit algorithms. Our data model and source code are available at <https://anonymous.4open.science/r/0e5ad6b7-ac02-4577-9212-c9d505d3dbdb/>.
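
As a rough illustrative sketch only (not the authors' released implementation), the snippet below shows the candidate-selection idea described above: instead of drawing a random subset of the countably many arms (queries), keep the arms with the highest utility relative to the currently executing query and run standard UCB1 on that candidate set. The cosine-similarity `utility`, the candidate size `k`, and the `pull` reward callback are hypothetical placeholders.

```python
import numpy as np

def utility(current_query_vec, query_vecs):
    """Cosine similarity of each candidate query to the current query (placeholder utility)."""
    q = current_query_vec / (np.linalg.norm(current_query_vec) + 1e-12)
    Q = query_vecs / (np.linalg.norm(query_vecs, axis=1, keepdims=True) + 1e-12)
    return Q @ q

def select_candidates(current_query_vec, query_vecs, k=20):
    """Max-utility selection: keep the top-k arms by utility w.r.t. the current query."""
    scores = utility(current_query_vec, query_vecs)
    return np.argsort(scores)[-k:]

def ucb1(candidate_ids, pull, horizon=1000, c=2.0):
    """Standard UCB1 over the fixed candidate set; pull(arm_id) returns an observed reward."""
    k = len(candidate_ids)
    counts = np.zeros(k)
    means = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:                          # play each candidate once to initialise estimates
            i = t - 1
        else:                               # then pick the arm with the largest UCB index
            bonus = np.sqrt(c * np.log(t) / counts)
            i = int(np.argmax(means + bonus))
        r = pull(candidate_ids[i])
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]
    return candidate_ids[int(np.argmax(means))]
```

The random-selection baseline discussed in the abstract would simply replace `select_candidates` with a uniformly random subset of the arm pool, leaving the downstream bandit algorithm unchanged.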
