Online Algorithm for Unsupervised Sequential Selection with Contextual Information

10/23/2020
by Arun Verma, et al.

In this paper, we study Contextual Unsupervised Sequential Selection (USS), a new variant of the stochastic contextual bandits problem in which the loss of an arm cannot be inferred from the observed feedback. In our setup, the arms are ordered, forming a cascade, and each arm has a fixed cost. In each round, a context is presented and the learner selects arms sequentially up to some depth. The total cost incurred by stopping at an arm is the sum of the fixed costs of the arms selected and the stochastic loss associated with that arm. The learner's goal is to learn a decision rule that maps contexts to arms so as to minimize the total expected loss. The problem is challenging because the setting is unsupervised: the total loss cannot be estimated. Learning is therefore feasible only if the optimal arm can be inferred (explicitly or implicitly) from the problem structure. We observe that learning is still possible when the problem instance satisfies the so-called 'Contextual Weak Dominance' (CWD) property. Under CWD, we propose an algorithm for the contextual USS problem and demonstrate that it has sub-linear regret. Experiments on synthetic and real datasets validate our algorithm.
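
The cost structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an oracle that already knows each arm's expected loss for a given context, which is exactly the information the learner in USS cannot observe. The function names and the numbers are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def expected_total_costs(fixed_costs, expected_losses):
    """Expected total cost of stopping at each arm of the cascade.

    Stopping at arm i pays the fixed costs of arms 0..i (all arms
    selected so far) plus the expected loss of arm i itself.
    """
    if len(fixed_costs) != len(expected_losses):
        raise ValueError("need one fixed cost and one expected loss per arm")
    cumulative_costs = accumulate(fixed_costs)
    return [c + g for c, g in zip(cumulative_costs, expected_losses)]


def oracle_best_arm(fixed_costs, expected_losses):
    """Index of the arm with the smallest expected total cost.

    Oracle view only: in USS the expected losses are not observable,
    so a learner cannot evaluate this rule directly.
    """
    totals = expected_total_costs(fixed_costs, expected_losses)
    return min(range(len(totals)), key=totals.__getitem__)


# Hypothetical values for one context (assumed, not from the paper).
fixed_costs = [0.0, 0.25, 0.5]
expected_losses = [0.5, 0.125, 0.0625]

totals = expected_total_costs(fixed_costs, expected_losses)
best = oracle_best_arm(fixed_costs, expected_losses)
```

In this toy instance, going one step deeper in the cascade lowers the expected loss by more than the added fixed cost, while going two steps deeper does not, so the oracle stops at the middle arm.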

research
09/16/2020

Thompson Sampling for Unsupervised Sequential Selection

Thompson Sampling has generated significant interest due to its better e...
research
02/11/2020

Online Preselection with Context Information under the Plackett-Luce Model

We consider an extension of the contextual multi-armed bandit problem, i...
research
10/17/2016

Risk-Aware Algorithms for Adversarial Contextual Bandits

In this work we consider adversarial contextual bandits with risk constr...
research
12/22/2022

Sequential Decision Problems with Weak Feedback

This thesis considers sequential decision problems, where the loss/rewar...
research
12/22/2022

Synopsis: Sequential Decision Problems with Weak Feedback

This thesis considers sequential decision problems, where the loss/rewar...
research
01/15/2019

Online Algorithm for Unsupervised Sensor Selection

In many security and healthcare systems, the detection and diagnosis sys...
research
04/09/2018

Contextual Search via Intrinsic Volumes

We study the problem of contextual search, a multidimensional generaliza...
