Adaptive Representation Selection in Contextual Bandit with Unlabeled History

02/03/2018
by   Baihan Lin, et al.

We consider an extension of the contextual bandit setting, motivated by several practical applications, in which an unlabeled history of contexts becomes available for pre-training before online decision-making begins. We propose an approach for improving the performance of a contextual bandit in such a setting via adaptive, dynamic representation learning, which combines offline pre-training on the unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit.
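The abstract describes two components: offline pre-training of embedding functions on unlabeled contexts, and online selection among candidate representations while a bandit learns. The paper's exact algorithm is not reproduced below; this is a minimal sketch of the general idea, assuming PCA as the pre-trained embedding, disjoint LinUCB as the base bandit, and epsilon-greedy selection among representations. All class and function names (`LinUCB`, `fit_pca`, `AdaptiveRepBandit`) are illustrative, not from the paper.

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB on top of a fixed context embedding (one ridge model per arm)."""
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward-weighted sums

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            # mean estimate + exploration bonus (confidence width)
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

def fit_pca(history, k):
    """Offline pre-training: a k-dimensional PCA embedding of the unlabeled context history."""
    mu = history.mean(axis=0)
    _, _, vt = np.linalg.svd(history - mu, full_matrices=False)
    return lambda x: vt[:k] @ (x - mu)

class AdaptiveRepBandit:
    """Online selection among candidate embeddings, each paired with its own LinUCB learner."""
    def __init__(self, embeddings, dims, n_arms, eps=0.1, seed=0):
        self.embeddings = embeddings
        self.learners = [LinUCB(n_arms, d) for d in dims]
        self.mean_r = np.zeros(len(embeddings))   # running mean reward per representation
        self.counts = np.zeros(len(embeddings))
        self.eps = eps
        self.rng = np.random.default_rng(seed)

    def act(self, context):
        # epsilon-greedy over representations; the chosen learner then picks the arm
        if self.rng.random() < self.eps:
            rep = int(self.rng.integers(len(self.embeddings)))
        else:
            rep = int(np.argmax(self.mean_r))
        arm = self.learners[rep].select(self.embeddings[rep](context))
        return rep, arm

    def update(self, rep, arm, context, reward):
        self.learners[rep].update(arm, self.embeddings[rep](context), reward)
        self.counts[rep] += 1
        self.mean_r[rep] += (reward - self.mean_r[rep]) / self.counts[rep]
```

A typical run would fit `fit_pca` on the unlabeled history, pass the resulting embedding alongside an identity map (raw contexts) to `AdaptiveRepBandit`, and let the running mean rewards decide which representation dominates; the epsilon term keeps re-checking alternatives, which is what allows adaptation in nonstationary environments.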


Related research

- 09/17/2020, Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward: "We considered a novel practical problem of online learning with episodic..."
- 07/13/2020, Contextual Bandit with Missing Rewards: "We consider a novel variant of the contextual bandit problem (i.e., the ..."
- 01/23/2019, Meta-Learning for Contextual Bandit Exploration: "We describe MELEE, a meta-learning algorithm for learning a good explora..."
- 03/17/2021, Homomorphically Encrypted Linear Contextual Bandit: "Contextual bandit is a general framework for online learning in sequenti..."
- 11/13/2019, Context-aware Dynamic Assets Selection for Online Portfolio Selection based on Contextual Bandit: "Online portfolio selection is a sequential decision-making problem in fi..."
- 06/14/2021, Bandit Modeling of Map Selection in Counter-Strike: Global Offensive: "Many esports use a pick and ban process to define the parameters of a ma..."
- 02/04/2023, Reinforcement Learning with History-Dependent Dynamic Contexts: "We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a no..."
