AutoML for Contextual Bandits

09/07/2019
by Praneet Dutta, et al.

Contextual bandits are a widely used technique in applications such as personalization, recommendation systems, mobile health, and causal marketing. As a dynamic approach, they can be more efficient than standard A/B testing at minimizing regret. We propose an end-to-end automated meta-learning pipeline to approximate the optimal Q function for contextual bandit problems. Our model performs much better than random exploration: it is more regret-efficient and converges with a limited number of samples, while remaining general and easy to use thanks to the meta-learning approach. We used a linearly annealed ε-greedy exploration policy to define the exploration-vs-exploitation schedule. We tested the system on a synthetic environment to characterize it fully, and we evaluated it on several open-source datasets to benchmark against prior work. Our model outperforms, or performs comparably to, other models while requiring no tuning or feature engineering.
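
To make the exploration schedule concrete, here is a minimal sketch of a contextual bandit loop with a linearly annealed ε-greedy policy and a simple Q-function approximator. This is not the paper's pipeline: the environment, the linear Q model, and all hyperparameter values (`eps_start`, `eps_end`, `n_steps`, etc.) are illustrative assumptions.

```python
# Minimal sketch: linearly annealed epsilon-greedy contextual bandit.
# Environment, model, and hyperparameters are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

n_arms, context_dim, n_steps = 4, 8, 5000
eps_start, eps_end = 1.0, 0.05          # assumed annealing endpoints

# Linear Q-function: one weight vector per arm (stand-in for a learned model).
weights = np.zeros((n_arms, context_dim))
counts = np.ones(n_arms)                # per-arm step-size normalizer

# Hypothetical synthetic environment: hidden linear reward per arm plus noise.
true_theta = rng.normal(size=(n_arms, context_dim))

def epsilon(step):
    """Linearly anneal exploration from eps_start to eps_end over n_steps."""
    frac = min(step / n_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

total_regret = 0.0
for t in range(n_steps):
    context = rng.normal(size=context_dim)
    q_values = weights @ context

    # Epsilon-greedy action selection under the annealed schedule.
    if rng.random() < epsilon(t):
        arm = int(rng.integers(n_arms))
    else:
        arm = int(np.argmax(q_values))

    expected = true_theta @ context
    reward = expected[arm] + rng.normal(scale=0.1)
    total_regret += expected.max() - expected[arm]

    # Incremental update of the chosen arm's weights toward the observed reward.
    lr = 1.0 / counts[arm]
    weights[arm] += lr * (reward - q_values[arm]) * context
    counts[arm] += 1

print(f"cumulative regret after {n_steps} steps: {total_regret:.1f}")
```

As exploration anneals toward `eps_end`, the loop increasingly exploits the current Q estimates, which is the exploration-vs-exploitation trade-off the schedule is meant to control.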
