Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

01/24/2019
by Tom Zahavy, et al.

We study the neural-linear bandit model for solving sequential decision-making problems with high-dimensional side information. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, applied on top of the last hidden layer. Because the representation is optimized during learning, information about exploration gathered with "old" features is lost, a phenomenon we term catastrophic forgetting. Here, we propose the first limited-memory neural-linear bandit that is resilient to it. We evaluate our method on a variety of real-world data sets, including regression, classification, and sentiment analysis, and observe that our algorithm is resilient to catastrophic forgetting and achieves superior performance.
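The idea of running a linear contextual bandit on top of the last hidden layer can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the abstract: a fixed tanh layer (`features`) stands in for the trained network, a LinUCB-style rule is used as one well-known linear exploration mechanism (the abstract does not say which rule the authors use), and the names `LinUCBArm` and `select_arm` are hypothetical. It does not implement likelihood matching or limited memory; it only shows why the per-arm statistics are tied to a particular feature map, and would become stale if that map changed.

```python
import math


def features(x, weights):
    """Stand-in for the last hidden layer: one fixed tanh layer (assumption)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


class LinUCBArm:
    """Ridge-regression estimate and confidence width for a single arm."""

    def __init__(self, dim, ridge=1.0):
        # Inverse of A = ridge * I + sum(phi phi^T), maintained directly.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * phi

    def _a_inv_times(self, v):
        return [sum(a * vi for a, vi in zip(row, v)) for row in self.a_inv]

    def ucb(self, phi, alpha):
        """Estimated reward plus alpha times the confidence width."""
        theta = self._a_inv_times(self.b)        # ridge estimate A^{-1} b
        a_inv_phi = self._a_inv_times(phi)
        mean = sum(t * p for t, p in zip(theta, phi))
        width = math.sqrt(max(0.0, sum(p * q for p, q in zip(phi, a_inv_phi))))
        return mean + alpha * width

    def update(self, phi, reward):
        """Sherman-Morrison update of A^{-1} after adding phi phi^T to A."""
        u = self._a_inv_times(phi)
        denom = 1.0 + sum(p * q for p, q in zip(phi, u))
        for i in range(len(phi)):
            for j in range(len(phi)):
                self.a_inv[i][j] -= u[i] * u[j] / denom
        self.b = [bi + reward * p for bi, p in zip(self.b, phi)]


def select_arm(arms, phi, alpha=1.0):
    """Pick the arm with the highest upper confidence bound."""
    scores = [arm.ucb(phi, alpha) for arm in arms]
    return scores.index(max(scores))
```

Both `a_inv` and `b` are accumulated from feature vectors `phi`. If the network producing `phi` were retrained, these statistics would describe the old representation, which is the information loss the abstract refers to.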

research
02/07/2021

Online Limited Memory Neural-Linear Bandits with Likelihood Matching

We study neural-linear bandits for solving problems where both explorati...
research
12/01/2021

Efficient Online Bayesian Inference for Neural Bandits

In this paper we present a new algorithm for online (sequential) inferen...
research
08/07/2017

Measuring Catastrophic Forgetting in Neural Networks

Deep neural networks are used in many state-of-the-art systems for machi...
research
06/10/2020

Gaussian Gated Linear Networks

We propose the Gaussian Gated Linear Network (G-GLN), an extension to th...
research
01/07/2020

Intrinsic Motivation and Episodic Memories for Robot Exploration of High-Dimensional Sensory Spaces

This work presents an architecture that generates curiosity-driven goal-...
research
07/01/2021

Markov Decision Process modeled with Bandits for Sequential Decision Making in Linear-flow

In membership/subscriber acquisition and retention, we sometimes need to...
research
11/02/2020

Self-Concordant Analysis of Generalized Linear Bandits with Forgetting

Contextual sequential decision problems with categorical or numerical ob...
