Randomized Value Functions via Multiplicative Normalizing Flows

06/06/2018
by Ahmed Touati, et al.

Randomized value functions offer a promising approach to the challenge of efficient exploration in complex environments with high-dimensional state and action spaces. Unlike traditional point-estimate methods, randomized value functions maintain a posterior distribution over action-space values. This prevents the agent's behavior policy from prematurely exploiting early estimates and falling into local optima. In this work, we leverage recent advances in variational Bayesian neural networks and combine these with traditional Deep Q-Networks (DQN) to achieve randomized value functions for high-dimensional domains. In particular, we augment DQN with multiplicative normalizing flows in order to track an approximate posterior distribution over its parameters. This allows the agent to perform approximate Thompson sampling in a computationally efficient manner via stochastic gradient methods. We demonstrate the benefits of our approach through an empirical comparison in high-dimensional environments.
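
The Thompson-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a linear Q-function and replaces the multiplicative-normalizing-flow posterior with a simple factorized Gaussian over the weights. The names `sample_q_weights`, `thompson_action`, `mean`, and `log_std` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_q_weights(mean, log_std, rng):
    """Draw one set of Q-function weights from a factorized Gaussian posterior.

    Assumption: a Gaussian stands in for the flow-based posterior of the paper.
    """
    return mean + np.exp(log_std) * rng.standard_normal(mean.shape)


def thompson_action(state, mean, log_std, rng):
    """Sample weights once, then act greedily on the sampled Q-values."""
    weights = sample_q_weights(mean, log_std, rng)  # shape: (n_actions, state_dim)
    q_values = weights @ state                      # shape: (n_actions,)
    return int(np.argmax(q_values))


rng = np.random.default_rng(0)
state_dim, n_actions = 4, 3
state = np.ones(state_dim)

# Wide posterior: sampled Q-values disagree, so the chosen action varies (exploration).
mean = np.zeros((n_actions, state_dim))
log_std = np.zeros((n_actions, state_dim))
actions = [thompson_action(state, mean, log_std, rng) for _ in range(200)]
counts = np.bincount(actions, minlength=n_actions)

# Narrow posterior centered on weights favoring action 1: behavior becomes greedy.
confident_mean = np.zeros((n_actions, state_dim))
confident_mean[1] = 1.0
confident_log_std = np.full((n_actions, state_dim), -20.0)
confident_action = thompson_action(state, confident_mean, confident_log_std, rng)
```

In this sketch, exploration comes only from posterior uncertainty: as the posterior narrows, the sampled Q-values converge and the agent acts greedily, without a separate exploration schedule.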

research
02/07/2021

State-Aware Variational Thompson Sampling for Deep Q-Networks

Thompson sampling is a well-known approach for balancing exploration and...
research
06/08/2020

Randomized Policy Learning for Continuous State and Action MDPs

Deep reinforcement learning methods have achieved state-of-the-art resul...
research
03/06/2017

Multiplicative Normalizing Flows for Variational Bayesian Neural Networks

We reinterpret multiplicative noise in neural networks as auxiliary rand...
research
12/29/2021

DeepHAM: A Global Solution Method for Heterogeneous Agent Models with Aggregate Shocks

We propose an efficient, reliable, and interpretable global solution met...
research
02/15/2016

Deep Exploration via Bootstrapped DQN

Efficient exploration in complex environments remains a major challenge ...
research
09/02/2019

Randomized methods to characterize large-scale vortical flow network

We demonstrate the effective use of randomized methods for linear algebr...
research
03/14/2021

A Scalable Gradient-Free Method for Bayesian Experimental Design with Implicit Models

Bayesian experimental design (BED) is to answer the question that how to...
