Amortized Variational Deep Q Network

11/03/2020
by   Haotian Zhang, et al.

Efficient exploration is one of the most important issues in deep reinforcement learning. To address it, recent methods treat the value function parameters as random variables and resort to variational inference to approximate the posterior of the parameters. In this paper, we propose an amortized variational inference framework to approximate the posterior distribution of the action value function in Deep Q Network. We establish the equivalence between the loss of the new model and the amortized variational inference loss. We balance exploration and exploitation by assuming the posterior to be Cauchy and Gaussian, respectively, in a two-stage training process. We show that the amortized framework can result in significantly fewer learnable parameters than the existing state-of-the-art method. Experimental results on classical control tasks in OpenAI Gym and chain Markov Decision Process tasks show that the proposed method performs significantly better than state-of-the-art methods and requires much less training time.
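To make the two-stage idea concrete, the following is a minimal sketch of sampling-based action selection, not the authors' implementation. It assumes that a network (not shown) maps a state to a per-action location `loc` and scale `scale`, that actions are chosen greedily with respect to one posterior sample per action, and that the Cauchy stage comes first; the function names and the stage numbering are hypothetical and do not come from the paper.

```python
import math
import random


def sample_q_values(loc, scale, stage, rng):
    """Draw one sample per action from an assumed posterior over Q-values.

    stage 1 -> Cauchy noise (heavy tails, so more exploratory samples)
    stage 2 -> Gaussian noise (light tails, so samples stay near loc)
    The stage numbering is an assumption made for this sketch.
    """
    samples = []
    for mu, s in zip(loc, scale):
        if stage == 1:
            # Inverse-CDF sampling of a standard Cauchy variate.
            eps = math.tan(math.pi * (rng.random() - 0.5))
        else:
            # Standard normal variate.
            eps = rng.gauss(0.0, 1.0)
        # Location-scale reparameterization: sample = loc + scale * noise.
        samples.append(mu + s * eps)
    return samples


def select_action(loc, scale, stage, rng):
    """Pick the action whose sampled Q-value is largest."""
    q = sample_q_values(loc, scale, stage, rng)
    return max(range(len(q)), key=q.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    loc = [0.1, 0.9, 0.3]      # hypothetical per-action locations
    scale = [0.5, 0.5, 0.5]    # hypothetical per-action scales
    print("stage 1 action:", select_action(loc, scale, 1, rng))
    print("stage 2 action:", select_action(loc, scale, 2, rng))
```

Because the Cauchy distribution has much heavier tails than the Gaussian, stage-one samples stray far from `loc` more often, so actions with a low location are still tried occasionally; in stage two the samples concentrate around `loc` and the choice becomes closer to greedy.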


research
11/30/2017

Variational Deep Q Network

We propose a framework that directly tackles the probability distributio...
research
06/06/2019

Amortized Inference of Variational Bounds for Learning Noisy-OR

Classical approaches for approximate inference depend on cleverly design...
research
04/21/2018

Variational Inference In Pachinko Allocation Machines

The Pachinko Allocation Machine (PAM) is a deep topic model that allows ...
research
02/07/2021

State-Aware Variational Thompson Sampling for Deep Q-Networks

Thompson sampling is a well-known approach for balancing exploration and...
research
12/12/2021

Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer

Bayesian neural networks (BNNs) have become a principal approach to alle...
research
02/28/2021

Automated Creative Optimization for E-Commerce Advertising

Advertising creatives are ubiquitous in E-commerce advertisements and ae...
research
05/14/2019

Moment-Based Variational Inference for Markov Jump Processes

We propose moment-based variational inference as a flexible framework fo...
