ViZDoom: DRQN with Prioritized Experience Replay, Double-Q Learning, & Snapshot Ensembling

01/03/2018
by Christopher Schulze, et al.

ViZDoom is a robust, first-person shooter reinforcement learning environment, characterized by a significant degree of latent state information. In this paper, double-Q (DDQ) learning and prioritized experience replay (PER) methods are tested in a ViZDoom combat scenario using a competitive deep recurrent Q-network (DRQN) architecture. In addition, an ensembling technique known as snapshot ensembling is employed, using an annealed learning rate, to observe differences in ensembling efficacy under these two methods. Annealed learning rates are important to the training of deep neural networks in general, as they periodically perturb the optimization and counter a model's tendency to settle into local optima. While both variants achieve performance exceeding that of the game's built-in AI agents, the known stabilizing effects of double-Q learning are illustrated, and prioritized experience replay is again validated as useful, yielding immediate gains early in agent development, with the caveat that value overestimation is accelerated in this case. In addition, some unique behaviors are observed to develop for the PER and DDQ variants, and snapshot ensembling of both PER and DDQ proves a valuable method for improving the performance of the ViZDoom Marine.
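
To make the methods above concrete, here is a minimal sketch, in PyTorch-style Python, of the two Q-learning components the paper tests: the double-Q bootstrap target (the online network selects the next action and the target network evaluates it) and a proportional prioritized replay buffer. The names (q_net, target_net, ProportionalReplay) and the alpha/beta defaults are illustrative assumptions, not the paper's implementation; the DRQN architecture and ViZDoom-specific plumbing are omitted.

    import numpy as np
    import torch

    def double_q_targets(q_net, target_net, rewards, next_obs, dones, gamma=0.99):
        """Double-Q bootstrap target: the online network selects the next
        action, the target network evaluates it, damping overestimation."""
        with torch.no_grad():
            next_actions = q_net(next_obs).argmax(dim=1, keepdim=True)   # select
            next_values = target_net(next_obs).gather(1, next_actions)   # evaluate
            return rewards + gamma * (1.0 - dones) * next_values.squeeze(1)

    class ProportionalReplay:
        """Prioritized replay (proportional variant, Schaul et al. 2016):
        P(i) is proportional to priority**alpha; importance-sampling weights
        w_i = (N * P(i))**(-beta) correct the bias the sampling introduces."""

        def __init__(self, capacity, alpha=0.6):
            self.capacity, self.alpha = capacity, alpha
            self.data, self.priorities = [], []

        def add(self, transition, priority=1.0):
            if len(self.data) >= self.capacity:   # drop the oldest when full
                self.data.pop(0)
                self.priorities.pop(0)
            self.data.append(transition)
            self.priorities.append(priority)

        def sample(self, batch_size, beta=0.4):
            scaled = np.asarray(self.priorities) ** self.alpha
            probs = scaled / scaled.sum()
            idx = np.random.choice(len(self.data), batch_size, p=probs)
            weights = (len(self.data) * probs[idx]) ** (-beta)
            weights /= weights.max()              # normalize for stability
            return [self.data[i] for i in idx], idx, weights

        def update_priorities(self, idx, td_errors, eps=1e-6):
            # New priority is the TD-error magnitude, plus a small constant
            # so that no transition starves of replay probability.
            for i, err in zip(idx, td_errors):
                self.priorities[i] = abs(float(err)) + eps

Snapshot ensembling (Huang et al., 2017) saves a checkpoint at the end of each cycle of a cyclic annealed learning rate, then averages the saved models' predictions at test time. The sketch below assumes a cosine schedule that restarts each cycle; the paper's specific schedule is not reproduced here.

    import math
    import torch

    def cyclic_cosine_lr(step, steps_per_cycle, lr_max=1e-3):
        """Cosine-annealed learning rate that restarts every cycle; a model
        snapshot would be saved at each cycle's end, near the LR minimum."""
        t = (step % steps_per_cycle) / steps_per_cycle
        return 0.5 * lr_max * (1.0 + math.cos(math.pi * t))

    def ensemble_q_values(snapshots, obs):
        """Test-time snapshot ensemble: average Q-values over the saved
        snapshot networks, then act greedily on the mean."""
        with torch.no_grad():
            return torch.stack([net(obs) for net in snapshots]).mean(dim=0)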

