DeepAI AI Chat
Log In Sign Up

Interpretable Option Discovery using Deep Q-Learning and Variational Autoencoders

10/03/2022
by   Per-Arne Andersen, et al.
Universitetet Agder
0

Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is challenging to automate initiation and termination conditions, which in practice are commonly hand-crafted. Our proposal, the Deep Variational Q-Network (DVQN), combines deep generative- and reinforcement learning. The algorithm finds good policies from a Gaussian distributed latent-space, which is especially useful for defining options. The DVQN algorithm uses MSE with KL-divergence as regularization, combined with traditional Q-Learning updates. The algorithm learns a latent-space that represents good policies with state clusters for options. We show that the DVQN algorithm is a promising approach for identifying initiation and termination conditions for option-based reinforcement learning. Experiments show that the DVQN algorithm, with automatic initiation and termination, has comparable performance to Rainbow and can maintain stability when trained for extended periods after convergence.

READ FULL TEXT
12/06/2021

Flexible Option Learning

Temporal abstraction in reinforcement learning (RL), offers the promise ...
02/26/2019

The Termination Critic

In this work, we consider the problem of autonomously discovering behavi...
12/01/2018

Discovering hierarchies using Imitation Learning from hierarchy aware policies

Learning options that allow agents to exhibit temporally higher order be...
04/15/2019

Disentangling Options with Hellinger Distance Regularizer

In reinforcement learning (RL), temporal abstraction still remains as an...
11/10/2017

Learning with Options that Terminate Off-Policy

A temporally abstract action, or an option, is specified by a policy and...
07/15/2022

Outcome-Guided Counterfactuals for Reinforcement Learning Agents from a Jointly Trained Generative Latent Space

We present a novel generative method for producing unseen and plausible ...
07/14/2020

Efficient Online Estimation of Empowerment for Reinforcement Learning

Training artificial agents to acquire desired skills through model-free ...