DeepAI AI Chat
Log In Sign Up

ODPP: A Unified Algorithm Framework for Unsupervised Option Discovery based on Determinantal Point Process

12/01/2022
by   Jiayu Chen, et al.
0

Learning rich skills through temporal abstractions without supervision of external rewards is at the frontier of Reinforcement Learning research. Existing works mainly fall into two distinctive categories: variational and Laplacian-based option discovery. The former maximizes the diversity of the discovered options through a mutual information loss but overlooks coverage of the state space, while the latter focuses on improving the coverage of options by increasing connectivity during exploration, but does not consider diversity. In this paper, we propose a unified framework that quantifies diversity and coverage through a novel use of the Determinantal Point Process (DPP) and enables unsupervised option discovery explicitly optimizing both objectives. Specifically, we define the DPP kernel matrix with the Laplacian spectrum of the state transition graph and use the expected mode number in the trajectories as the objective to capture and enhance both diversity and coverage of the learned options. The proposed option discovery algorithm is extensively evaluated using challenging tasks built with Mujoco and Atari, demonstrating that our proposed algorithm substantially outperforms SOTA baselines from both diversity- and coverage-driven categories. The codes are available at https://github.com/LucasCJYSDL/ODPP.

READ FULL TEXT

page 1

page 2

page 7

page 8

page 9

02/07/2022

Reward-Respecting Subtasks for Model-Based Reinforcement Learning

To achieve the ambitious goals of artificial intelligence, reinforcement...
01/20/2022

Multi-agent Covering Option Discovery based on Kronecker Product of Factor Graphs

Covering option discovery has been developed to improve the exploration ...
03/12/2020

Option Discovery in the Absence of Rewards with Manifold Analysis

Options have been shown to be an effective tool in reinforcement learnin...
06/27/2021

Unsupervised Skill Discovery with Bottleneck Option Learning

Having the ability to acquire inherent skills from environments without ...
07/12/2021

Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing

The Laplacian representation recently gains increasing attention for rei...
07/26/2018

Variational Option Discovery Algorithms

We explore methods for option discovery based on variational inference a...
11/01/2019

PODNet: A Neural Network for Discovery of Plannable Options

Learning from demonstration has been widely studied in machine learning ...