ODPP: A Unified Algorithm Framework for Unsupervised Option Discovery based on Determinantal Point Process

12/01/2022
by   Jiayu Chen, et al.
Learning rich skills through temporal abstractions without supervision from external rewards is at the frontier of Reinforcement Learning research. Existing works mainly fall into two distinct categories: variational and Laplacian-based option discovery. The former maximizes the diversity of the discovered options through a mutual information loss but overlooks coverage of the state space, while the latter improves the coverage of options by increasing connectivity during exploration but does not consider diversity. In this paper, we propose a unified framework that quantifies diversity and coverage through a novel use of the Determinantal Point Process (DPP) and enables unsupervised option discovery that explicitly optimizes both objectives. Specifically, we define the DPP kernel matrix with the Laplacian spectrum of the state transition graph and use the expected number of modes in the trajectories as the objective to capture and enhance both the diversity and the coverage of the learned options. The proposed option discovery algorithm is extensively evaluated on challenging tasks built with MuJoCo and Atari, demonstrating that it substantially outperforms state-of-the-art baselines from both the diversity-driven and coverage-driven categories. The code is available at https://github.com/LucasCJYSDL/ODPP.
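To make the idea of a DPP kernel built from a Laplacian spectrum more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy path graph, the function names (`graph_laplacian`, `spectral_features`, `expected_cardinality`) and the use of a plain Gram matrix of spectral features as the kernel are all assumptions made for illustration. The only standard fact relied on is that a DPP with L-ensemble kernel L and eigenvalues λ_i has expected sample size Σ λ_i / (1 + λ_i), which serves here as a stand-in for the "expected number of modes".

```python
import numpy as np


def graph_laplacian(adj):
    """Combinatorial Laplacian D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj


def spectral_features(adj, dim):
    """Embed each state with the `dim` smallest non-trivial Laplacian eigenvectors."""
    _, eigvecs = np.linalg.eigh(graph_laplacian(adj))  # eigenvalues in ascending order
    return eigvecs[:, 1:dim + 1]  # skip the constant eigenvector (eigenvalue 0)


def expected_cardinality(kernel):
    """Expected sample size of a DPP with L-ensemble kernel: sum_i lam_i / (1 + lam_i)."""
    lam = np.clip(np.linalg.eigvalsh(kernel), 0.0, None)  # clip tiny negative round-off
    return float(np.sum(lam / (1.0 + lam)))


# Hypothetical toy environment: 6 states arranged in a line (path graph).
n_states = 6
adj = np.zeros((n_states, n_states))
for i in range(n_states - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

features = spectral_features(adj, dim=2)


def set_score(states):
    """Score a set of states by the expected cardinality of the DPP over them."""
    b = features[states]
    return expected_cardinality(b @ b.T)  # Gram matrix is positive semidefinite


spread = set_score([0, 5])     # two states at opposite ends of the line
clustered = set_score([0, 1])  # two adjacent states
print(f"spread: {spread:.3f}  clustered: {clustered:.3f}")
```

In this toy setting the pair of far-apart states scores higher than the pair of adjacent states, which is the qualitative behaviour a diversity-and-coverage objective is meant to reward. How the paper actually selects states from trajectories and builds its kernel is described in the full text, not here.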
