Wasserstein Unsupervised Reinforcement Learning

10/15/2021
by   Shuncheng He, et al.

Unsupervised reinforcement learning aims to train agents to learn a handful of policies or skills in environments without external reward. These pre-trained policies can accelerate learning once an external reward is provided, and can also serve as primitive options in hierarchical reinforcement learning. Conventional approaches to unsupervised skill discovery feed a latent variable to the agent and tie it to the agent's behavior through mutual information (MI) maximization. However, the policies learned by MI-based methods do not sufficiently explore the state space, even though they can be reliably distinguished from one another. We therefore propose a new framework, Wasserstein unsupervised reinforcement learning (WURL), in which we directly maximize the distance between the state distributions induced by different policies. Additionally, we overcome the difficulties of simultaneously training N (N > 2) policies and of amortizing the overall reward over individual steps. Experiments show that policies learned by our approach outperform MI-based methods on the metric of Wasserstein distance while retaining high discriminability. Furthermore, agents trained by WURL sufficiently explore the state space in mazes and MuJoCo tasks, and the pre-trained policies can be applied to downstream tasks through hierarchical learning.
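
To make the central quantity concrete, the sketch below computes an empirical 1-Wasserstein distance between the states visited by different policies. It is only an illustration under strong assumptions that the abstract does not state: states are one-dimensional, every policy contributes the same number of samples, and the function names `wasserstein_1d` and `pairwise_distances` are hypothetical. The estimator actually used by WURL, and the way it combines distances across N policies, is not described here and may differ.

```python
from itertools import combinations


def wasserstein_1d(samples_a, samples_b):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size 1-D samples the optimal transport plan matches the
    i-th smallest point of one sample to the i-th smallest of the other,
    so the distance is the mean absolute difference of the sorted values.
    """
    if not samples_a or len(samples_a) != len(samples_b):
        raise ValueError("samples must be non-empty and of equal size")
    sorted_a = sorted(samples_a)
    sorted_b = sorted(samples_b)
    total = sum(abs(x - y) for x, y in zip(sorted_a, sorted_b))
    return total / len(sorted_a)


def pairwise_distances(state_samples):
    """Distance between every pair of policies (hypothetical helper).

    state_samples[i] holds the 1-D states visited by policy i.
    Returns a dict mapping (i, j) with i < j to the distance.
    """
    return {
        (i, j): wasserstein_1d(state_samples[i], state_samples[j])
        for i, j in combinations(range(len(state_samples)), 2)
    }


# Three toy "policies" that settle at different positions on a line.
visited = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
distances = pairwise_distances(visited)
```

A larger value in `distances` means the two policies occupy more clearly separated regions of the state space, which is the kind of separation the abstract says WURL maximizes directly.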


