Dynamics-Aware Unsupervised Discovery of Skills

07/02/2019
by Archit Sharma, et al.

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment. A good model can potentially enable planning algorithms to generate a large variety of behaviors and solve diverse tasks. However, learning an accurate model for complex dynamical systems is difficult, and even then, the model might not generalize well outside the distribution of states on which it was trained. In this work, we combine model-based learning with model-free learning of primitives that make model-based planning easy. To that end, we aim to answer the question: how can we discover skills whose outcomes are easy to predict? We propose an unsupervised learning algorithm, Dynamics-Aware Discovery of Skills (DADS), which simultaneously discovers predictable behaviors and learns their dynamics. Because our method can leverage continuous skill spaces, it can in theory learn infinitely many behaviors, even for high-dimensional state spaces. We demonstrate that zero-shot planning in the learned latent space significantly outperforms standard MBRL and model-free goal-conditioned RL, can handle sparse-reward tasks, and substantially improves over prior hierarchical RL methods for unsupervised skill discovery.
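
The abstract does not spell out the objective, so the following is only a rough sketch of what "skills whose outcomes are easy to predict" could mean in code. It assumes an intrinsic reward of the form log q(s'|s,z) minus the log of the average of q(s'|s,z_i) over skills z_i drawn from a prior, which rewards transitions that the chosen skill predicts well and that other skills predict poorly. The linear-Gaussian model, the uniform prior on [-1, 1], and the names `log_q`, `intrinsic_reward`, and `weights` are all hypothetical stand-ins; a real skill-dynamics model would be learned, not fixed.

```python
import numpy as np


def log_q(next_state, state, skill, weights, sigma=0.5):
    """Log-density of a toy skill-dynamics model q(s' | s, z).

    Assumption for illustration: s' ~ N(s + W z, sigma^2 I).
    """
    diff = next_state - (state + weights @ skill)
    dim = diff.shape[0]
    return (-0.5 * np.sum(diff ** 2) / sigma ** 2
            - 0.5 * dim * np.log(2.0 * np.pi * sigma ** 2))


def intrinsic_reward(next_state, state, skill, weights, rng, num_prior_samples=100):
    """log q(s'|s,z) - log mean_i q(s'|s,z_i), with z_i drawn from the prior."""
    # Assumed prior p(z): uniform on [-1, 1] in each skill dimension.
    prior_skills = rng.uniform(-1.0, 1.0, size=(num_prior_samples, skill.shape[0]))
    log_probs = np.array(
        [log_q(next_state, state, z, weights) for z in prior_skills]
    )
    # Numerically stable log of the mean of exp(log_probs).
    peak = log_probs.max()
    log_marginal = peak + np.log(np.mean(np.exp(log_probs - peak)))
    return log_q(next_state, state, skill, weights) - log_marginal


rng = np.random.default_rng(0)
weights = 3.0 * np.eye(2)            # hypothetical skill-to-displacement map
state = np.zeros(2)
skill = np.array([1.0, 1.0])
next_state = state + weights @ skill  # transition the skill predicts exactly

r_match = intrinsic_reward(next_state, state, skill, weights, rng)
r_mismatch = intrinsic_reward(next_state, state, -skill, weights, rng)
print(r_match, r_mismatch)
```

In this toy setup the transition is scored higher under the skill that produced it than under the opposite skill, which is the behavior a predictability-seeking reward should have.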


research
07/15/2022

Skill-based Model-based Reinforcement Learning

Model-based reinforcement learning (RL) is a sample-efficient way of lea...
research
05/21/2023

Unsupervised Discovery of Continuous Skills on a Sphere

Recently, methods for learning diverse skills to generate various behavi...
research
10/23/2018

Learning Representations in Model-Free Hierarchical Reinforcement Learning

Common approaches to Reinforcement Learning (RL) are seriously challenge...
research
02/24/2023

Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains

In this paper we study the problem of learning multi-step dynamics predi...
research
12/04/2020

Planning from Pixels using Inverse Dynamics Models

Learning task-agnostic dynamics models in high-dimensional observation s...
research
09/07/2022

Concept-modulated model-based offline reinforcement learning for rapid generalization

The robustness of any machine learning solution is fundamentally bound b...
research
10/25/2021

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

Unsupervised reinforcement learning aims to acquire skills without prior...
