Diversity is All You Need: Learning Skills without a Reward Function

02/16/2018
by   Benjamin Eysenbach, et al.

Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose DIAYN ("Diversity is All You Need"), a method for learning useful skills without a reward function. Our proposed method learns skills by maximizing an information theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to learn a skill that solves the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
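
The abstract does not spell out the information theoretic objective. As a rough illustration only, the sketch below computes a skill-discrimination pseudo-reward of the form log q(z|s) - log p(z), which is the form used in the full paper; the function names, the softmax discriminator output, and the uniform prior over skills are assumptions made for this example, not the authors' code.

```python
import math


def softmax(logits):
    """Turn raw discriminator scores into probabilities over skills."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diversity_reward(discriminator_logits, skill, num_skills):
    """Hypothetical pseudo-reward: log q(z|s) - log p(z).

    discriminator_logits: scores a (hypothetical) discriminator assigns
        to each skill for the current state.
    skill: index z of the skill the policy is currently executing.
    num_skills: number of skills; p(z) is assumed uniform here.
    """
    q = softmax(discriminator_logits)
    return math.log(q[skill]) - math.log(1.0 / num_skills)


# A state that says nothing about the active skill earns zero reward.
uninformative = diversity_reward([0.0, 0.0, 0.0, 0.0], skill=2, num_skills=4)

# A state that clearly identifies skill 2 earns a positive reward.
distinctive = diversity_reward([0.0, 0.0, 5.0, 0.0], skill=2, num_skills=4)
```

Under these assumptions, a skill is rewarded for visiting states from which it can be told apart from the other skills, so no task reward is needed. The maximum entropy policy that would be trained on this reward is not shown.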


research
04/27/2020

Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning

Reinforcement learning provides a general framework for learning robotic...
research
06/26/2021

Discovering Generalizable Skills via Automated Generation of Diverse Tasks

The learning efficiency and generalization ability of an intelligent age...
research
02/16/2022

Open-Ended Reinforcement Learning with Neural Reward Functions

Inspired by the great success of unsupervised learning in Computer Visio...
research
09/17/2021

Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration

Curiosity-based reward schemes can present powerful exploration mechanis...
research
02/10/2020

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

Acquiring abilities in the absence of a task-oriented reward function is...
research
08/24/2023

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

We study diverse skill discovery in reward-free environments, aiming to ...
research
01/04/2019

Machine Teaching in Hierarchical Genetic Reinforcement Learning: Curriculum Design of Reward Functions for Swarm Shepherding

The design of reward functions in reinforcement learning is a human skil...
