Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning

by Caleb Chuck, et al.
The University of Texas at Austin

Deep reinforcement learning encompasses many versatile tools for designing learning agents that can perform well on a variety of high-dimensional visual tasks, ranging from video games to robotic manipulation. However, these methods typically suffer from poor sample efficiency, partially because they strive to be largely problem-agnostic. In this work, we demonstrate the utility of a different approach that is extremely sample efficient, but limited to object-centric tasks that (approximately) obey basic physical laws. Specifically, we propose the Hypothesis Proposal and Evaluation (HyPE) algorithm, which utilizes a small set of intuitive assumptions about the behavior of objects in the physical world (or in games that mimic physics) to automatically define and learn hierarchical skills in a highly efficient manner. HyPE does this by discovering objects from raw pixel data, generating hypotheses about the controllability of observed changes in object state, and learning a hierarchy of skills that can test these hypotheses and control increasingly complex interactions with objects. We demonstrate that HyPE can dramatically improve sample efficiency when learning a high-quality pixels-to-actions policy; in the popular benchmark task, Breakout, HyPE learns an order of magnitude faster than common baseline reinforcement learning and evolutionary strategies for policy learning.

Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates

Reinforcement learning holds the promise of enabling autonomous robots t...

Deep Reinforcement Learning for the Control of Robotic Manipulation: A Focussed Mini-Review

Deep learning has provided new ways of manipulating, processing and anal...

Unsupervised Skill-Discovery and Skill-Learning in Minecraft

Pre-training Reinforcement Learning agents in a task-agnostic manner has...

Learning Object Manipulation Skills from Video via Approximate Differentiable Physics

We aim to teach robots to perform simple object manipulation tasks by wa...

DisTop: Discovering a Topological representation to learn diverse and rewarding skills

The optimal way for a deep reinforcement learning (DRL) agent to explore...

Discovering Object-Centric Generalized Value Functions From Pixels

Deep Reinforcement Learning has shown significant progress in extracting...

Exchangeable Input Representations for Reinforcement Learning

Poor sample efficiency is a major limitation of deep reinforcement learn...