LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks

by Albert Wilcox, et al.
University of California, Berkeley

Reinforcement learning (RL) algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks, but can often exhibit unsafe behaviors and require extensive environment interaction when exploration is unconstrained. A promising strategy for safe learning in dynamically uncertain environments is requiring that the agent can robustly return to states where task success (and therefore safety) can be guaranteed. While this approach has been successful in low-dimensional settings, enforcing the constraint in environments with high-dimensional state spaces, such as images, is challenging. We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations by using suboptimal demonstrations and a learned dynamics model to restrict exploration to the neighborhood of a learned Safe Set where task completion is likely. We evaluate LS3 on 4 domains, including a challenging sequential pushing task in simulation and a physical cable routing task. We find that LS3 can use prior task successes to restrict exploration and learn more efficiently than prior algorithms while satisfying constraints. See the project website for code and supplementary material.






1 Introduction

Visual planning over learned forward dynamics models is a popular area of research in robotic control from images [1, 2, 3, 4, 5, 6, 7], as it enables closed-loop, model-based control for tasks where the state of the system is not directly observable or is difficult to model analytically, such as the configuration of a sheet of fabric or a segment of cable. These methods learn predictive models over either images or a learned latent space, which are then used by model predictive control (MPC) to optimize image-based task costs. While these approaches have significant promise, several challenges remain open in learning policies from visual observations. First, reward specification is particularly challenging for visuomotor control tasks, because high-dimensional observations often do not expose the features required to design informative reward functions [8], especially for long-horizon tasks. Second, while many prior reinforcement learning methods have been successfully applied to image-based control tasks [9, 10, 11, 12, 13], learning policies from image observations often requires extensive exploration due to the high dimensionality of the observation space and the difficulties in reward specification, making safe and efficient learning exceedingly challenging.

Safe reinforcement learning is of critical importance in robotics, as unconstrained exploration can cause serious damage to the robot and its surroundings [14]. Safe learning also tends to be efficient learning, since it keeps the agent from exploring clearly suboptimal behaviors. There has been significant prior work on safe policy learning in low-dimensional observation spaces [15, 16, 17, 18, 19]. Thananjeyan et al. [18] present a safe and efficient algorithm for policy learning for tasks with a low-variance start state distribution and a fixed goal state by learning a Safe Set, which captures the set of states from which the agent has previously completed the task. Safe Sets are a common ingredient in classical control algorithms for guaranteeing policy improvement and constraint satisfaction [19, 17, 20], and restricting exploration to the neighborhood of this set can lead to highly efficient and safe learning for long-horizon tasks [18]. Extending these methods to variable start and goal sets in low-dimensional settings has been studied in Thananjeyan et al. [17]. However, scaling these approaches to high-dimensional image observations is challenging, since images do not directly expose details about the system state or dynamics that are typically needed for formal controller analysis [17, 19, 20]. In this work, we study learning in the iterative setting, where the start and goal sets have low variance, and focus on scaling these approaches to image-based inputs. The proposed algorithm makes several practical relaxations and maintains the same safety guarantees under the same additional assumptions as in Thananjeyan et al. [18, 17].

We introduce Latent Space Safe Sets (LS3), a model-based RL algorithm for visuomotor policy learning that provides safety by learning a continuous relaxation of a Safe Set in a learned latent space. This Latent Space Safe Set ensures that the agent can plan back to regions in which it is confident in task completion, even when learning in high-dimensional spaces. This constraint makes it possible to (1) improve safely, by ensuring that the agent can consistently complete the task (and therefore avoid unsafe behavior), and (2) learn efficiently, since the agent only explores promising states in the immediate neighborhood of those in which it was previously successful. LS3 additionally enforces user-specified state space constraints by estimating the probability of constraint violations over a learned, probabilistic latent space dynamics model. We contribute (1) Latent Space Safe Sets (LS3), a novel reinforcement learning algorithm for safely and efficiently learning long-horizon visual planning tasks, (2) simulation experiments on 3 visuomotor continuous control tasks suggesting that LS3 can learn to improve upon demonstrations more safely and efficiently than baselines, and (3) physical experiments on a vision-based cable routing task on the da Vinci surgical robot suggesting that LS3 can learn a policy more efficiently than prior algorithms while consistently completing the task and satisfying constraints during learning.

Figure 1: Latent Space Safe Sets (LS3): At time t, LS3 observes an image s_t of the environment. The image is first encoded to a latent vector z_t. Then, LS3 uses a sampling-based optimization procedure to optimize H-length action sequences by sampling H-length latent trajectories from the learned latent dynamics model. For each sampled trajectory, LS3 checks whether latent space obstacles are avoided and whether the terminal state of the trajectory falls in the Latent Space Safe Set. The terminal state constraint ensures the algorithm can plan back to regions of safety and task confidence while still enabling exploration. For feasible trajectories, the sum of rewards and the value of the terminal state are computed and used for ranking. LS3 executes the first action in the optimized plan and repeats this procedure at the next timestep.

2 Related Work

2.1 Safe, Iterative Learning Control

In iterative learning control (ILC), the agent tracks an initially provided reference trajectory and uses data from controller rollouts to iteratively refine tracking performance [21]. Rosolia et al. [22] and Rosolia and Borrelli [23, 19] present a class of algorithms known as Learning Model Predictive Control (LMPC), which are reference-free and instead iteratively improve upon the performance of an initial feasible trajectory. To achieve this, they present model predictive control algorithms that use data from controller rollouts to learn a Safe Set and value function, with which recursive feasibility, stability, and local optimality can be guaranteed given a known, deterministic nonlinear system or a stochastic linear system under certain regularity assumptions. However, a core limitation of these algorithms is that they assume known system dynamics and cannot easily be applied to high-dimensional control problems. Thananjeyan et al. [18] extend the LMPC framework to higher-dimensional settings in which system dynamics are unknown and must be estimated iteratively from experience, but the visuomotor control setting introduces a number of new challenges for iterative learning control algorithms, such as learning system dynamics, Safe Sets, and value functions that can flexibly and efficiently accommodate visual inputs.

2.2 Model Based Reinforcement Learning

There has been significant recent progress in algorithms which combine ideas from model-based planning and control with deep learning [24, 25, 26, 27, 28, 29]. These algorithms are gaining popularity in the robotics community as they enable learning complex policies from data while maintaining some of the sample efficiency and safety benefits of classical model-based control techniques. However, they typically require hand-engineered dense cost functions for task specification, which can be difficult to provide, especially in high-dimensional spaces. This motivates leveraging (possibly suboptimal) demonstrations to provide an initial signal regarding desirable agent behavior. There has been some prior work on leveraging demonstrations in model-based algorithms, such as Quinlan and Khatib [30] and Ichnowski et al. [31], which use model-based control with known dynamics to refine initially suboptimal motion plans, and Fu et al. [26], which uses demonstrations to seed a learned dynamics model for fast online adaptation using iLQR. Thananjeyan et al. [18] and Zhu et al. [32] present ILC algorithms which rapidly improve upon suboptimal demonstrations when system dynamics are unknown. However, these algorithms either require knowledge of system dynamics [30, 31] or are limited to low-dimensional state spaces [26, 18, 32] and cannot be flexibly applied to visuomotor control tasks.

2.3 Reinforcement Learning from Pixels

Reinforcement learning and model-based planning from visual observations are gaining significant recent interest, as RGB images provide an easily available observation space for robot learning [1, 33]. Recent work has proposed a number of model-free and model-based algorithms that have seen success in laboratory settings on a number of robotic tasks when learning from visual observations [34, 35, 10, 36, 12, 13, 1, 37, 33]. However, two core issues that prevent application of many RL algorithms in practice, inefficient exploration and safety, are significantly exacerbated when learning from high-dimensional visual observations, where the space of possible behaviors is very large and the features required to determine whether the robot is safe are not readily exposed. There has been significant prior work on addressing inefficiencies in exploration for visuomotor control, such as latent space planning [2, 33, 37] and goal-conditioned reinforcement learning [13, 10]. However, safe reinforcement learning for visuomotor tasks has received substantially less attention. Thananjeyan et al. [14] and Kahn et al. [38] present reinforcement learning algorithms which estimate the likelihood of constraint violations to avoid them [14] or reduce the robot's velocity [38]. Unlike these algorithms, which focus on methods for avoiding violations of user-specified constraints, LS3 additionally provides consistent task completion during learning by limiting exploration to the neighborhood of prior task successes. This makes LS3 less susceptible to the challenges of unconstrained exploration present in standard model-free reinforcement learning algorithms.

3 Problem Statement

We consider an agent interacting in a finite-horizon, goal-conditioned Markov Decision Process (MDP), described by the tuple M = (S, A, P, R, μ, T). S and A are the state and action spaces, P maps a state and action to a probability distribution over subsequent states, R is the reward function, μ is the initial state distribution (s_0 ~ μ), and T is the time horizon. In this work, the agent is only provided with RGB image observations s ∈ R^{W×H×3}, where W and H are the image width and height in pixels, respectively. We consider tasks in the iterative learning control setting, where the agent must reach a goal set G ⊆ S as efficiently as possible and the support of μ is small. While there are a number of possible reward functions that would encourage fast convergence to G, providing shaped reward functions can be exceedingly challenging, especially when learning complex tasks in which agents are only provided with high-dimensional observations. Thus, as in Thananjeyan et al. [18], we consider a sparse reward function that only provides a signal upon task completion: R(s) = 0 if s ∈ G and R(s) = -1 otherwise. To incorporate constraints, we augment M with an extra constraint indicator function C : S → {0, 1}, which indicates whether a state satisfies user-specified state-space constraints, such as avoiding known obstacles. This is consistent with the modified CMDP formulation used in [14]. We assume that R and C can be evaluated on the current state of the system but cannot be used for planning. We make this assumption because in practice we plan over predicted future states, which may not be predicted at sufficiently high fidelity to expose the information needed to evaluate R and C directly during planning.

Given a policy π, its expected total return in M is R^π = E_π[ Σ_{t=0}^{T-1} R(s_t) ]. Furthermore, we define P^π_C(s) as the probability of future constraint violation (within time horizon T) under policy π from state s. The objective is then to maximize expected return while keeping the constraint-violation probability below some δ_C ∈ [0, 1]. This can be written formally as follows:

    π* ∈ argmax_π  E_π[ Σ_{t=0}^{T-1} R(s_t) ]   s.t.   E_{s_0 ~ μ}[ P^π_C(s_0) ] ≤ δ_C.   (1)
We assume that the agent is provided with an offline dataset D of transitions in the environment, of which some subset are constraint violating and some subset appear in successful demonstrations from a suboptimal supervisor. As in [14], D contains examples of constraint-violating behaviors (for example, from prior runs of different policies or collected under human supervision) so that the agent can learn about states which violate user-specified constraints.
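The sparse reward and the chance constraint above can be made concrete with a short sketch. Below is a minimal, self-contained illustration (not the authors' implementation): `sparse_reward` encodes R(s) = 0 on the goal set and -1 elsewhere, and `violation_probability` is a Monte Carlo estimate of P^π_C(s_0) under a toy, hypothetical environment `step` function.

```python
import random

def sparse_reward(state, goal_set):
    """R(s) = 0 inside the goal set G and -1 otherwise (Section 3)."""
    return 0.0 if state in goal_set else -1.0

def violation_probability(policy, step, violates, s0, horizon, n_rollouts=500, seed=0):
    """Monte Carlo estimate of P_C^pi(s0): the fraction of rollouts of `policy`
    from s0 that violate the constraint within `horizon` steps."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(n_rollouts):
        s = s0
        for _ in range(horizon):
            s = step(s, policy(s), rng)
            if violates(s):
                violations += 1
                break
    return violations / n_rollouts

# Toy 1-D random walk (hypothetical): the state is unsafe at -3 or below.
step = lambda s, a, rng: s + a + rng.choice([-1, 0, 1])
p = violation_probability(lambda s: 0, step, lambda s: s <= -3, s0=0, horizon=10)
print(0.0 <= p <= 1.0)  # True
```

A policy is feasible for objective (1) when this estimated probability, averaged over start states, stays below δ_C.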

4 Latent Space Safe Sets (LS3)

Here we describe how LS3 uses demonstrations and online environment interaction to safely learn iteratively improving policies. Section 4.1 describes how we learn a low-dimensional latent representation of image observations to facilitate efficient model-based planning. To enable this planning, we learn a probabilistic forward dynamics model in the learned latent space, as in [28], along with models to estimate whether plans will likely complete the task (Section 4.2) and to estimate future rewards and constraint violations (Section 4.3) from predicted trajectories. Finally, in Section 4.4, we discuss how all of these components are combined in LS3 to enable safe and efficient policy improvement. The dataset D is expanded using online rollouts of LS3 and used to update all latent space models (Sections 4.2 and 4.3) after every batch of rollouts. See Algorithm 1 and the supplement for further details on training procedures and data collection for all components.

Figure 2: LS3 Learned Models: LS3 learns a low-dimensional latent representation of image observations (Section 4.1) and, in this learned latent space, learns a dynamics model, value function, reward function, constraint classifier, and Safe Set for constrained planning and task-completion-driven exploration. These models are then used for model-based planning to maximize the total value of predicted latent states (Section 4.3) while enforcing the Safe Set (Section 4.2) and user-specified constraints (Section 4.3).
Algorithm 1 Latent Space Safe Sets (LS3)
Require: offline dataset D, number of training iterations
1: Train VAE encoder and decoder (Section 4.1) using data from D
2: Train the dynamics model, Safe Set classifier (Section 4.2), value function, goal indicator, and constraint estimator (Section 4.3) using data from D
3: for each training iteration do
4:     for each rollout in the batch do
5:         Sample starting state s_0 from μ
6:         for t = 1, ..., T do
7:             Choose and execute a_t (Section 4.4)
8:             Observe s_{t+1}, reward r_t, and constraint indicator c_t
9:             Add the transition (s_t, a_t, s_{t+1}, r_t, c_t) to D
10:    Update the encoder, decoder, dynamics model, Safe Set classifier, value function, goal indicator, and constraint estimator with data from D

4.1 Learning a Latent Space for Planning

Learning compressed representations of images has been a popular approach in vision-based control to facilitate efficient algorithms for planning and control which can reason about lower-dimensional inputs [2, 37, 6, 39, 40, 33]. To learn such a representation, we train a β-variational autoencoder [41] on states in D to map states to a probability distribution over a low-dimensional latent space Z. The resulting encoder network is then used to sample latent vectors to train a forward dynamics model, value function, reward estimator, constraint classifier, and Safe Set, and these elements are combined to define a policy for model-based planning. Motivated by Laskin et al. [42], during training we augment inputs to the encoder with random cropping, which we found helpful for learning representations that are useful for planning. For all environments we use the same latent dimension as in [2] and found that varying it did not significantly affect performance.

4.2 Latent Safe Sets for Model-Based Control

LS3 learns a binary classifier on latent states to represent a latent space Safe Set: the set of states from which the agent has high confidence in task completion based on prior experience. Because the agent can reach the goal from these states, they are safe: the agent can avoid constraint violations by simply completing the task as it has before. While classical algorithms use known dynamics to construct Safe Sets, we approximate this set using successful trajectories from prior iterations. At each iteration, the algorithm collects a batch of trajectories in the environment. Let S_i^j denote the set of states visited in trajectory j of iteration i, and let U_i denote the set of index pairs (i', j) with i' ≤ i such that trajectory j of iteration i' successfully reached G. We define the sampled Safe Set at iteration i as SS_i = ∪_{(i', j) ∈ U_i} S_{i'}^j. In short, this is the set of states from which the agent has successfully navigated to G by iteration i of training.

This discrete set is difficult to plan to with continuous-valued state distributions, so we leverage data inside the sampled Safe Set, data outside it, and the learned encoder from Section 4.1 to learn a continuous relaxation of this set in latent space (the Latent Safe Set). We train a neural network f_S with a binary cross-entropy loss to predict the probability that a state with a given encoding lies in the sampled Safe Set. To mitigate the negative bias that appears when trajectories that start in safe regions fail, we utilize the intuition that if a state's successor is in the sampled Safe Set, then the state itself is likely safe as well; accordingly, f_S is trained to predict whether a state or one of the states shortly following it in its trajectory lies in the sampled Safe Set. The relaxed Latent Safe Set is parameterized by the superlevel sets of f_S, {z : f_S(z) ≥ δ_SS}, where the level δ_SS is adaptively set during execution.
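The two constructions in this subsection, the discrete sampled Safe Set and its continuous superlevel-set relaxation, can be sketched as follows. This is an illustrative toy, not the paper's code: the "classifier" here is a hypothetical scoring function standing in for the learned network f_S.

```python
import numpy as np

def sampled_safe_set(trajectories):
    """Discrete Safe Set: the union of states from all successful trajectories.
    Each trajectory is a pair (list_of_states, reached_goal)."""
    safe = set()
    for states, reached_goal in trajectories:
        if reached_goal:
            safe.update(states)
    return safe

def relaxed_safe_set(latents, classifier, level):
    """Continuous relaxation in latent space: the superlevel set
    {z : f_S(z) >= level} of the learned classifier."""
    return latents[classifier(latents) >= level]

trajs = [([(0, 0), (1, 0), (2, 0)], True),   # reached the goal
         ([(5, 5), (6, 6)], False)]          # failed: excluded from the set
print(sorted(sampled_safe_set(trajs)))       # [(0, 0), (1, 0), (2, 0)]

# Hypothetical classifier whose score decays with distance from the origin.
z = np.array([[0.1], [0.5], [2.0]])
f_s = lambda zs: np.exp(-np.abs(zs[:, 0]))
print(len(relaxed_safe_set(z, f_s, level=0.5)))  # 2
```

Raising the level shrinks the relaxed set toward high-confidence states; lowering it admits more exploratory states, which is why the level is set adaptively at execution time.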

4.3 Reward and Constraint Estimation

In this work, we define rewards based on whether the agent has reached a state in G, but we need rewards that are defined on predictions from the dynamics model, which may not correspond to valid real images. To address this, we train a classifier to map the encoding of a state to the probability that the state is contained in G, using terminal states of successful trajectories (which are known to be in G) and other states in D. However, in the temporally extended, sparse-reward tasks we consider, reward prediction alone is insufficient, because rewards only indicate whether the agent is in the goal set and thus provide no signal on task progress unless the agent can plan all the way to the goal set. To address this, as in prior MPC literature [18, 17, 19, 8], we train a recursively defined value function (details in the supplement). Similarly, we use the encoder (Section 4.1) to train a classifier with constraint-violating states from D and constraint-satisfying states in D to map the encoding of a state to the probability of constraint violation.
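The "recursively defined" value function above is trained on bootstrapped targets. A minimal sketch of such targets under the sparse reward of Section 3 (illustrative only; the paper's exact target construction is in its supplement):

```python
import numpy as np

def value_targets(rewards, next_values, dones, gamma=0.99):
    """TD(0) targets y_t = r_t + gamma * V(z_{t+1}); terminal states bootstrap 0.
    With the sparse reward of Section 3 (0 at the goal, -1 otherwise), the fitted
    V(z) approximates the negative discounted time-to-goal from z."""
    return rewards + gamma * next_values * (1.0 - dones)

r = np.array([-1.0, -1.0, 0.0])       # the trajectory reaches the goal last step
v_next = np.array([-1.0, 0.0, 0.0])   # current value estimates of successors
done = np.array([0.0, 0.0, 1.0])
print(value_targets(r, v_next, done))  # approx. [-1.99, -1.0, 0.0]
```

Regressing the value network toward these targets and recomputing them with the updated network is what propagates the goal signal backward through long horizons.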

4.4 Model-Based Planning with LS

LS3 aims to maximize the total reward attained in the environment while keeping the probability of constraint violation below a threshold (equation 1). We optimize an approximation of this objective over an H-step receding horizon with model predictive control. Precisely, LS3 solves the following optimization problem to generate an action to execute at timestep t:

    a*_{t:t+H-1} ∈ argmax_{a_{t:t+H-1}}  E[ Σ_{i=0}^{H-1} R(z_{t+i}) + V(z_{t+H}) ]   (2)
    s.t.  z_t = f_enc(s_t)                                                            (3)
          z_{t+i+1} ~ f_dyn(z_{t+i}, a_{t+i}),  i ∈ {0, ..., H-1}                      (4)
          P(z_{t+H} ∈ Latent Safe Set) ≥ 1 - δ_S                                      (5)
          P(z_{t+i} ∈ Z_C for some i) ≤ δ_C                                           (6)

In this problem, the expectations and probabilities are taken with respect to the learned probabilistic dynamics model f_dyn. The optimization problem is solved approximately using the cross-entropy method (CEM) [43], a popular optimizer in model-based RL [44, 18, 17, 45, 14].

The objective function is the expected sum of future rewards if the agent executes the candidate action sequence and then continues onward from the terminal state, whose value is estimated with the learned value function (equation 2). First, the current state s_t is encoded to z_t (equation 3). Then, for a candidate sequence of actions a_{t:t+H-1}, an H-step latent trajectory is sampled from the learned dynamics model (equation 4). LS3 constrains exploration using two chance constraints: (1) the terminal latent state of the plan must fall in the Latent Safe Set with high probability (equation 5), and (2) all latent states in the trajectory must satisfy user-specified state-space constraints with high probability (equation 6), where Z_C is the set of latent states whose corresponding observations are constraint violating. The optimizer estimates constraint-satisfaction probabilities for a candidate action sequence by simulating it repeatedly over the stochastic dynamics model. The first chance constraint ensures the agent maintains the ability to return within H steps, if necessary, to safe states from which it knows how to complete the task. Because the agent replans at each timestep, it need not actually return to the Safe Set: during training, the Safe Set expands, enabling further exploration. In practice, we set the level δ_S for the Safe Set classifier adaptively, as described in the supplement. The second chance constraint keeps the probability of constraint violation below δ_C. After solving the optimization problem, the agent executes the first action of the plan, a*_t, observes a new state, and replans.
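The planning loop of equations (2)-(6) can be sketched end to end. The following is a simplified, 1-D-action toy (not the released implementation): CEM samples action sequences, each is rolled out through a stochastic dynamics stand-in several times to estimate the chance constraints, infeasible plans are discarded, and feasible plans are ranked by mean return plus terminal value. All of the callables passed in here are hypothetical stand-ins for the learned models.

```python
import numpy as np

def cem_plan(z0, dynamics, reward, value, in_safe_set, violates, horizon=5,
             pop=200, elites=20, iters=5, n_particles=10, delta=0.2, seed=0):
    """Sketch of the LS3 planning loop: CEM over H-step action sequences with
    Monte Carlo chance constraints on Safe Set membership and obstacles."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(iters):
        actions = rng.normal(mean, std, size=(pop, horizon))
        scores = np.full(pop, -np.inf)
        for k in range(pop):
            total, safe_hits, clean = 0.0, 0, 0
            for _ in range(n_particles):      # particles estimate the chance constraints
                z, ok, ret = z0, True, 0.0
                for a in actions[k]:
                    z = dynamics(z, a, rng)
                    ok = ok and not violates(z)
                    ret += reward(z)
                safe_hits += in_safe_set(z)   # terminal-state Safe Set check (eq. 5)
                clean += ok                   # obstacle avoidance check (eq. 6)
                total += ret + value(z)
            if safe_hits >= (1 - delta) * n_particles and clean >= (1 - delta) * n_particles:
                scores[k] = total / n_particles
        elite = actions[np.argsort(scores)[-elites:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]  # execute only the first action, then replan

# Toy 1-D system: drive z toward 3 while never dipping below -2.
dyn = lambda z, a, rng: z + a + 0.01 * rng.normal()
a0 = cem_plan(0.0, dyn, reward=lambda z: 0.0, value=lambda z: -abs(z - 3),
              in_safe_set=lambda z: z < 8.0, violates=lambda z: z < -2.0)
print(np.isfinite(a0))  # True
```

Only the first optimized action is executed before re-encoding and replanning, which is what lets the effective Safe Set constraint relax as the set grows.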

Figure 3: Experimental Domains: LS3 is evaluated on 3 long-horizon, image-based simulation environments: a visual navigation domain where the goal is to navigate the blue point mass to the right goal set while avoiding the red obstacle; a 2-degree-of-freedom reacher arm where the task is to navigate around a red obstacle to reach the yellow goal set; and a sequential pushing task where the robot must push each of 3 blocks forward by a target displacement, from left to right. We also evaluate LS3 on a physical cable-routing task on a da Vinci surgical robot, where the goal is to guide a red cable to a green target without the cable or robot arm colliding with the blue obstacle. This requires learning visual dynamics, because the agent must model how the rest of the cable will deform during manipulation to avoid collisions with the obstacle.

5 Experiments

We evaluate LS3 on 3 robotic control tasks in simulation and a physical cable routing task on the da Vinci Research Kit (dVRK) [46]. Safe RL is of particular interest for surgical robots such as the dVRK: their delicate structure motivates safety, and their relatively imprecise controls motivate closed-loop control [18, 47]. We study whether LS3 can learn more safely and efficiently than algorithms that do not structure exploration based on prior task successes.

5.1 Comparisons

We compare LS3 to prior algorithms that behavior clone suboptimal demonstrations before exploring online (SACfD) [48], or that leverage offline reinforcement learning to initialize a policy from all offline data before updating it online (AWAC) [49]. For both of these comparisons we enforce constraints via a tuned reward penalty for constraint violations, as in [16]. We also implement a version of SACfD with a learned recovery policy (SACfD+RRL) using the Recovery RL algorithm [14], which uses prior constraint-violating data to try to avoid constraint-violating states. Finally, we compare LS3 to an ablated version without the Safe Set constraint in equation 5, denoted LS3 (no Safe Set), to evaluate whether the Safe Set promotes consistent task completion and stable learning. See the supplement for details on hyperparameters and offline data used for LS3 and the prior algorithms.

Figure 4: Simulation Experiment Results: Learning curves showing mean and standard error over 10 random seeds. LS3 consistently learns more quickly than the baselines, as well as the ablated algorithm without the Safe Set. Although SACfD and SACfD+RRL eventually achieve similar reward values, LS3 is much more sample efficient and stable across random seeds.

5.2 Evaluation Metrics

For each algorithm on each domain, we aggregate statistics over random seeds (10 for simulation experiments, 3 for the physical experiment), reporting the mean and standard error across seeds. We present learning curves that show the total reward for each training trajectory to study how efficiently LS3 and the comparisons learn each task. Because all tasks use the sparse, task-completion-based rewards defined in Section 3, the total reward for a trajectory is the negative of the time taken to reach the goal set, so more negative rewards correspond to slower convergence to G. Thus, for a task with horizon T, a total reward greater than -T implies successful task completion. The state is frozen in place upon constraint violation until the task horizon elapses. We report task success and constraint satisfaction rates for LS3 and the comparisons to study whether the algorithms consistently complete the task during learning and satisfy user-specified state-space constraints. LS3 collects a batch of trajectories between training phases on both the simulated and physical tasks, while the SACfD and AWAC comparisons update their parameters after each timestep; this yields a calibrated comparison in terms of the amount of data collected across algorithms.
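The reward-to-success conversion described above is mechanical and can be stated in a few lines (an illustrative helper, not part of the paper's code):

```python
def trajectory_stats(rewards, horizon):
    """Total reward and success flag for one training trajectory under the
    sparse reward of Section 3: -1 per step until the goal is reached, so a
    total reward greater than -horizon implies the goal was reached in time."""
    total = sum(rewards)
    return total, total > -horizon

print(trajectory_stats([-1, -1, -1, 0], horizon=10))  # (-3, True)
print(trajectory_stats([-1] * 10, horizon=10))        # (-10, False)
```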

5.3 Domains

In simulation, we evaluate LS3 on 3 vision-based continuous control domains, illustrated in Figure 3. We evaluate LS3 and the comparisons on a constrained visual navigation task (Pointmass Navigation), where the agent navigates from a fixed start state to a fixed goal set while avoiding a large central obstacle. We study this domain to gain intuition and to visualize the learned value function, goal/constraint indicators, and Safe Set in Figure 2. We then study a constrained image-based reaching task (Reacher) based on [50], where the objective is to navigate the end effector of a 2-link planar robotic arm to a yellow goal position without the end effector entering a red stay-out zone. Next, we study a challenging sequential image-based robotic pushing domain (Sequential Pushing), in which the objective is to push each of 3 blocks forward on the table without pushing them to either side, which would cause them to fall out of the workspace. Finally, we evaluate LS3 on an image-based physical experiment on the da Vinci Research Kit (dVRK) [51] (Figure 3), where the objective is to guide the endpoint of a cable to a goal region without letting the cable or end effector collide with an obstacle. The Pointmass Navigation and Reaching domains have shorter task horizons than the Sequential Pushing domain and the physical experiment; see the supplement for the exact horizons and more details on all domains.

5.4 Simulation Results

We find that LS3 learns more stably and efficiently than all comparisons across all simulated domains while converging to similar performance within 250 trajectories collected online (Figure 4). LS3 consistently completes the task during learning, while the comparisons, which do not learn a Safe Set to structure exploration based on prior successes, exhibit much less stable learning. Additionally, in Tables 1 and 2 we report the task success rate and constraint violation rate of all algorithms during training. LS3 achieves a significantly higher task success rate than the comparisons on all tasks. LS3 also violates constraints less often than the comparisons on the Reacher task, but violates them more often than SACfD and SACfD+RRL on the other domains, likely because those methods' much lower task success rates keep them out of the neighborhood of constraint-violating states during training. The AWAC comparison achieves very low task performance; while AWAC is designed for offline reinforcement learning, to the best of our knowledge it has not previously been evaluated on long-horizon, image-based tasks like those considered here, which we hypothesize are very challenging for it.

As expected, LS3 has a lower success rate when the Safe Set constraint is removed (LS3 (no Safe Set)). The Safe Set is particularly important in the sequential pushing task, where LS3 (no Safe Set) has a much lower task completion rate than LS3. See the supplement for details on experimental parameters and offline data used for LS3 and the comparisons, and for ablations studying the effect of the planning horizon and the threshold used to define the Safe Set.

Table 1: Task Success Rate (Pointmass Navigation, Reacher, Sequential Pushing): We present the mean and standard error of the training-time task completion rate over 10 random seeds. LS3 outperforms all comparisons across all 3 domains, with the gap increasing for the challenging sequential pushing task.

Table 2: Constraint Violation Rate (Pointmass Navigation, Reacher, Sequential Pushing): We report the mean and standard error of the training-time constraint violation rate over 10 random seeds. LS3 violates constraints less than comparisons on the Reacher task, but SACfD and SACfD+RRL achieve lower constraint violation rates on the Navigation and Pushing tasks, likely because their much lower task success rates keep them out of the neighborhood of constraint-violating regions.

5.5 Physical Results

In physical experiments, we compare LS3 to SACfD and SACfD+RRL (Figure 5) on the physical cable routing task illustrated in Figure 3. LS3 quickly outperforms the suboptimal demonstrations while succeeding at the task significantly more often than both comparisons, which are unable to learn the task and also violate constraints more often than LS3. We hypothesize that the difficulty of reasoning about cable collisions and deformation from images makes it challenging for the prior algorithms to make sufficient task progress, as they do not use prior successes to structure exploration. See the supplement for details on experimental parameters and offline data used for LS3 and the comparisons. Because physical data collection is expensive, we use data augmentation to expand the dataset used to train the goal and constraint classifiers, adding randomly sampled affine translations and perspective shifts to the images in the dataset.
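The augmentation idea above can be sketched with the simplest affine case, a random zero-padded translation (perspective shifts would need a full image-warping library, so this sketch covers only the translation component and is not the paper's augmentation pipeline):

```python
import numpy as np

def random_translate(img, max_shift, rng):
    """Shift an HxWxC image by a random (dy, dx), padding exposed pixels with 0:
    a simple stand-in for the affine-translation augmentation described above."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

rng = np.random.default_rng(0)
img = np.ones((32, 32, 3))
augmented = [random_translate(img, 4, rng) for _ in range(8)]  # expanded dataset
print(augmented[0].shape)  # (32, 32, 3)
```

Applying several such random transforms per real image multiplies the effective dataset size, which is what makes classifier training viable from a small number of physical rollouts.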

Figure 5: Physical Cable Routing Results: We present learning curves, task success rates, and constraint violation rates with mean and standard error across 3 random seeds. LS3 quickly learns to complete the task more efficiently than the demonstrator while violating constraints less often than the comparisons, which are unable to learn the task.

6 Discussion and Future Work

We present LS3, a scalable algorithm for safe and efficient policy learning for visuomotor tasks. LS3 structures exploration by learning a safe set in a learned latent space, which captures the set of states from which the agent is confident in task completion. LS3 then ensures that the agent can plan back to states in the safe set, encouraging consistent task completion during learning. Experiments suggest that LS3 is able to use this procedure to safely and efficiently learn 4 visuomotor control tasks, including a challenging sequential pushing task in simulation and a cable routing task on a physical robot. In future work, we are excited to explore further physical evaluation of LS3 on safety-critical visuomotor control tasks such as navigation for home support robots or surgical automation.


  • Ebert et al. [2018] F. Ebert, C. Finn, S. Dasari, A. Xie, A. Lee, and S. Levine. Visual foresight: Model-based deep reinforcement learning for vision-based robotic control. arXiv preprint arXiv:1812.00568, 2018.
  • Hafner et al. [2019] D. Hafner, T. Lillicrap, I. Fischer, R. Villegas, D. Ha, H. Lee, and J. Davidson. Learning latent dynamics for planning from pixels. Proc. Int. Conf. on Machine Learning, 2019.
  • Hoque et al. [2020] R. Hoque, D. Seita, A. Balakrishna, A. Ganapathi, A. K. Tanwani, N. Jamali, K. Yamane, S. Iba, and K. Goldberg. Visuospatial foresight for multi-step, multi-task fabric manipulation. Proc. Robotics: Science and Systems (RSS), 2020.
  • Lenz et al. [2015] I. Lenz, R. A. Knepper, and A. Saxena. Deepmpc: Learning deep latent features for model predictive control. In Robotics: Science and Systems. Rome, Italy, 2015.
  • Nair and Finn [2019] S. Nair and C. Finn. Hierarchical foresight: Self-supervised learning of long-horizon tasks via visual subgoal generation. Proc. Int. Conf. on Learning Representations, 2019.
  • Nair et al. [2020] S. Nair, S. Savarese, and C. Finn. Goal-aware prediction: Learning to model what matters. In Proceedings of the 37th International Conference on Machine Learning, pages 7207–7219, 2020.
  • Pertsch et al. [2020] K. Pertsch, O. Rybkin, F. Ebert, C. Finn, D. Jayaraman, and S. Levine. Long-horizon visual planning with goal-conditioned hierarchical predictors. Proc. Advances in Neural Information Processing Systems, 2020.
  • Tian et al. [2021] S. Tian, S. Nair, F. Ebert, S. Dasari, B. Eysenbach, C. Finn, and S. Levine. Model-based visual planning with self-supervised functional distances. Proc. Int. Conf. on Learning Representations, 2021.
  • Haarnoja et al. [2018] T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, and S. Levine. Soft actor-critic algorithms and applications, 2018.
  • Nair et al. [2018] A. Nair, V. Pong, M. Dalal, S. Bahl, S. Lin, and S. Levine. Visual reinforcement learning with imagined goals. Proc. Advances in Neural Information Processing Systems, 2018.
  • Levine et al. [2016] S. Levine, C. Finn, T. Darrell, and P. Abbeel. End-to-end training of deep visuomotor policies. Journal of Machine Learning Research, 2016.
  • Kalashnikov et al. [2018] D. Kalashnikov, A. Irpan, P. Pastor, J. Ibarz, A. Herzog, E. Jang, D. Quillen, E. Holly, M. Kalakrishnan, V. Vanhoucke, and S. Levine. Qt-opt: Scalable deep reinforcement learning for vision-based robotic manipulation. Conf. on Robot Learning (CoRL), 2018.
  • Pong et al. [2020] V. H. Pong, M. Dalal, S. Lin, A. Nair, S. Bahl, and S. Levine. Skew-fit: State-covering self-supervised reinforcement learning. Proc. Int. Conf. on Machine Learning, 2020.
  • Thananjeyan et al. [2020] B. Thananjeyan, A. Balakrishna, S. Nair, M. Luo, K. Srinivasan, M. Hwang, J. E. Gonzalez, J. Ibarz, C. Finn, and K. Goldberg. Recovery rl: Safe reinforcement learning with learned recovery zones. NeurIPS Deep Reinforcement Learning Workshop, 2020.
  • Achiam et al. [2017] J. Achiam, D. Held, A. Tamar, and P. Abbeel. Constrained policy optimization. In Journal of Machine Learning Research, 2017.
  • Tessler et al. [2019] C. Tessler, D. J. Mankowitz, and S. Mannor. Reward constrained policy optimization. In Proc. Int. Conf. on Learning Representations, 2019.
  • Thananjeyan et al. [2020a] B. Thananjeyan, A. Balakrishna, U. Rosolia, J. E. Gonzalez, A. Ames, and K. Goldberg. Abc-lmpc: Safe sample-based learning mpc for stochastic nonlinear dynamical systems with adjustable boundary conditions, 2020a.
  • Thananjeyan et al. [2020b] B. Thananjeyan, A. Balakrishna, U. Rosolia, F. Li, R. McAllister, J. E. Gonzalez, S. Levine, F. Borrelli, and K. Goldberg. Safety augmented value estimation from demonstrations (saved): Safe deep model-based rl for sparse cost robotic tasks. IEEE Robotics and Automation Letters, 5(2):3612–3619, 2020b.
  • Rosolia and Borrelli [2018] U. Rosolia and F. Borrelli. Learning model predictive control for iterative tasks. a data-driven control framework. IEEE Transactions on Automatic Control, 2018.
  • Bansal et al. [2017] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin. Hamilton-jacobi reachability: A brief overview and recent advances. In Conference on Decision and Control (CDC), 2017.
  • Bristow et al. [2006] D. A. Bristow, M. Tharayil, and A. G. Alleyne. A survey of iterative learning control. IEEE control systems magazine, 2006.
  • Rosolia et al. [2018] U. Rosolia, X. Zhang, and F. Borrelli. A Stochastic MPC Approach with Application to Iterative Learning. 2018 IEEE Conference on Decision and Control (CDC), 2018.
  • Rosolia and Borrelli [2019] U. Rosolia and F. Borrelli. Sample-based learning model predictive control for linear uncertain systems. CoRR, abs/1904.06432, 2019.
  • Deisenroth and Rasmussen [2011] M. Deisenroth and C. Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In Proc. Int. Conf. on Machine Learning, 2011.
  • Lenz et al. [2015] I. Lenz, R. A. Knepper, and A. Saxena. DeepMPC: Learning deep latent features for model predictive control. In Robotics: Science and Systems, 2015.
  • Fu et al. [2016] J. Fu, S. Levine, and P. Abbeel. One-shot learning of manipulation skills with online dynamics adaptation and neural network priors. In Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2016.
  • Lowrey et al. [2019] K. Lowrey, A. Rajeswaran, S. Kakade, E. Todorov, and I. Mordatch. Plan online, learn offline: Efficient learning and exploration via model-based control. In Proc. Int. Conf. on Machine Learning, 2019.
  • Chua et al. [2018] K. Chua, R. Calandra, R. McAllister, and S. Levine. Deep reinforcement learning in a handful of trials using probabilistic dynamics models. In Proc. Advances in Neural Information Processing Systems, 2018.
  • Nagabandi et al. [2018] A. Nagabandi, G. Kahn, R. S. Fearing, and S. Levine. Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning. In Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2018.
  • Quinlan and Khatib [1993] S. Quinlan and O. Khatib. Elastic bands: connecting path planning and control. In International Conference on Robotics and Automation, pages 802–807 vol.2, 1993.
  • Ichnowski et al. [2020] J. Ichnowski, Y. Avigal, V. Satish, and K. Goldberg. Deep learning can accelerate grasp-optimized motion planning. Science Robotics, 5(48), 2020.
  • Zhu et al. [2021] Z. Zhu, N. Pivaroa, S. Gupta, A. Gupta, and M. Canova. 2021.
  • Lippi et al. [2020] M. Lippi, P. Poklukar, M. C. Welle, A. Varava, H. Yin, A. Marino, and D. Kragic. Latent space roadmap for visual action planning of deformable and rigid object manipulation. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020.
  • Rusu et al. [2017] A. A. Rusu, M. Večerík, T. Rothörl, N. Heess, R. Pascanu, and R. Hadsell. Sim-to-real robot learning from pixels with progressive nets. In Conference on Robot Learning, pages 262–270. PMLR, 2017.
  • Schoettler et al. [2020] G. Schoettler, A. Nair, J. Luo, S. Bahl, J. A. Ojea, E. Solowjow, and S. Levine. Deep reinforcement learning for industrial insertion tasks with visual inputs and natural rewards. Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2020.
  • Singh et al. [2019] A. Singh, L. Yang, K. Hartikainen, C. Finn, and S. Levine. End-to-end robotic reinforcement learning without reward engineering. Proc. Robotics: Science and Systems (RSS), 2019.
  • Zhang et al. [2019] M. Zhang, S. Vikram, L. Smith, P. Abbeel, M. Johnson, and S. Levine. Solar: Deep structured representations for model-based reinforcement learning. In International Conference on Machine Learning, pages 7444–7453. PMLR, 2019.
  • Kahn et al. [2017] G. Kahn, A. Villaflor, V. Pong, P. Abbeel, and S. Levine. Uncertainty-aware reinforcement learning for collision avoidance. CoRR, 2017.
  • Srinivas et al. [2018] A. Srinivas, A. Jabri, P. Abbeel, S. Levine, and C. Finn. Universal planning networks. Proc. Int. Conf. on Machine Learning, 04 2018.
  • Ichter and Pavone [2019] B. Ichter and M. Pavone. Robot motion planning in learned latent spaces. IEEE Robotics and Automation Letters, 4(3):2407–2414, 2019. doi: 10.1109/LRA.2019.2901898.
  • Higgins et al. [2017] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner. beta-vae: Learning basic visual concepts with a constrained variational framework. Proc. Int. Conf. on Learning Representations, 2017.
  • Laskin et al. [2020] M. Laskin, K. Lee, A. Stooke, L. Pinto, P. Abbeel, and A. Srinivas. Reinforcement learning with augmented data. 2020. arXiv:2004.14990.
  • Rubinstein [1999] R. Rubinstein. The cross-entropy method for combinatorial and continuous optimization. Methodology and computing in applied probability, 1(2):127–190, 1999.
  • Zhang et al. [2020] J. Zhang, B. Cheung, C. Finn, S. Levine, and D. Jayaraman. Cautious adaptation for reinforcement learning in safety-critical settings. In International Conference on Machine Learning, pages 11055–11065. PMLR, 2020.
  • Kazanzides et al. [2014] P. Kazanzides, Z. Chen, A. Deguet, G. S. Fischer, R. H. Taylor, and S. P. DiMaio. An open-source research kit for the da Vinci surgical system. In Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2014.
  • Seita et al. [2018] D. Seita, S. Krishnan, R. Fox, S. McKinley, J. Canny, and K. Goldberg. Fast and reliable autonomous surgical debridement with cable-driven robots using a two-phase calibration procedure. In Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2018.
  • Haarnoja et al. [2018] T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proc. Int. Conf. on Machine Learning, 2018.
  • Nair et al. [2021] A. Nair, A. Gupta, M. Dalal, and S. Levine. Awac: Accelerating online reinforcement learning with offline datasets, 2021.
  • Tassa et al. [2020] Y. Tassa, S. Tunyasuvunakool, A. Muldal, Y. Doron, S. Liu, S. Bohez, J. Merel, T. Erez, T. Lillicrap, and N. Heess. dm-control: Software and tasks for continuous control, 2020.
  • Chua [2018] K. Chua. Experiment code for "Deep reinforcement learning in a handful of trials using probabilistic dynamics models", 2018.
  • Pong [2018] V. Pong. rlkit, 2018.
  • Thananjeyan and Balakrishna [2021] B. Thananjeyan and A. Balakrishna. Code for Recovery RL, 2021.
  • Sikchi [2021] H. Sikchi. Code for Advantage Weighted Actor Critic, 2021.
  • Todorov et al. [2012] E. Todorov, T. Erez, and Y. Tassa. MuJoCo: A physics engine for model-based control. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pages 5026–5033. IEEE, 2012.

7 Appendix

In Appendices 7.1 and 7.2 we discuss algorithmic details and implementation/hyperparameter details, respectively, for LS3 and all comparisons. We then provide full details regarding each of the experimental domains and how data is collected in them in Appendix 7.3. Finally, in Appendix 7.4 we present sensitivity experiments and ablations.

7.1 Algorithm Details

In this section, we provide implementation details and additional background information for LS3 and the comparison algorithms.

7.1.1 Latent Space Safe Sets (LS3)

We now discuss additional details for each of the components of LS3, including network architectures, training data, and loss functions.

Variational Autoencoders:

We scale all image inputs to a fixed size before feeding them to the β-VAE, which uses a convolutional neural network for the encoder $f_{\text{enc}}$ and a transpose convolutional neural network for the decoder $f_{\text{dec}}$. We use the encoder and decoder from Hafner et al. [2], but modify the second convolutional layer in the encoder to have a stride of 3 rather than 2. As is standard for β-VAEs, we train with a mean-squared error loss combined with a KL-divergence loss. For a particular observation $s$, the loss is

$$\mathcal{L}(s) = \left\| s - f_{\text{dec}}(z) \right\|_2^2 + \beta \, D_{\mathrm{KL}}\!\left( f_{\text{enc}}(z \mid s) \,\middle\|\, \mathcal{N}(0, I) \right),$$

where $z \sim f_{\text{enc}}(\cdot \mid s)$ is modeled using the reparameterization trick.
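This objective can be sketched as follows; a minimal numpy illustration of the two loss terms, assuming the encoder outputs a mean and log-variance and `decode` stands in for the transpose-convolutional decoder (an autodiff framework would be used in practice so gradients flow through the reparameterized sample):

```python
import numpy as np

def beta_vae_loss(s, mu, log_var, decode, beta=1.0, rng=None):
    """MSE reconstruction plus a beta-weighted KL(q(z|s) || N(0, I)) term."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps        # reparameterization trick
    recon = np.mean((s - decode(z)) ** 2)       # mean-squared error term
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I)
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return recon + beta * kl
```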

Probabilistic Dynamics:

As in Chua et al. [44], we train a probabilistic ensemble of neural networks to learn dynamics. Each network has two hidden layers with 128 hidden units and outputs the mean and diagonal covariance of a Gaussian distribution over the next latent state. We train these networks with a maximum log-likelihood objective, so for two consecutive latent states $z_t, z_{t+1}$ and the corresponding action $a_t$, the loss for a dynamics model with parameters $\theta$ is

$$\mathcal{L}(\theta) = -\log p_\theta(z_{t+1} \mid z_t, a_t) \propto \left( \mu_\theta(z_t, a_t) - z_{t+1} \right)^\top \Sigma_\theta^{-1}(z_t, a_t) \left( \mu_\theta(z_t, a_t) - z_{t+1} \right) + \log \det \Sigma_\theta(z_t, a_t).$$

When using the learned dynamics model for planning, we use the TS-1 method from Chua et al. [44].
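A per-transition sketch of this objective (diagonal covariance parameterized by a log-variance, constant terms dropped since they do not affect gradients; each ensemble member is trained independently on its own bootstrap of the data):

```python
import numpy as np

def dynamics_nll(mu, log_var, z_next):
    """Negative log-likelihood of z_next under the predicted diagonal
    Gaussian N(mu, diag(exp(log_var))), up to an additive constant."""
    return 0.5 * np.sum((mu - z_next) ** 2 * np.exp(-log_var) + log_var)
```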

Value Functions:

As discussed in Section 4.3, we train an ensemble of recursively defined value functions to predict long-term reward. We represent these functions using fully connected neural networks with 3 hidden layers of 256 hidden units each. Similarly to [18], we use separate training objectives during offline and online training. During offline training, we train the value function to predict the actual discounted reward-to-go on all trajectories in the offline dataset. Hence, for a latent vector $z_t$, the offline loss for a value function $V_\phi$ with parameters $\phi$ is

$$\mathcal{L}_{\text{offline}}(\phi) = \left( V_\phi(z_t) - \sum_{t'=t}^{T} \gamma^{t'-t} r_{t'} \right)^2.$$

Then, in online training, we also store a target network and use it to calculate a temporal difference (TD-1) error,

$$\mathcal{L}_{\text{TD}}(\phi) = \left( V_\phi(z_t) - \left( r_t + \gamma V_{\phi'}(z_{t+1}) \right) \right)^2,$$

where $\phi'$ are the parameters of a lagged target network, corresponding to the policy at the timestep at which the target network was last set. We update the target network every 100 updates. In each of these equations, $\gamma$ is a discount factor (we use $\gamma = 0.99$). Because all episodes end by hitting a time horizon, we found it beneficial to remove the terminal-mask multiplier usually used with TD-1 error losses.
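The online objective can be sketched as a squared TD-1 error against the frozen target network (function names are illustrative; in training the target parameters would be hard-copied from the online parameters every 100 gradient updates):

```python
def td1_loss(v, v_target, z, z_next, r, gamma=0.99):
    """Squared TD-1 error against a lagged target network.

    v, v_target: callables mapping a latent state to a scalar value estimate.
    No terminal mask is applied, since every episode here ends only by
    reaching the time horizon.
    """
    target = r + gamma * v_target(z_next)  # bootstrap from the frozen target net
    return (v(z) - target) ** 2
```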

For all simulated experiments, we update value functions using only data collected by the suboptimal demonstrator or collected online, ignoring offline data collected via random interactions or offline demonstrations of constraint-violating behavior.

Constraint and Goal Estimators:

We represent the constraint indicator with a neural network with 3 hidden layers of 256 hidden units each, trained with a binary cross-entropy loss using constraint-violating transitions as unsafe examples and constraint-satisfying states as safe examples. Similarly, we represent the goal estimator with a neural network with 3 hidden layers of 256 hidden units each. This estimator is also trained with a binary cross-entropy loss, with positive examples drawn from goal states and negative examples sampled from all datasets. For both the constraint estimator and the goal indicator, training data is sampled uniformly from a replay buffer containing all offline and online data.

Safe Set:

The Safe Set classifier $f_S$ is represented with a neural network with 3 hidden layers of 256 hidden units each. We train the safe set classifier to predict

$$y_t = \max\!\left( \mathbb{1}_{\mathcal{S}}(s_t),\; \gamma_S \, f_S(z_{t+1}) \right)$$

using a binary cross-entropy loss, where $\mathbb{1}_{\mathcal{S}}(s_t)$ is an indicator function for whether $s_t$ is part of a successful trajectory and $\gamma_S$ is the Safe Set Bellman coefficient reported in Table 3. Training data is sampled uniformly from a replay buffer containing all of the collected data.

Cross Entropy Method:

We use the cross-entropy method to solve the optimization problem in Equation 2. We build on the implementation of the cross-entropy method provided in [52], which works by sampling a population of action sequences from a diagonal Gaussian distribution, simulating each one several times over the learned dynamics, and refitting the parameters of the Gaussian on the trajectories with the highest score under Equation 2, where constraints are implemented by assigning large negative rewards to trajectories which violate either the Safe Set constraint or user-specified constraints. This process is repeated for a fixed number of iterations to iteratively refine the set of sampled trajectories. To improve the optimizer's efficiency on tasks where subsequent actions are often correlated, we sample a proportion of the optimizer's candidates at the first iteration from the distribution it learned when planning the last action. To avoid local minima, we sample a proportion uniformly from the action space. See Chua et al. [44] for more details on the cross-entropy method as applied to planning over neural network dynamics models.

As mentioned in Section 4.4, we set the Safe Set threshold adaptively by checking whether at least one sampled plan satisfies the Safe Set constraint at each CEM iteration. If no such plan exists, we multiply the threshold by a constant factor less than 1 and re-initialize the optimizer at the first CEM iteration with the new threshold. This ensures that the threshold is set such that planning back to the Safe Set is possible. The initial threshold value is reported in Table 3.
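The planner with the adaptive threshold can be sketched as follows. This is a simplified illustration, not the paper's implementation: all names, the population sizes, and the shrink factor are ours, and the warm-starting and uniform-sampling proportions described above are omitted for brevity:

```python
import numpy as np

def cem_plan(score, safe_frac, horizon=5, act_dim=2, pop=200, elites=20,
             iters=5, delta=0.8, shrink=0.5, rng=None):
    """CEM over action sequences with an adaptive safe-set threshold.

    score(seq): reward of an H-step sequence under the learned model (large
      negative if it violates user-specified constraints).
    safe_frac(seq): fraction of dynamics particles whose terminal state the
      safe-set classifier accepts; a plan is feasible if this exceeds delta.
    """
    rng = rng or np.random.default_rng()
    mu = np.zeros((horizon, act_dim))
    sigma = np.ones((horizon, act_dim))
    i = 0
    while i < iters:
        seqs = mu + sigma * rng.standard_normal((pop, horizon, act_dim))
        feasible = np.array([safe_frac(s) >= delta for s in seqs])
        if not feasible.any():
            # No sampled plan reaches the safe set: relax the threshold and
            # restart the optimizer from the first CEM iteration.
            delta *= shrink
            mu = np.zeros((horizon, act_dim))
            sigma = np.ones((horizon, act_dim))
            i = 0
            continue
        scores = np.where(feasible, np.array([score(s) for s in seqs]), -np.inf)
        elite = seqs[np.argsort(scores)[-elites:]]       # top-scoring feasible plans
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6
        i += 1
    return mu[0]  # execute the first action of the refined mean sequence
```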

7.1.2 Soft Actor-Critic from Demonstrations (SACfD)

We utilize the implementation of the Soft Actor-Critic algorithm from [53] and initialize the actor and critic from demonstrations, keeping all other hyperparameters at the defaults in the provided implementation. We create a new dataset using only data from the suboptimal demonstrator, and use this data to behavior-clone the actor and initialize the critic with offline Bellman backups. We use the same mean-squared behavior-cloning loss as for the behavior-cloned policy, but only train the mean of the SAC policy: for a policy with mean $\mu_\psi$ and parameters $\psi$, we use the loss $\mathcal{L}(\psi) = \sum_{i}\sum_{t} \| \mu_\psi(s^i_t) - a^i_t \|_2^2$, where $s^i_t$ and $a^i_t$ are the state and action at timestep $t$ of demonstrator trajectory $i$. We also experimented with training the SAC critic on all data provided to LS3 but found that this hurt performance. We use the architecture from [53] and update neural network weights using an Adam optimizer. The only hyperparameter for SACfD that we tuned across environments was the reward penalty imposed upon constraint violations. For all simulation experiments, we evaluated several penalty values and report the highest performing one; the Reacher task used a different value from the other tasks. We observed that higher penalty values resulted in worse task performance without a significant increase in constraint satisfaction. We hypothesize that because the agent is frozen in the environment upon constraint violations, the resulting loss of reward is sufficient to enable SACfD to avoid constraint violations.

7.1.3 Soft Actor-Critic from Demonstrations with Learned Recovery Zones (SACfD+RRL)

We build on the implementation of the Recovery RL algorithm [14] provided in [54]. We train the safety critic on all offline data. Recovery RL uses SACfD as its task-policy optimization algorithm and introduces two new hyperparameters, $\gamma_{\text{risk}}$ and $\epsilon_{\text{risk}}$. For each of the simulation environments, we evaluated SACfD+RRL across 3-4 settings of these hyperparameters and report results from the highest performing run; we tune them separately for the navigation, reacher, sequential pushing, and cable routing environments.

7.1.4 Advantage Weighted Actor-Critic (AWAC)

To provide a comparison to state-of-the-art offline reinforcement learning algorithms, we evaluate AWAC [49] on the experimental domains in this work. We use the implementation of AWAC from [55]. For all simulation experiments, we evaluated several hyperparameter settings and report the highest performing one, using the same setting for all experiments. We used the default settings from [55] for all other hyperparameters.

7.2 LS3 Implementation Details

Parameter Navigation Reacher Sequential Pushing Cable Routing
Safe set threshold 0.8 0.5 0.8 0.8
Constraint threshold 0.2 0.2 0.2 0.2
Planning horizon 5 3 3 5
Dynamics particles 20 20 20 20
CEM samples per iteration 1000 1000 1000 2000
CEM elites 100 100 100 200
CEM iterations 5 5 5 5
Latent dimension 32 32 32 32
VAE KL weight 1.0 1.0 1.0 0.3
Frame Stacking No Yes No No
Batch Size 256 256 256 256
Discount factor 0.99 0.99 0.99 0.99
Safe set Bellman coefficient 0.3 0.3 0.9 0.9
Table 3: Hyperparameters for LS3

In Table 3, we present the hyperparameters used to train and run LS3. We present the Safe Set and constraint thresholds, the planning horizon, and the VAE KL regularization weight β. We also present the number of particles sampled over the probabilistic latent dynamics model for a fixed action sequence, which is used to estimate the probability of constraint satisfaction and the expected rewards. For the cross-entropy method, we sample a population of action sequences at each iteration, take the best-scoring sequences, and refit the sampling distribution; this process repeats for a fixed number of iterations. We also report the latent space dimension, whether frame stacking is used as input, the training batch size, and the discount factor γ. Finally, we present values of the Safe Set Bellman coefficient. For all domains, we scale RGB observations to a fixed size. For all modules we use the Adam optimizer, with the dynamics model trained at a different learning rate than the other modules.

7.3 Experimental Domain Details

7.3.1 Navigation

The visual navigation domain has 2-D single-integrator dynamics with additive zero-mean Gaussian noise. The agent starts at a fixed position, and the goal set is a Euclidean ball of fixed radius. The demonstrations are created by guiding the agent north for 20 timesteps, east for 40 timesteps, and then directly towards the goal until the episode terminates. This tuned controller ensures that demonstrations avoid the obstacle and reach the goal set, but they are very suboptimal. To collect demonstrations of constraint-violating behavior, we randomly sample starting points throughout the environment, move in a random direction for 15 timesteps, and then move directly towards the obstacle. We do not collect additional random-interaction data in this environment. We collect 50 demonstrations of successful behavior and 50 trajectories containing constraint-violating behavior.
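The scripted demonstrator described above can be sketched as follows; the step size, noise scale, goal radius, and coordinate conventions here are illustrative values, not the paper's exact parameters:

```python
import numpy as np

def demo_trajectory(start, goal, noise_std=0.05, max_steps=100, rng=None):
    """Suboptimal demonstrator for the pointmass navigation task: head north
    for 20 steps, east for 40 steps, then straight toward the goal.
    Single-integrator dynamics: x <- x + u + Gaussian noise."""
    rng = rng or np.random.default_rng()
    x = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    traj = []
    for t in range(max_steps):
        if t < 20:
            u = np.array([0.0, 1.0])                    # north
        elif t < 60:
            u = np.array([1.0, 0.0])                    # east
        else:
            d = goal - x
            u = d / (np.linalg.norm(d) + 1e-8)          # unit step toward goal
        x = x + u + noise_std * rng.standard_normal(2)
        traj.append(x.copy())
        if np.linalg.norm(x - goal) < 0.5:              # inside the goal ball
            break
    return np.array(traj)
```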

7.3.2 Reacher

The reacher domain is built on the reacher domain provided in the DeepMind Control Suite [50]. The robot is a planar 2-link arm, and the agent supplies torques to each of the 2 joints. Because velocity is not observable from a single frame, algorithms are provided with several stacked frames as input. The start position of the end effector is fixed, and the objective is to navigate the end effector to a fixed goal set in the top left of the workspace without allowing the end effector to enter a large red stay-out zone. To collect constraint-violating data, we randomly sample starting states in the environment and then use a PID controller to move towards the constraint. To sample random data that requires the agent to model velocity for accurate prediction, we start trajectories at random places in the environment and then sample each action from a normal distribution centered around the previous action. We collect 50 demonstrations of successful behavior, 50 trajectories containing constraint violations, and 100 short trajectories of random data.

7.3.3 Sequential Pushing

The sequential pushing environment is implemented in MuJoCo [56], and the robot specifies a desired planar displacement for its end effector. The goal is to push all 3 blocks backwards on the table by at least some displacement, but constraints are violated if blocks are pushed backwards off of the table. Demonstrations are created by guiding the end effector to the center of each block and then moving the end effector in a straight line at low velocity until the block is in the goal set; this process is repeated for each of the 3 blocks. Data containing constraint violations and random transitions is collected by randomly switching between a policy that moves towards the blocks and a policy that samples uniformly from the action space. We collect 500 demonstrations of successful behavior and 300 trajectories of random and/or constraint-violating behavior.

7.3.4 Physical Cable Routing

This task starts with the robot grasping one endpoint of the red cable, and it can make motions with its end effector. The goal is to guide the red cable to intersect with the green goal set while avoiding the blue obstacle. The ground-truth goal and obstacle checks are performed with color masking. LS3 and all baselines are provided with a segmentation mask of the cable as input. The demonstrator generates trajectories by moving the end effector well over the obstacle and to the right before executing a straight-line trajectory to the goal set. This ensures that the demonstrator avoids the obstacle with significant margin, but its trajectories may not be optimal for the task. Random trajectories are collected by following a demonstrator trajectory for some random amount of time and then sampling from the action space until the episode hits the time horizon. We collect 420 demonstrations of successful behavior and 150 random trajectories.

7.4 Sensitivity Experiments

Key hyperparameters in LS3 are the constraint threshold and Safe Set threshold, which control whether the agent deems predicted states constraint-violating or inside the Safe Set, respectively. We ablate these parameters for the Sequential Pushing environment in Figures 6 and 7. We find that lower values of the constraint threshold make the agent less likely to violate constraints, as expected. Additionally, we find that higher values of the Safe Set threshold help constrain exploration more effectively, but too high a threshold leads to poor performance as the agent exploits local maxima in the Safe Set estimation. Finally, we ablate the planning horizon for LS3 (Figure 8) and find that when the horizon is too high, LS3 can explore too aggressively away from the safe set, leading to poor performance. When the horizon is lower, LS3 explores much more stably, but if it is too low, LS3 is eventually unable to explore significantly new plans, while slightly increasing the horizon allows for continuous improvement in performance.

Figure 6: Hyperparameter Sweep for LS3 Constraint Threshold: Plots show mean and standard error over 10 random seeds for experiments with different settings of the constraint threshold on the sequential pushing environment. As expected, we see that without avoiding latent space obstacles (No Constraints) the agent violates constraints more often, while lower thresholds (meaning the planning algorithm is more conservative) generally lead to fewer violations.
Figure 7: Hyperparameter Sweep for LS3 Safe Set Threshold: Plots show mean and standard error over 10 random seeds for experiments with different settings of the Safe Set threshold on the sequential pushing environment. We see that after offline training, the agent can successfully complete the task only when the threshold is high enough to sufficiently guide exploration, and that runs with higher threshold values are more successful overall.
Figure 8: Hyperparameter Sweep for LS3 Planning Horizon: Plots show mean and standard error over 10 random seeds for experiments with different settings of the planning horizon on the sequential pushing environment. We see that when the planning horizon is too high, the agent cannot reliably complete the task due to modeling errors. When the planning horizon is too low, it learns quickly but cannot significantly improve because it is constrained to the safe set. We found an intermediate horizon to balance this trade-off best.