You Only Live Once: Single-Life Reinforcement Learning

10/17/2022
by   Annie S. Chen, et al.
3

Reinforcement learning algorithms are typically designed to learn a performant policy that can repeatedly and autonomously complete a task, usually starting from scratch. However, in many real-world situations, the goal might not be to learn a policy that can do the task repeatedly, but simply to perform a new task successfully once in a single trial. For example, imagine a disaster relief robot tasked with retrieving an item from a fallen building, where it cannot get direct supervision from humans. It must retrieve this object within one test-time trial, and must do so while tackling unknown obstacles, though it may leverage knowledge it has of the building before the disaster. We formalize this problem setting, which we call single-life reinforcement learning (SLRL), where an agent must complete a task within a single episode without interventions, utilizing its prior experience while contending with some form of novelty. SLRL provides a natural setting to study the challenge of autonomously adapting to unfamiliar situations, and we find that algorithms designed for standard episodic reinforcement learning often struggle to recover from out-of-distribution states in this setting. Motivated by this observation, we propose an algorithm, Q-weighted adversarial learning (QWALE), which employs a distribution matching strategy that leverages the agent's prior experience as guidance in novel situations. Our experiments on several single-life continuous control problems indicate that methods based on our distribution matching formulation are 20-60 more quickly recover from novel states.

READ FULL TEXT
research
05/11/2022

A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning

While reinforcement learning (RL) provides a framework for learning thro...
research
04/05/2022

Automating Reinforcement Learning with Example-based Resets

Deep reinforcement learning has enabled robots to learn motor skills fro...
research
02/27/2019

Introspection Learning

Traditional reinforcement learning agents learn from experience, past or...
research
07/27/2021

Persistent Reinforcement Learning via Subgoal Curricula

Reinforcement learning (RL) promises to enable autonomous acquisition of...
research
05/25/2020

Policy Entropy for Out-of-Distribution Classification

One critical prerequisite for the deployment of reinforcement learning s...
research
01/06/2020

Learning Reusable Options for Multi-Task Reinforcement Learning

Reinforcement learning (RL) has become an increasingly active area of re...
research
10/19/2022

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

A long-term goal of reinforcement learning is to design agents that can ...

Please sign up or login with your details

Forgot password? Click here to reset