Does Zero-Shot Reinforcement Learning Exist?

09/29/2022
by Ahmed Touati, et al.

A zero-shot RL agent is an agent that can solve any RL task in a given environment, instantly with no additional planning or learning, after an initial reward-free learning phase. This marks a shift from the reward-centric RL paradigm towards "controllable" agents that can follow arbitrary instructions in an environment. Current RL agents can solve families of related tasks at best, or require planning anew for each task. Strategies for approximate zero-shot RL have been suggested using successor features (SFs) [BBQ+18] or forward-backward (FB) representations [TO21], but testing has been limited. After clarifying the relationships between these schemes, we introduce improved losses and new SF models, and test the viability of zero-shot RL schemes systematically on tasks from the Unsupervised RL benchmark [LYL+21]. To disentangle universal representation learning from exploration, we work in an offline setting and repeat the tests on several existing replay buffers. SFs appear to suffer from the choice of the elementary state features. SFs with Laplacian eigenfunctions do well, while SFs based on auto-encoders, inverse curiosity, transition models, low-rank transition matrix, contrastive learning, or diversity (APS), perform inconsistently. In contrast, FB representations jointly learn the elementary and successor features from a single, principled criterion. They perform best and consistently across the board, reaching 85% of supervised RL performance in a zero-shot manner.
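As a rough illustration of the zero-shot step in the successor-feature framework the abstract refers to: after reward-free pretraining has produced elementary state features φ(s) and successor features ψ(s, a), a new task's reward is regressed onto φ to infer a task vector w, and the policy is read off from ψ·w with no further planning or learning. The sketch below is a toy with random placeholder features, not the paper's implementation; all shapes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, d = 50, 4, 8

# Placeholders standing in for pretrained representations:
phi = rng.normal(size=(n_states, d))             # elementary state features phi(s)
psi = rng.normal(size=(n_states, n_actions, d))  # successor features psi(s, a)

# A new task arrives as (state, reward) samples; rewards are assumed
# (approximately) linear in phi, which is the SF framework's premise.
w_true = rng.normal(size=d)
rewards = phi @ w_true + 0.01 * rng.normal(size=n_states)

# Zero-shot step 1: infer the task vector w by regressing r on phi(s).
w, *_ = np.linalg.lstsq(phi, rewards, rcond=None)

# Zero-shot step 2: Q(s, a) ~ psi(s, a) . w, then act greedily --
# no additional planning or learning for the new task.
q_values = psi @ w               # shape (n_states, n_actions)
policy = q_values.argmax(axis=1) # greedy action per state
```

The same two-step structure underlies FB representations, except that there the elementary and successor features are learned jointly rather than fixed in advance.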


