Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

10/02/2020
by Lanqing Li, et al.

We study the offline meta-reinforcement learning (OMRL) problem, a paradigm that enables reinforcement learning (RL) algorithms to quickly adapt to unseen tasks without any interaction with the environment, making RL truly practical in many real-world applications. This problem is still not fully understood, and two major challenges need to be addressed. First, offline RL often suffers from bootstrapping errors on out-of-distribution state-action pairs, which lead to divergence of value functions. Second, meta-RL requires efficient and robust task inference learned jointly with the control policy. In this work, we enforce behavior regularization on the learned policy as a general approach to offline RL, combined with a deterministic context encoder for efficient task inference. We propose a novel negative-power distance metric on a bounded context embedding space, whose gradient propagation is detached from that of the Bellman backup. We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches involving meta-RL and distance metric learning. To the best of our knowledge, our method is the first model-free and end-to-end OMRL algorithm; it is computationally efficient and is demonstrated to outperform prior algorithms on several meta-RL benchmarks.
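
To make the idea of a negative-power distance metric on a bounded embedding space more concrete, here is a minimal sketch in plain Python. The abstract does not give the loss formula, so the form used here is an assumption for illustration only: same-task pairs are penalized by their squared Euclidean distance, and different-task pairs by an inverse power of their distance. The function name and the parameters `power`, `beta`, and `eps` are hypothetical. The sketch does not cover behavior regularization or the detachment of gradients from the Bellman backup.

```python
import math


def negative_power_distance_loss(embeddings, task_ids, power=2, beta=1.0, eps=0.1):
    """Average pairwise metric loss over all distinct pairs of context embeddings.

    Assumed form (not taken from the abstract):
      same task:      distance ** 2                  (pull together)
      different task: beta / (distance ** power + eps)  (push apart)
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            dist = math.dist(embeddings[i], embeddings[j])
            if task_ids[i] == task_ids[j]:
                total += dist ** 2
            else:
                # eps keeps the term finite when two embeddings coincide.
                total += beta / (dist ** power + eps)
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy embeddings bounded in [-1, 1], as a tanh output layer would produce.
tasks = [0, 0, 1, 1]
well_separated = [(0.9, 0.9), (0.8, 0.9), (-0.9, -0.9), (-0.9, -0.8)]
collapsed = [(0.1, 0.1), (0.0, 0.1), (-0.1, -0.1), (-0.1, 0.0)]

loss_separated = negative_power_distance_loss(well_separated, tasks)
loss_collapsed = negative_power_distance_loss(collapsed, tasks)
```

Under these assumptions, embeddings that cluster by task and sit far apart across tasks receive a lower loss than embeddings that collapse toward a single point, which is the behavior a task-inference encoder would want.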


