RL^3: Boosting Meta Reinforcement Learning via RL inside RL^2

06/28/2023
by Abhinav Bhatia, et al.

Meta reinforcement learning (meta-RL) methods such as RL^2 have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to process the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. In this paper, we propose RL^3, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. We show that RL^3 earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL^2, while maintaining the efficiency of the latter in the short term. Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.
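
The central idea in the abstract, feeding task-specific action-values learned by traditional RL into the meta-RL network as an extra input, can be illustrated with a minimal sketch. This is not the authors' implementation: tabular Q-learning is used here only as a simple stand-in for "traditional RL", and the names `TabularQEstimator` and `augment_observation`, the learning rate, and the discount factor are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class TabularQEstimator:
    """Hypothetical helper: per-task Q-learning over discrete states and actions."""

    def __init__(self, num_actions, lr=0.5, gamma=0.9):
        self.num_actions = num_actions
        self.lr = lr          # assumed learning rate, not taken from the paper
        self.gamma = gamma    # assumed discount factor, not taken from the paper
        self.q = defaultdict(lambda: [0.0] * num_actions)

    def reset(self):
        # Call when a new task is sampled: action-values are task-specific.
        self.q.clear()

    def update(self, state, action, reward, next_state, done):
        # Standard one-step Q-learning update on a single transition.
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])

    def values(self, state):
        # Current action-value estimates for the given state.
        return list(self.q[state])


def augment_observation(obs_features, q_values):
    # Concatenate the raw observation features with the Q-estimates so that
    # a meta-RL policy network (not shown) can condition on both.
    return list(obs_features) + list(q_values)


# Usage: one transition in a two-action task, then build the network input.
estimator = TabularQEstimator(num_actions=2)
estimator.update(state="s0", action=1, reward=1.0, next_state="s1", done=True)
meta_input = augment_observation([1.0, 0.0], estimator.values("s0"))
print(meta_input)
```

In this sketch the estimator is reset for every new task, so the Q-values summarize only the experience gathered in the current task, while the meta-learned network that consumes `meta_input` would carry the knowledge shared across the task distribution.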

research
07/19/2022

Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks

Meta reinforcement learning (meta-RL) aims to learn a policy solving a s...
research
09/28/2021

Deep Reinforcement Learning with Adjustments

Deep reinforcement learning (RL) algorithms can learn complex policies t...
research
01/19/2023

A Survey of Meta-Reinforcement Learning

While deep reinforcement learning (RL) has fueled multiple high-profile ...
research
02/04/2021

Alchemy: A structured task distribution for meta-reinforcement learning

There has been rapidly growing interest in meta-learning as a method for...
research
06/03/2022

Challenges to Solving Combinatorially Hard Long-Horizon Deep RL Tasks

Deep reinforcement learning has shown promise in discrete domains requir...
research
01/13/2023

From Ember to Blaze: Swift Interactive Video Adaptation via Meta-Reinforcement Learning

Maximizing quality of experience (QoE) for interactive video streaming h...
research
09/05/2017

Knowledge Sharing for Reinforcement Learning: Writing a BOOK

This paper proposes a novel deep reinforcement learning (RL) method inte...
