Reusable Options through Gradient-based Meta Learning

12/22/2022
by David Kuric, et al.

Hierarchical methods in reinforcement learning have the potential to reduce the number of decisions an agent needs to make when learning new tasks. However, finding reusable, useful temporal abstractions that facilitate fast learning remains a challenging problem. Recently, several deep learning approaches were proposed to learn such temporal abstractions, in the form of options, in an end-to-end manner. In this work, we point out several shortcomings of these methods and discuss their potential negative consequences. We then formulate desiderata for reusable options and use them to frame the problem of learning options as a gradient-based meta-learning problem. This allows us to define an objective that explicitly incentivizes options which let a higher-level decision maker adjust to different tasks in few steps. Experimentally, we show that our method learns transferable components which accelerate learning, and that it outperforms prior methods developed for this setting. Additionally, we perform ablations to quantify the impact of gradient-based meta-learning as well as the other proposed changes.
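The core idea, reusable shared components meta-learned so that a task-specific component can adapt in a few gradient steps, can be illustrated with a minimal first-order MAML-style sketch. Everything below is an illustrative assumption, not the paper's actual objective: a toy quadratic loss stands in for the RL return, a shared scalar `theta` for the option parameters, and a per-task scalar `w` for the higher-level policy.

```python
# Toy sketch of gradient-based meta-learning for a reusable shared
# component. The loss, parameters, and hyperparameters are illustrative
# assumptions, not the authors' method.

def loss(theta, w, target):
    """Toy per-task loss: the task is solved when theta + w == target."""
    return (theta + w - target) ** 2

def grad_w(theta, w, target):
    """Gradient of the toy loss with respect to the task parameter w."""
    return 2.0 * (theta + w - target)

def adapt(theta, target, inner_lr=0.1, steps=3):
    """Inner loop: quickly adapt the task-specific parameter w,
    holding the shared (option-like) parameter theta fixed."""
    w = 0.0
    for _ in range(steps):
        w -= inner_lr * grad_w(theta, w, target)
    return w

def meta_train(targets, meta_lr=0.05, iters=200):
    """Outer loop (first-order approximation): update the shared theta
    so that the post-adaptation loss across tasks is small."""
    theta = 0.0
    for _ in range(iters):
        g = 0.0
        for t in targets:
            w = adapt(theta, t)
            # first-order meta-gradient: d loss / d theta at the adapted w
            g += 2.0 * (theta + w - t)
        theta -= meta_lr * g / len(targets)
    return theta

targets = [1.0, 2.0, 3.0]  # a small family of tasks
theta = meta_train(targets)
post = sum(loss(theta, adapt(theta, t), t) for t in targets) / len(targets)
pre = sum(loss(0.0, adapt(0.0, t), t) for t in targets) / len(targets)
```

Because the outer objective is the loss *after* inner-loop adaptation, `theta` is driven toward a value from which every task is reachable in a few steps (here, the mean of the targets), which is the same pressure the paper's objective places on options.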


