
Meta-learning of Sequential Strategies

by Pedro A. Ortega, et al.

In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.
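
To make the last claim of the abstract concrete, the following is a minimal sketch and is not taken from the report. It assumes a toy task class (coins whose bias is drawn from a uniform prior) and a hand-chosen memory state, the pair (heads seen, flips seen), standing in for what a recurrent network would have to learn. The names `bayes_predict` and `meta_train` are hypothetical. Under these assumptions, fitting a per-state predictor to sampled tasks is plain regression, and the fitted values should approach the Bayes posterior predictive, here the Laplace rule (heads + 1) / (flips + 2).

```python
import random
from collections import defaultdict


def bayes_predict(heads, flips):
    """Posterior predictive P(next = 1) for a coin with a uniform prior on its bias."""
    return (heads + 1) / (flips + 2)


def meta_train(num_tasks=100_000, horizon=5, seed=0):
    """Fit a tabular predictor indexed by the memory state (heads, flips).

    Assumption: the memory state is given as the sufficient statistics.
    For each state, the empirical frequency of the next outcome is the
    value that minimizes the average log-loss over the sampled tasks.
    """
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])  # state -> [visits, next outcomes equal to 1]
    for _ in range(num_tasks):
        bias = rng.random()  # sample a task: a coin bias from the uniform prior
        heads = 0
        for flips in range(horizon):
            outcome = 1 if rng.random() < bias else 0
            entry = counts[(heads, flips)]
            entry[0] += 1
            entry[1] += outcome
            heads += outcome  # memory update: a state machine over sufficient statistics
    return {state: ones / visits for state, (visits, ones) in counts.items()}


predictor = meta_train()
# Largest disagreement between the regression fit and the exact Bayes predictor.
max_gap = max(abs(p - bayes_predict(h, n)) for (h, n), p in predictor.items())
print(f"largest gap to the Bayes-optimal prediction: {max_gap:.4f}")
```

No explicit posterior is computed during training: the predictor only ever sees sampled sequences, yet the gap to the exact Bayesian answer shrinks as more tasks are sampled. In the memory-based setting discussed in the report, the table would be replaced by a learned memory and readout.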



Meta-trained agents implement Bayes-optimal agents

Memory-based meta-learning is a powerful technique to build agents that ...

Adaptive Meta-Learning for Identification of Rover-Terrain Dynamics

Rovers require knowledge of terrain to plan trajectories that maximize s...

Memory-Based Meta-Learning on Non-Stationary Distributions

Memory-based meta-learning is a technique for approximating Bayes-optima...

Meta-learners' learning dynamics are unlike learners'

Meta-learning is a tool that allows us to build sample-efficient learnin...

Learning not to learn: Nature versus nurture in silico

Animals are equipped with a rich innate repertoire of sensory, behaviora...

Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

Agents that interact with other agents often do not know a priori what t...

Meta-Learned Models of Cognition

Meta-learning is a framework for learning learning algorithms through re...

Code Repositories


Implementation of 'RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning' for multi-armed bandit problems


Implementation of 'RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning'
