A Discourse on MetODS: Meta-Optimized Dynamical Synapses for Meta-Reinforcement Learning

02/04/2022
by   Mathieu Chalvidal, et al.

Recent meta-reinforcement learning work has emphasized the importance of mnemonic control for agents to quickly assimilate relevant experience in new contexts and suitably adapt their policy. However, what computational mechanisms support flexible behavioral adaptation from past experience remains an open question. Inspired by neuroscience, we propose MetODS (for Meta-Optimized Dynamical Synapses), a broadly applicable model of meta-reinforcement learning which leverages fast synaptic dynamics influenced by action-reward feedback. We develop a theoretical interpretation of MetODS as a model learning powerful control rules in the policy space and demonstrate empirically that robust reinforcement learning programs emerge spontaneously from them. We further propose a formalism which efficiently optimizes the meta-parameters governing MetODS synaptic processes. In multiple experiments and domains, MetODS outperforms or compares favorably with previous meta-reinforcement learning approaches. Our agents can perform one-shot learning, approach optimal exploration/exploitation strategies, generalize navigation principles to unseen environments and demonstrate a strong ability to learn adaptive motor policies.
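To make the idea of "fast synaptic dynamics influenced by action-reward feedback" more concrete, here is a minimal sketch of a generic fast-weight loop. It is not the update rule from the paper, which the abstract does not specify. The function names (`read`, `write`, `adapt`), the `tanh` nonlinearity, the `decay` factor, the number of `steps`, and the Hebbian outer-product form are all assumptions chosen for illustration. The plasticity matrix `alpha` stands in for the kind of meta-parameter that an outer optimization loop would tune; that outer loop is not shown.

```python
import math

def read(W, v):
    """Read step (assumed form): pass the state through the fast weights, v <- tanh(W v)."""
    return [math.tanh(sum(w * x for w, x in zip(row, v))) for row in W]

def write(W, v, alpha, decay=0.9):
    """Write step (assumed form): decayed Hebbian update, W <- decay * W + alpha * outer(v, v)."""
    n = len(v)
    return [
        [decay * W[i][j] + alpha[i][j] * v[i] * v[j] for j in range(n)]
        for i in range(n)
    ]

def adapt(W, observation, prev_action, prev_reward, alpha, steps=2):
    """Encode the latest feedback into a state vector, then alternate read and write steps.

    W and alpha must be square matrices (lists of lists) whose size equals
    len(observation) + len(prev_action) + 1. Returns the new weights and state;
    the input W is left unmodified.
    """
    v = list(observation) + list(prev_action) + [prev_reward]
    for _ in range(steps):
        v = read(W, v)          # the state is shaped by the current synapses
        W = write(W, v, alpha)  # the synapses are shaped by the current state
    return W, v
```

In this sketch the weights themselves carry the memory of recent observations, actions and rewards, so behavior can change within an episode without any gradient step on the policy.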

research
06/12/2020

Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling

Reinforcement learning algorithms can acquire policies for complex tasks...
research
09/30/2019

Efficient meta reinforcement learning via meta goal generation

Meta reinforcement learning (meta-RL) is able to accelerate the acquisit...
research
05/06/2020

Safe Reinforcement Learning through Meta-learned Instincts

An important goal in reinforcement learning is to create agents that can...
research
07/06/2020

Meta-Learning through Hebbian Plasticity in Random Networks

Lifelong learning and adaptability are two defining aspects of biologica...
research
09/14/2021

Few-shot Quality-Diversity Optimisation

In the past few years, a considerable amount of research has been dedica...
research
10/06/2022

Meta Reinforcement Learning for Optimal Design of Legged Robots

The process of robot design is a complex task and the majority of design...
research
09/10/2020

Importance Weighted Policy Learning and Adaption

The ability to exploit prior experience to solve novel problems rapidly ...
