Performance-Weighed Policy Sampling for Meta-Reinforcement Learning

12/10/2020
by Ibrahim Ahmed, et al.

This paper discusses an Enhanced Model-Agnostic Meta-Learning (E-MAML) algorithm that achieves fast convergence of the policy function from a small number of training examples when applied to new learning tasks. Built on top of Model-Agnostic Meta-Learning (MAML), E-MAML maintains a set of policy parameters learned in the environment for previous tasks. We apply E-MAML to develop reinforcement learning (RL)-based online fault-tolerant control schemes for dynamic systems. When a new fault occurs, the enhancement re-initializes the parameters of a new RL policy so that it adapts faster from a small number of samples of system behavior under the new fault. This replaces the random task sampling step in MAML: instead, E-MAML exploits the experiences the controller has already generated. The sample is chosen to maximally span the parameter space, which facilitates adaptation to the new fault. We demonstrate the performance of our approach, combining E-MAML with proximal policy optimization (PPO), on the well-known cart-pole example and then on the fuel transfer system of an aircraft.
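The abstract does not spell out the sampling rule, so the following is only a minimal sketch of what performance-weighted sampling from a stored set of policies could look like. It assumes each stored policy is a flat parameter vector and that each has already been scored by its return on the new (faulty) task. The softmax weighting, the `temperature` parameter, and the names `performance_weights` and `sample_policies` are illustrative assumptions, not the authors' method, and the sketch does not attempt the parameter-space spanning criterion.

```python
import numpy as np


def performance_weights(returns, temperature=1.0):
    """Softmax over returns (an assumed weighting): higher return -> higher weight."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.max()) / temperature  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


def sample_policies(policy_bank, returns, k, temperature=1.0, seed=0):
    """Draw k distinct stored parameter vectors, favoring better performers.

    policy_bank: list of parameter vectors learned on previous tasks (hypothetical layout).
    returns:     return of each stored policy when evaluated on the new task.
    """
    rng = np.random.default_rng(seed)
    probs = performance_weights(returns, temperature)
    idx = rng.choice(len(policy_bank), size=k, replace=False, p=probs)
    return idx, [policy_bank[i] for i in idx]
```

The sampled parameter vectors would then stand in for the randomly sampled tasks in the MAML update. A larger `temperature` flattens the weights toward uniform sampling, while a smaller one concentrates the draw on the best-performing stored policies.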

research
09/26/2020

Complementary Meta-Reinforcement Learning for Fault-Adaptive Control

Faults are endemic to all systems. Adaptive fault-tolerant control maint...
research
02/11/2023

A large parametrized space of meta-reinforcement learning tasks

We describe a parametrized space for simple meta-reinforcement-learning ...
research
02/12/2020

Provably Convergent Policy Gradient Methods for Model-Agnostic Meta-Reinforcement Learning

We consider Model-Agnostic Meta-Learning (MAML) methods for Reinforcemen...
research
01/01/2021

B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic Meta-Learning

There is a growing interest in the learning-to-learn paradigm, also know...
research
01/18/2022

System-Agnostic Meta-Learning for MDP-based Dynamic Scheduling via Descriptive Policy

Dynamic scheduling is an important problem in applications from queuing ...
research
10/16/2019

Model-Agnostic Meta-Learning using Runge-Kutta Methods

Meta-learning has emerged as an important framework for learning new tas...
research
04/16/2020

Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing

We propose a novel method for analyzing and visualizing the complexity o...
