Faster and More Accurate Learning with Meta Trace Adaptation

04/25/2019
by Mingde Zhao, et al.

Learning speed and accuracy are of universal interest for reinforcement learning problems. In this paper, we investigate meta-learning approaches for adapting the trace decay parameter λ used in TD(λ), from the perspective of optimizing a bias-variance tradeoff. We propose an off-policy applicable method for meta-learning the λ parameters by optimizing a meta-objective with efficient incremental updates. The proposed trust-region style algorithm is shown, under proper assumptions, to be equivalent to optimizing the bias-variance tradeoff for the overall target for all states. In experiments, we validate the effectiveness of the proposed method, MTA, showing significantly faster and more accurate learning than the baselines and the other methods compared.
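
For readers unfamiliar with the role of λ, the following is a minimal sketch of tabular TD(λ) with accumulating eligibility traces and a per-state trace decay. It is not the MTA algorithm from the paper: the λ values here are fixed by hand, whereas the paper meta-learns them. The function name `td_lambda`, the episode format, and the default step size are assumptions made for illustration only.

```python
import numpy as np

def td_lambda(episodes, n_states, lam, alpha=0.1, gamma=1.0):
    """Tabular TD(lambda) value estimation with a state-dependent lambda.

    episodes: list of episodes; each episode is a list of
              (state, reward, next_state) transitions, with
              next_state set to None on termination (assumed format).
    lam:      array of length n_states, the trace decay for each state
              (hand-picked here; the paper adapts these by meta-learning).
    """
    v = np.zeros(n_states)
    for episode in episodes:
        z = np.zeros(n_states)  # eligibility trace, reset each episode
        for s, r, s_next in episode:
            v_next = 0.0 if s_next is None else v[s_next]
            delta = r + gamma * v_next - v[s]  # one-step TD error
            z *= gamma * lam[s]                # decay traces using lambda of the current state
            z[s] += 1.0                        # accumulate trace for the visited state
            v += alpha * delta * z             # credit all recently visited states
    return v

# Two-state chain: 0 -> 1 -> terminal, with reward 1 on the last step.
episode = [(0, 0.0, 1), (1, 1.0, None)]
v_mc_like = td_lambda([episode], 2, np.array([1.0, 1.0]), alpha=0.5)
v_td0_like = td_lambda([episode], 2, np.array([0.0, 0.0]), alpha=0.5)
```

With λ = 1 the final reward is credited to both visited states after a single episode (lower bias, higher variance), while with λ = 0 only the last state is updated (higher bias, lower variance). This is the tradeoff that a per-state choice of λ is meant to balance.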

research
09/22/2022

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

Meta-gradients provide a general approach for optimizing the meta-parame...
research
06/13/2022

Faster Optimization-Based Meta-Learning Adaptation Phase

Neural networks require a large amount of annotated data to learn. Meta-...
research
10/15/2020

Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning

Bilevel optimization has arisen as a powerful tool for many machine lear...
research
06/16/2020

META-Learning Eligibility Traces for More Sample Efficient Temporal Difference Learning

Temporal-Difference (TD) learning is a standard and very successful rein...
research
07/11/2020

Online Parameter-Free Learning of Multiple Low Variance Tasks

We propose a method to learn a common bias vector for a growing sequence...
research
04/30/2021

Faster Meta Update Strategy for Noise-Robust Deep Learning

It has been shown that deep neural networks are prone to overfitting on ...
research
04/29/2022

Line of Sight Curvature for Missile Guidance using Reinforcement Meta-Learning

We use reinforcement meta learning to optimize a line of sight curvature...
