Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures

08/15/2019
by Johannes Günther, et al.

Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation. A property that makes these predictive approaches well suited for robotics is that they can be learned online and incrementally through interaction with the environment. However, a remaining challenge for many prediction-learning approaches is an appropriate choice of prediction-learning parameters, especially parameters that control the magnitude of a learning machine's updates to its predictions (the learning rate or step size). To begin to address this challenge, we examine the use of online step-size adaptation using a sensor-rich robotic arm. Our method of choice, Temporal-Difference Incremental Delta-Bar-Delta (TIDBD), learns and adapts step sizes on a per-feature level; importantly, TIDBD allows step-size tuning and representation learning to occur at the same time. Through an extensive parameter search, we show that TIDBD is a practical alternative to classic Temporal-Difference (TD) learning: both approaches perform comparably in terms of predicting future aspects of a robotic data stream. Furthermore, the use of a step-size adaptation method like TIDBD appears to allow a system to automatically detect and characterize common sensor failures in a robotic application. Together, these results promise to improve the ability of robotic devices to learn from interactions with their environments in a robust way, providing key capabilities for autonomous agents and robots.
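To make the idea concrete, the following is a minimal sketch of linear TD(λ) with per-feature step sizes adapted by stochastic meta-descent, in the style of TIDBD. This is our illustrative reconstruction, not the authors' implementation: the function name, variable names, and default parameter values are our own assumptions.

```python
import numpy as np

def tidbd_td_lambda(features, rewards, gamma=0.9, lam=0.9,
                    meta_rate=0.01, init_log_step=-4.0):
    """Linear TD(lambda) prediction with per-feature step sizes
    adapted online by stochastic meta-descent (a TIDBD-style sketch).

    features: (T, n) array of feature vectors, one per time step.
    rewards:  (T,) array; rewards[t] is received on the transition
              from step t-1 to step t.
    Returns the learned weights and the final per-feature step sizes.
    """
    n = features.shape[1]
    w = np.zeros(n)                    # value-function weights
    beta = np.full(n, init_log_step)   # log step sizes (alpha_i = e^beta_i)
    h = np.zeros(n)                    # trace of recent weight updates
    z = np.zeros(n)                    # eligibility trace
    x = features[0]
    for t in range(1, len(features)):
        x_next = features[t]
        # TD error for the transition x -> x_next
        delta = rewards[t] + gamma * (w @ x_next) - w @ x
        # meta-descent: move each log step size to reduce future TD error
        beta += meta_rate * delta * x * h
        alpha = np.exp(beta)           # per-feature step sizes stay positive
        z = gamma * lam * z + x        # accumulating eligibility trace
        w += alpha * delta * z         # TD(lambda) weight update
        # decay h where the update overwrote past corrections, then add
        # the current update so h tracks recent learning per feature
        h = h * np.clip(1.0 - alpha * x * z, 0.0, None) + alpha * delta * z
        x = x_next
    return w, np.exp(beta)
```

Because each feature carries its own step size, features tied to a faulty or noisy sensor tend to receive persistently large errors, and their adapted step sizes drift away from those of reliable features; this is the property the abstract points to for detecting common sensor failures.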


