Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

10/02/2018
by John Martin, et al.

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based sparse methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
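
The abstract's claim that TD updates can be reduced to Gaussian Process regression can be illustrated with a minimal sketch. It is not the paper's SPGP-SARSA. It is the exact (non-sparse) GP-TD style model, under assumptions chosen for this example: scalar states, a squared-exponential kernel, independent noise, and a single episode that ends in a terminal state of value 0. Each reward is modelled as r_t = V(x_t) - gamma * V(x_{t+1}) + noise, so the rewards are a noisy linear function H V of the values, and the posterior mean of V is K H^T (H K H^T + sigma^2 I)^{-1} r. The names `rbf`, `solve`, and `gptd_posterior_mean` are hypothetical helpers and do not come from the paper.

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel between two scalar states (assumed kernel)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def gptd_posterior_mean(states, rewards, gamma=0.9, noise_var=1e-6):
    """Posterior mean of V at the visited states of one terminated episode."""
    n = len(states)
    K = [[rbf(a, b) for b in states] for a in states]
    # H encodes r_t = V(x_t) - gamma * V(x_{t+1}); the final step has no
    # successor term because the terminal value is assumed to be 0.
    H = [[0.0] * n for _ in range(n)]
    for t in range(n):
        H[t][t] = 1.0
        if t + 1 < n:
            H[t][t + 1] = -gamma
    # KHt = K H^T
    KHt = [[sum(K[i][k] * H[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    # G = H K H^T + noise_var * I
    G = [[sum(H[i][k] * KHt[k][j] for k in range(n))
          + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(G, rewards)
    return [sum(KHt[i][j] * alpha[j] for j in range(n)) for i in range(n)]


# Three states on a line; only the final step pays a reward of 1.
values = gptd_posterior_mean([0.0, 1.0, 2.0], [0.0, 0.0, 1.0])
```

With near-zero noise the posterior mean should approach the discounted returns of the episode (about 0.81, 0.9, and 1.0 here). The linear solve grows cubically with the number of transitions, which is the cost that sparse approximations such as the one described in the abstract aim to reduce.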

research
11/17/2018

Recursive Sparse Pseudo-input Gaussian Process SARSA

The class of Gaussian Process (GP) methods for Temporal Difference learn...
research
01/28/2019

Online Estimation of Ocean Current from Sparse GPS Data for Underwater Vehicles

Underwater robots are subject to position drift due to the effect of oce...
research
12/08/2022

Monocular Camera and Single-Beam Sonar-Based Underwater Collision-Free Navigation with Domain Randomization

Underwater navigation presents several challenges, including unstructure...
research
09/15/2023

UIVNAV: Underwater Information-driven Vision-based Navigation via Imitation Learning

Autonomous navigation in the underwater environment is challenging due t...
research
06/05/2023

Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic

Learning high-quality Q-value functions plays a key role in the success ...
research
02/27/2023

Taylor TD-learning

Many reinforcement learning approaches rely on temporal-difference (TD) ...
research
11/22/2021

Optimistic Temporal Difference Learning for 2048

Temporal difference (TD) learning and its variants, such as multistage T...
