Modular Transfer Learning with Transition Mismatch Compensation for Excessive Disturbance Rejection

07/29/2020
by Tianming Wang, et al.

Underwater robots in shallow waters usually suffer from strong wave forces, which may frequently exceed the robot's control constraints. Learning-based controllers are suitable for disturbance rejection control, but excessive disturbances heavily affect the state transition in a Markov Decision Process (MDP) or Partially Observable Markov Decision Process (POMDP). Moreover, learning purely on the target system may involve damaging exploratory actions or unpredictable system variations, while training exclusively on a prior model usually cannot address its mismatch with the target system. In this paper, we propose a transfer learning framework that adapts a control policy for excessive disturbance rejection of an underwater robot under dynamics model mismatch. A modular network of learning policies is applied, composed of a Generalized Control Policy (GCP) and an Online Disturbance Identification Model (ODI). GCP is first trained over a wide array of disturbance waveforms. ODI then learns to use past states and actions of the system to predict the disturbance waveforms, which are provided as input to GCP along with the system state. A transfer reinforcement learning algorithm using Transition Mismatch Compensation (TMC) is developed based on the modular architecture; it learns an additional compensatory policy by minimizing the mismatch between transitions predicted by the dynamics models of the source and target tasks. We demonstrate on a pose regulation task in simulation that TMC successfully rejects the disturbances and stabilizes the robot under an empirical model of the robot system, while also improving sample efficiency.
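To make the data flow between GCP, ODI, and the compensatory policy concrete, the following is a minimal toy sketch in Python, not the paper's implementation. It assumes a 1-D linear system whose source and target models differ only in actuator gain, replaces the learned networks with closed-form stand-ins, and fits a single compensation parameter by gradient descent on the squared transition mismatch. All names and constants (`step`, `odi`, `gcp`, `fit_compensation`, `rollout`, `A_SRC`, `A_TGT`, `DT`) are hypothetical, and control constraints are not modelled.

```python
import math

# Toy 1-D sketch; every name and constant here is a hypothetical stand-in.
DT = 0.1      # integration step
A_SRC = 1.0   # actuator gain in the source (prior) model
A_TGT = 0.8   # actuator gain in the target (empirical) model


def step(x, u, d, gain):
    """One transition of a toy model: x' = x + DT * (gain * u + d)."""
    return x + DT * (gain * u + d)


def odi(history, gain=A_SRC):
    """Stand-in for ODI: infer the latest disturbance from (x, u, x_next)."""
    if not history:
        return 0.0
    x, u, x_next = history[-1]
    return (x_next - x) / DT - gain * u


def gcp(x, d_hat, k=2.0):
    """Stand-in for GCP: regulate x to zero and cancel the predicted disturbance."""
    return -k * x - d_hat


def fit_compensation(samples, lr=0.5, epochs=200):
    """Fit theta so the target transition under (1 + theta) * u matches
    the source transition under u, by gradient descent on the mismatch."""
    theta = 0.0
    for _ in range(epochs):
        for x, u, d in samples:
            mismatch = step(x, u, d, A_SRC) - step(x, (1 + theta) * u, d, A_TGT)
            # Gradient of 0.5 * mismatch**2 with respect to theta.
            grad = -mismatch * DT * A_TGT * u
            theta -= lr * grad
    return theta


def rollout(theta, steps=100):
    """Run the compensated policy on the target model under a sine disturbance."""
    x, history = 1.0, []
    for t in range(steps):
        d = 0.5 * math.sin(0.2 * t)
        u = gcp(x, odi(history))                      # GCP command from state + ODI output
        x_next = step(x, (1 + theta) * u, d, A_TGT)   # compensated action on target
        history.append((x, u, x_next))
        x = x_next
    return x


samples = [(0.0, u, 0.0) for u in (1.0, -2.0, 3.0, -4.0, 5.0)]
theta = fit_compensation(samples)
final_state = rollout(theta)
```

In this toy setting the mismatch vanishes when `A_TGT * (1 + theta)` equals `A_SRC`, so the fitted `theta` approaches `A_SRC / A_TGT - 1`. The actual method learns neural-network policies with reinforcement learning, which this sketch does not attempt to reproduce.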


research
06/15/2023

DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control

Delayed Markov decision processes fulfill the Markov property by augment...
research
03/28/2018

Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system

Reinforcement learning has emerged as a promising methodology for traini...
research
12/09/2011

KL-learning: Online solution of Kullback-Leibler control problems

We introduce a stochastic approximation method for the solution of an er...
research
02/13/2019

Sample-Optimal Parametric Q-Learning with Linear Transition Models

Consider a Markov decision process (MDP) that admits a set of state-acti...
research
09/16/2022

Learning Policies for Continuous Control via Transition Models

It is doubtful that animals have perfect inverse models of their limbs (...
research
10/21/2019

Modelling Generalized Forces with Reinforcement Learning for Sim-to-Real Transfer

Learning robotic control policies in the real world gives rise to challe...
research
05/16/2020

Lifelong Control of Off-grid Microgrid with Model Based Reinforcement Learning

The lifelong control problem of an off-grid microgrid is composed of two...
