Model-based Policy Search for Partially Measurable Systems

01/21/2021
by   Fabio Amadio, et al.

In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through proper state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to update the policy parameters. Compared with previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, which allows it to deal with PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.
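
The idea of keeping the state observer inside the simulated rollouts can be illustrated with a small sketch. It is not the authors' implementation: the hand-written `model_step` is a hypothetical stand-in for the learned GP dynamics, and the filtered finite-difference observer, the linear policy, the quadratic cost, and all numeric constants are assumptions chosen for illustration. The sketch only estimates the expected cost of a fixed policy by averaging over simulated particles; the policy-parameter update itself is not shown.

```python
import random


def model_step(x, v, u, rng, dt=0.05, noise_std=0.01):
    """Hypothetical stand-in for a learned one-step model (the paper uses GPs).

    Samples the next state of a damped unit-mass point driven by input u.
    """
    acc = u - 0.5 * v
    v_next = v + dt * acc + rng.gauss(0.0, noise_std)
    x_next = x + dt * v
    return x_next, v_next


def observer_step(y, y_prev, v_hat_prev, dt=0.05, alpha=0.5):
    """Assumed observer: low-pass-filtered finite difference of positions."""
    v_raw = (y - y_prev) / dt
    return alpha * v_raw + (1.0 - alpha) * v_hat_prev


def expected_cost(gains, n_particles=200, horizon=60, meas_std=0.005, seed=0):
    """Monte Carlo estimate of the cumulative cost sum(x^2) for a linear policy.

    The policy never sees the true state: it acts on the noisy position
    measurement and on the observer's velocity estimate, as it would on
    the real system.
    """
    k_pos, k_vel = gains
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_particles):
        x, v = 1.0, 0.0                       # true (simulated) initial state
        y_prev = x + rng.gauss(0.0, meas_std)  # first noisy position reading
        v_hat = 0.0                            # observer's initial estimate
        for _ in range(horizon):
            y = x + rng.gauss(0.0, meas_std)   # only position is measured
            v_hat = observer_step(y, y_prev, v_hat)
            y_prev = y
            u = -k_pos * y - k_vel * v_hat     # policy uses estimates only
            x, v = model_step(x, v, u, rng)
            total += x * x                     # cost on the simulated state
    return total / n_particles


if __name__ == "__main__":
    print("no control:", expected_cost((0.0, 0.0)))
    print("PD-like   :", expected_cost((4.0, 2.0)))
```

Because the observer's noise and lag appear in every simulated particle, the cost estimate reflects what the policy would actually receive as input, rather than assuming access to the true state.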


research
01/28/2021

Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application

In this paper, we present a Model-Based Reinforcement Learning algorithm...
research
06/27/2012

Monte Carlo Bayesian Reinforcement Learning

Bayesian reinforcement learning (BRL) encodes prior knowledge of the wor...
research
09/26/2022

FORESEE: Model-based Reinforcement Learning using Unscented Transform with application to Tuning of Control Barrier Functions

In this paper, we introduce a novel online model-based reinforcement lea...
research
02/28/2022

GPU-Accelerated Policy Optimization via Batch Automatic Differentiation of Gaussian Processes for Real-World Control

The ability of Gaussian processes (GPs) to predict the behavior of dynam...
research
01/30/2023

Learning Control from Raw Position Measurements

We propose a Model-Based Reinforcement Learning (MBRL) algorithm named V...
research
10/19/2012

Monte Carlo Matrix Inversion Policy Evaluation

In 1950, Forsythe and Leibler (1950) introduced a statistical technique ...
research
02/11/2022

Exploration of Differentiability in a Proton Computed Tomography Simulation Framework

Objective. Algorithmic differentiation (AD) can be a useful technique to...
