Control-Tutored Reinforcement Learning: Towards the Integration of Data-Driven and Model-Based Control

12/11/2021
by F. De Lellis, et al.

We present an architecture in which a feedback controller, derived from an approximate model of the environment, assists the learning process to improve its data efficiency. This architecture, which we term Control-Tutored Q-learning (CTQL), is presented in two alternative flavours. The first defines the reward function so that a Boolean condition can be used to determine when the control tutor policy is adopted; the second, termed probabilistic CTQL (pCTQL), instead calls the tutor with a certain probability during learning. Both approaches are validated and thoroughly benchmarked against Q-Learning on a representative problem: the stabilization of an inverted pendulum as defined in OpenAI Gym.
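
To make the probabilistic variant concrete, the following is a minimal sketch of how a tutor call with a fixed probability could be combined with tabular Q-learning. It is not the authors' implementation: the function names `pctql_action` and `q_update`, the parameter `beta` (tutor probability), the epsilon-greedy fallback, and the discrete state/action indexing are all assumptions made for illustration. The Boolean-condition variant is not sketched, since the abstract does not state the condition.

```python
import random


def pctql_action(q_row, tutor_action, beta, epsilon, rng=random):
    """Choose an action index for the current state (illustrative only).

    q_row        -- list of Q-values for the current state
    tutor_action -- action index proposed by the model-based controller
    beta         -- assumed probability of following the tutor
    epsilon      -- assumed exploration rate when the tutor is not used
    """
    # With probability beta, defer to the control tutor.
    if rng.random() < beta:
        return tutor_action
    # Otherwise fall back to ordinary epsilon-greedy selection.
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update, applied whoever chose the action."""
    td_target = r + gamma * max(q_table[s_next])
    q_table[s][a] += alpha * (td_target - q_table[s][a])
```

In this sketch the tutor only influences which action is taken; the Q-table is updated by the usual Q-learning rule either way, so setting `beta` to zero recovers plain epsilon-greedy Q-learning.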


