Data-Efficient Learning of Feedback Policies from Image Pixels using Deep Dynamical Models

10/08/2015
by John-Alexander M. Assael, et al.

Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy ("torques") from pixel information only. We introduce a data-efficient, model-based RL algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that learns a low-dimensional feature embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning is crucial for long-term predictions, which lie at the core of the adaptive nonlinear model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art RL methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces, is lightweight, and is an important step toward fully autonomous end-to-end learning from pixels to torques.
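To make the idea of joint learning concrete, the following is a minimal sketch of a combined objective, not the authors' implementation. It assumes a single tanh encoder layer, a linear decoder, and a linear one-step transition in feature space, standing in for the deep networks the abstract refers to. All names (`joint_loss`, `W_enc`, `W_dec`, `A`, `B`) and all dimensions are hypothetical. The point it illustrates is that the reconstruction error and the prediction error share the same encoder and decoder, so the embedding is shaped by both terms at once.

```python
import numpy as np

def joint_loss(params, x_t, u_t, x_next):
    """Reconstruction error plus one-step prediction error (illustrative only).

    Assumed, simplified model (not taken from the paper):
      z_t    = tanh(x_t W_enc)      encode image into low-dimensional features
      x_rec  = z_t W_dec            decode features back to pixels
      z_pred = z_t A + u_t B        predict next features from features and control
      x_pred = z_pred W_dec         decode predicted features to the next image
    """
    W_enc, W_dec, A, B = params
    z_t = np.tanh(x_t @ W_enc)
    x_rec = z_t @ W_dec
    z_pred = z_t @ A + u_t @ B
    x_pred = z_pred @ W_dec
    rec = np.mean((x_rec - x_t) ** 2)       # autoencoder term
    pred = np.mean((x_pred - x_next) ** 2)  # predictive-model term
    return rec + pred, rec, pred

# Hypothetical sizes: 64 pixels, 2 latent features, 1 control input, 5 samples.
rng = np.random.default_rng(0)
n_pix, n_feat, n_ctrl, n_samples = 64, 2, 1, 5
params = (
    0.1 * rng.standard_normal((n_pix, n_feat)),   # W_enc
    0.1 * rng.standard_normal((n_feat, n_pix)),   # W_dec
    0.1 * rng.standard_normal((n_feat, n_feat)),  # A
    0.1 * rng.standard_normal((n_ctrl, n_feat)),  # B
)
x_t = rng.random((n_samples, n_pix))      # current frames (flattened)
u_t = rng.random((n_samples, n_ctrl))     # applied controls
x_next = rng.random((n_samples, n_pix))   # next frames (flattened)

total, rec, pred = joint_loss(params, x_t, u_t, x_next)
```

Minimizing `total` with respect to all four parameter arrays at once is what "joint" means in this sketch. Training the autoencoder first and fitting the transition afterwards would instead optimize `rec` and `pred` separately.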

research
02/08/2015

From Pixels to Torques: Policy Learning with Deep Dynamical Models

Data-efficient learning in continuous state-action spaces using very hig...
research
09/20/2019

NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning

One of the key challenges arising when compilers vectorize loops for tod...
research
05/06/2021

A Reinforcement Learning-based Economic Model Predictive Control Framework for Autonomous Operation of Chemical Reactors

Economic model predictive control (EMPC) is a promising methodology for ...
research
03/25/2019

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

In autonomous embedded systems, it is often vital to reduce the amount o...
research
01/09/2020

Closed-loop deep learning: generating forward models with back-propagation

A reflex is a simple closed loop control approach which tries to minimis...
research
06/17/2023

The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions

Reinforcement learning (RL) algorithms have proven transformative in a r...
research
04/11/2016

A statistical learning strategy for closed-loop control of fluid flows

This work discusses a closed-loop control strategy for complex systems u...
