Dream to Explore: Adaptive Simulations for Autonomous Systems

10/27/2021
by Zahra Sheikhbahaee, et al.

One's ability to learn a generative model of the world without supervision depends on the extent to which one can construct abstract knowledge representations that generalize across experiences. To this end, accurately capturing the statistical structure of observational data provides useful inductive biases that can be transferred to novel environments. Here, we tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods, and we apply the approach to visual servoing tasks. This is accomplished by first learning a state space representation, then inferring environmental dynamics and improving the policies through imagined future trajectories. Bayesian nonparametric models provide automatic model adaptation, which not only combats underfitting and overfitting, but also allows the model's unbounded dimension to be both flexible and computationally tractable. By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning and avoid introducing explicit model bias when describing the system's dynamics. Our algorithm jointly learns a world model and policy by optimizing a variational lower bound on the log-likelihood with respect to the expected free energy minimization objective function. Finally, we compare the performance of our model with state-of-the-art alternatives for continuous control tasks in simulated environments.
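
To make the idea of Gaussian process dynamics and imagined trajectories concrete, here is a minimal sketch in Python with NumPy. It is not the authors' model: it fits a plain GP regressor with a fixed squared-exponential kernel to one-step transitions of a hypothetical 1-D toy system, then rolls the posterior mean forward under a fixed action sequence. The toy dynamics, the kernel hyperparameters, and the names `rbf_kernel`, `gp_fit`, `gp_predict`, and `imagine` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_fit(inputs, targets, noise=1e-4):
    """Precompute the Cholesky factor and weight vector for GP regression."""
    k = rbf_kernel(inputs, inputs) + noise * np.eye(len(inputs))
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, targets))
    return chol, alpha

def gp_predict(inputs, chol, alpha, query):
    """Posterior mean and variance of the GP at the query points."""
    k_star = rbf_kernel(query, inputs)
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    var = rbf_kernel(query, query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Hypothetical toy system (an assumption, not from the paper):
# x_next = 0.9 * x + 0.5 * sin(u)
rng = np.random.default_rng(0)
states = rng.uniform(-2.0, 2.0, size=(60, 1))
actions = rng.uniform(-2.0, 2.0, size=(60, 1))
inputs = np.hstack([states, actions])
targets = 0.9 * states[:, 0] + 0.5 * np.sin(actions[:, 0])

chol, alpha = gp_fit(inputs, targets)

def imagine(x0, action_seq):
    """Roll the GP posterior mean forward; return (state, variance) per step."""
    x, trajectory = x0, []
    for u in action_seq:
        mean, var = gp_predict(inputs, chol, alpha, np.array([[x, u]]))
        x = float(mean[0])
        trajectory.append((x, float(var[0])))
    return trajectory

trajectory = imagine(1.0, [0.5, -0.5, 0.0])
for step, (x, var) in enumerate(trajectory, start=1):
    print(f"step {step}: predicted state {x:.3f}, predictive variance {var:.5f}")
```

The predictive variance returned at each step is what separates a GP from a point-estimate dynamics model: a planner can use it to discount imagined trajectories that wander into regions with little data. This sketch propagates only the mean and ignores how uncertainty compounds across steps, which a fuller treatment would need to handle.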

research
05/07/2020

Planning from Images with Deep Latent Gaussian Process Dynamics

Planning is a powerful approach to control problems with known environme...
research
06/09/2020

Variational Model-based Policy Optimization

Model-based reinforcement learning (RL) algorithms allow us to combine m...
research
11/02/2020

Sample-efficient reinforcement learning using deep Gaussian processes

Reinforcement learning provides a framework for learning to control whic...
research
12/20/2021

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization

Deep reinforcement learning algorithms can perform poorly in real-world ...
research
12/13/2020

Reinforcement Learning with Subspaces using Free Energy Paradigm

In large-scale problems, standard reinforcement learning algorithms suff...
research
07/11/2018

A Hierarchical Bayesian Linear Regression Model with Local Features for Stochastic Dynamics Approximation

One of the challenges in model-based control of stochastic dynamical sys...
