Log In Sign Up

Dream to Explore: Adaptive Simulations for Autonomous Systems

by   Zahra Sheikhbahaee, et al.

One's ability to learn a generative model of the world without supervision depends on the extent to which one can construct abstract knowledge representations that generalize across experiences. To this end, capturing an accurate statistical structure from observational data provides useful inductive biases that can be transferred to novel environments. Here, we tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods, which is applied to solve visual servoing tasks. This is accomplished by first learning a state space representation, then inferring environmental dynamics and improving the policies through imagined future trajectories. Bayesian nonparametric models provide automatic model adaptation, which not only combats underfitting and overfitting, but also allows the model's unbounded dimension to be both flexible and computationally tractable. By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning and avoid introducing explicit model bias by describing the system's dynamics. Our algorithm jointly learns a world model and policy by optimizing a variational lower bound of a log-likelihood with respect to the expected free energy minimization objective function. Finally, we compare the performance of our model with the state-of-the-art alternatives for continuous control tasks in simulated environments.


Planning from Images with Deep Latent Gaussian Process Dynamics

Planning is a powerful approach to control problems with known environme...

Variational Model-based Policy Optimization

Model-based reinforcement learning (RL) algorithms allow us to combine m...

Sample-efficient reinforcement learning using deep Gaussian processes

Reinforcement learning provides a framework for learning to control whic...

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization

Deep reinforcement learning algorithms can perform poorly in real-world ...

Reinforcement Learning with Subspaces using Free Energy Paradigm

In large-scale problems, standard reinforcement learning algorithms suff...

A Hierarchical Bayesian Linear Regression Model with Local Features for Stochastic Dynamics Approximation

One of the challenges in model-based control of stochastic dynamical sys...

A Biologically-Inspired Dual Stream World Model

The medial temporal lobe (MTL), a brain region containing the hippocampu...