Control as Hybrid Inference

by Alexander Tschantz, et al.

The field of reinforcement learning can be split into model-based and model-free methods. Here, we unify these approaches by casting model-free policy optimisation as amortised variational inference, and model-based planning as iterative variational inference, within a 'control as hybrid inference' (CHI) framework. We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. Using a didactic experiment, we demonstrate that the proposed algorithm operates in a model-based manner at the onset of learning, before converging to a model-free algorithm once sufficient data have been collected. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines. CHI thus provides a principled framework for harnessing the sample efficiency of model-based planning while retaining the asymptotic performance of model-free policy optimisation.
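To make the distinction between amortised and iterative inference concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-step toy problem in which a hypothetical amortised policy (a linear map, `amortised_proposal`) proposes the mean of a Gaussian over actions in one forward pass, and a cross-entropy-method loop (`iterative_refine`) then refines that Gaussian against a known reward. The function names, the toy reward, and the choice of the cross-entropy method are all assumptions made for illustration.

```python
import numpy as np


def amortised_proposal(state, weights):
    """Hypothetical amortised policy: one forward pass maps state to an action mean."""
    return weights @ state


def iterative_refine(state, mean, std, reward_fn,
                     n_iters=5, n_samples=64, n_elite=8, rng=None):
    """Refine a Gaussian over actions with the cross-entropy method (assumed optimiser)."""
    rng = np.random.default_rng(0) if rng is None else rng
    for _ in range(n_iters):
        # Sample candidate actions from the current belief over actions.
        samples = mean + std * rng.standard_normal((n_samples, mean.shape[0]))
        returns = np.array([reward_fn(state, a) for a in samples])
        # Refit the Gaussian to the highest-return candidates.
        elite = samples[np.argsort(returns)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean, std


def toy_reward(state, action):
    """Toy objective (assumption): the best action is the negated state."""
    return -np.sum((action + state) ** 2)


state = np.array([1.0, -2.0])
weights = np.zeros((2, 2))  # an untrained policy, as at the onset of learning
initial_mean = amortised_proposal(state, weights)
refined_mean, refined_std = iterative_refine(state, initial_mean, np.ones(2), toy_reward)

initial_error = np.linalg.norm(initial_mean + state)
refined_error = np.linalg.norm(refined_mean + state)
print(initial_error, refined_error)
```

In this sketch the iterative step does most of the work while the policy is untrained. If `weights` were trained so that the proposal already sat near the optimum, the refinement would change little, which loosely mirrors the shift from model-based to model-free behaviour described in the abstract.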




