MBMF: Model-Based Priors for Model-Free Reinforcement Learning

by Somil Bansal, et al.
University of California, Berkeley

Reinforcement learning is divided into two main paradigms: model-free and model-based. Each paradigm has strengths and limitations, and each has been successfully applied to real-world domains suited to its strengths. In this paper, we present a new approach aimed at bridging the gap between the two. We aim to combine the best of both paradigms in an approach that is both data-efficient and cost-savvy. We do so by learning a probabilistic dynamics model and leveraging it as a prior for the intertwined model-free optimization. As a result, our approach can exploit the generality and structure of the dynamics model while remaining able to ignore its inevitable inaccuracies, by directly incorporating the evidence provided by direct observation of the cost. Preliminary results demonstrate that our approach outperforms purely model-based and model-free approaches, as well as the approach of simply switching from a model-based to a model-free setting.
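
The idea of using a model as a prior that observed costs can override can be illustrated with a small sketch. The abstract does not specify the mechanism, so everything here is an assumption made for illustration: a Gaussian process over the cost as a function of a single policy parameter, with the model-predicted cost as the prior mean. The functions `model_based_cost` and `true_cost`, the kernel, and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of policy parameters."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def model_based_cost(theta):
    """Hypothetical cost predicted from a learned dynamics model (biased)."""
    return (theta - 1.0) ** 2

def true_cost(theta):
    """Hypothetical cost observed on the real system."""
    return (theta - 1.5) ** 2 + 0.3

def posterior_mean(theta_query, theta_obs, cost_obs, noise=1e-4):
    """GP posterior mean of the cost, with the model-based cost as prior mean."""
    K = rbf_kernel(theta_obs, theta_obs) + noise * np.eye(len(theta_obs))
    k_star = rbf_kernel(theta_query, theta_obs)
    # Only the mismatch between observed and model-predicted cost is regressed.
    residual = cost_obs - model_based_cost(theta_obs)
    return model_based_cost(theta_query) + k_star @ np.linalg.solve(K, residual)

# Three policy parameters evaluated on the (hypothetical) real system.
theta_obs = np.array([0.5, 1.0, 2.0])
cost_obs = true_cost(theta_obs)

# Near an observation the estimate follows the observed cost;
# far from all observations it falls back to the model prediction.
near = posterior_mean(np.array([1.0]), theta_obs, cost_obs)[0]
far = posterior_mean(np.array([10.0]), theta_obs, cost_obs)[0]
print(near, true_cost(1.0), model_based_cost(1.0))
print(far, model_based_cost(10.0))
```

In this sketch the model supplies structure where no data exists, while direct cost observations correct the model's bias wherever they are available, which is the trade-off the abstract describes.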


