Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning

06/01/2018

Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is a simple task, learning a useful model in the approximate setting is challenging. Recently, Farahmand et al. (2017) proposed a value-aware model learning (VAML) objective that captures the structure of the value function during model learning. Using tools from Lipschitz continuity, we show that minimizing the VAML objective is in fact equivalent to minimizing the Wasserstein metric.
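
The abstract states the equivalence without notation. As a rough sketch of why such a result is plausible, the two quantities can be written side by side. The symbols here are assumptions introduced for illustration and do not appear in the abstract: $P$ is the true transition kernel, $\hat{P}$ the learned model, $\mathcal{F}$ the class of candidate value functions, and $K(f)$ the Lipschitz constant of $f$.

```latex
% VAML loss at a state-action pair (s, a), in the form of Farahmand et al. (2017):
\ell(\hat{P}, P)(s, a)
  = \sup_{v \in \mathcal{F}}
    \left| \int \bigl( P(s' \mid s, a) - \hat{P}(s' \mid s, a) \bigr)\, v(s')\, ds' \right|^{2}

% Wasserstein-1 metric in its Kantorovich-Rubinstein dual form:
W\bigl( P(\cdot \mid s, a), \hat{P}(\cdot \mid s, a) \bigr)
  = \sup_{f \,:\, K(f) \le 1}
    \int \bigl( P(s' \mid s, a) - \hat{P}(s' \mid s, a) \bigr)\, f(s')\, ds'

% Assumption for this sketch: \mathcal{F} = \{ f : K(f) \le 1 \}.
% This set is closed under negation, so the absolute value can be dropped, giving
\ell(\hat{P}, P)(s, a)
  = W\bigl( P(\cdot \mid s, a), \hat{P}(\cdot \mid s, a) \bigr)^{2}
```

Under that assumed choice of $\mathcal{F}$, the VAML loss is the squared Wasserstein distance, so both objectives share the same minimizers. The full text should be consulted for the exact conditions and statement of the result.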
