QUOTA: The Quantile Option Architecture for Reinforcement Learning

11/05/2018

In this paper, we propose the Quantile Option Architecture (QUOTA), an exploration method built on recent advances in distributional reinforcement learning (RL). In QUOTA, decisions are based on quantiles of a value distribution rather than on its mean alone. This gives exploration a new dimension by exploiting both the optimism and the pessimism expressed in a value distribution. We demonstrate QUOTA's performance advantage in both challenging video games and physical robot simulators.
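
To make the contrast between mean-based and quantile-based decisions concrete, here is a minimal sketch. It is not the paper's algorithm: the function names, the list-of-lists representation, and the numbers in `QUANTILES` are all hypothetical, chosen only to show that acting on a low (pessimistic) or high (optimistic) quantile can pick a different action than acting on the mean.

```python
from statistics import mean


def greedy_by_mean(quantiles):
    """Return the action whose quantile estimates have the highest mean."""
    return max(range(len(quantiles)), key=lambda a: mean(quantiles[a]))


def greedy_by_quantile(quantiles, k):
    """Return the action with the highest k-th quantile estimate (0 = lowest)."""
    return max(range(len(quantiles)), key=lambda a: quantiles[a][k])


# Hypothetical quantile estimates for three actions, sorted low to high.
# These values are illustrative assumptions, not results from the paper.
QUANTILES = [
    [1.0, 2.0, 3.0],   # action 0: moderate spread, highest mean
    [1.5, 1.5, 1.5],   # action 1: no spread, best worst case
    [-2.0, 1.0, 6.0],  # action 2: wide spread, best upside
]

mean_action = greedy_by_mean(QUANTILES)
pessimistic_action = greedy_by_quantile(QUANTILES, 0)
optimistic_action = greedy_by_quantile(QUANTILES, 2)

print(mean_action, pessimistic_action, optimistic_action)
```

In this toy setup the three criteria each select a different action, which is the sense in which quantiles could offer an extra axis for exploration beyond the mean.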
