Uncertainty-sensitive Learning and Planning with Ensembles

12/19/2019
by Piotr Miłoś, et al.

We propose a reinforcement learning framework for discrete environments in which an agent makes both strategic and tactical decisions. The former manifest themselves through the use of a value function, while the latter are powered by a tree search planner. These tools complement each other. The planning module performs a local what-if analysis, which helps avoid tactical pitfalls and boosts backups of the value function. The value function, being global in nature, compensates for the inherent locality of the planner. To further solidify this synergy, we introduce an exploration mechanism with two distinctive components: uncertainty modelling and risk measurement. To model uncertainty we use value function ensembles, and to reflect risk we propose several functionals that summarize the uncertainty implied by the ensemble. We show that our method performs well on hard exploration environments: Deep-sea, toy Montezuma's Revenge, and Sokoban. In all cases, we obtain a speed-up in learning and a boost in performance.
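To make the idea of summarizing an ensemble with a risk functional more concrete, here is a minimal Python sketch. It is illustrative only: the abstract does not list the functionals the authors use, so `ensemble_mean`, `mean_plus_std`, `ensemble_max`, the `kappa` parameter, and the `select_action` helper are all assumptions introduced for this example, and the numbers are made up.

```python
import statistics
from typing import Callable, Dict, Sequence

# Hypothetical risk functionals (assumptions, not taken from the paper):
# each maps the value estimates of an ensemble to a single score.

def ensemble_mean(values: Sequence[float]) -> float:
    """Risk-neutral score: the plain average over ensemble members."""
    return statistics.fmean(values)

def mean_plus_std(values: Sequence[float], kappa: float = 1.0) -> float:
    """Optimistic score: mean plus kappa times the ensemble's standard deviation."""
    return statistics.fmean(values) + kappa * statistics.pstdev(values)

def ensemble_max(values: Sequence[float]) -> float:
    """Most optimistic score: the largest estimate in the ensemble."""
    return max(values)

def select_action(
    ensemble_values: Dict[str, Sequence[float]],
    risk: Callable[[Sequence[float]], float],
) -> str:
    """Pick the action whose ensemble estimates score highest under `risk`."""
    return max(ensemble_values, key=lambda action: risk(ensemble_values[action]))

# Made-up estimates from a four-member ensemble for two actions.
# The members agree on "left" and disagree on "right".
estimates = {
    "left": [1.0, 1.0, 1.0, 1.0],
    "right": [0.0, 2.0, 0.0, 1.0],
}

choice_neutral = select_action(estimates, ensemble_mean)
choice_optimistic = select_action(estimates, mean_plus_std)
choice_max = select_action(estimates, ensemble_max)
```

In this toy setting the risk-neutral functional prefers the action the ensemble agrees on, while the two optimistic functionals prefer the action with higher disagreement. This is one way that ensemble uncertainty could steer exploration.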

research
11/05/2019

Robo-advising: Learning Investor's Risk Preferences via Portfolio Choices

We introduce a reinforcement learning framework for retail robo-advising...
research
11/05/2018

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control

We propose a plan online and learn offline (POLO) framework for the sett...
research
12/23/2019

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

It is well known that quantifying uncertainty in the action-value estima...
research
06/14/2016

Digits that are not: Generating new types through deep neural nets

For an artificial creative agent, an essential driver of the search for ...
research
04/17/2018

Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation

Making decisions is a great challenge in distributed autonomous environm...
research
02/15/2019

Bi-directional Value Learning for Risk-aware Planning Under Uncertainty

Decision-making under uncertainty is a crucial ability for autonomous sy...
research
03/14/2019

Reinforcement Learning with Dynamic Boltzmann Softmax Updates

Value function estimation is an important task in reinforcement learning...
