Adaptive Online Planning for Continual Lifelong Learning

12/03/2019
by Kevin Lu, et al.

We study learning control in an online lifelong learning scenario, where mistakes can compound catastrophically over time and the underlying dynamics of the environment may change. Traditional model-free policy learning methods have succeeded on difficult tasks thanks to their flexibility, and they condense broad experience into compact networks. They struggle in this setting, however: early in their lifetimes they can trigger failure modes that are difficult to recover from, and their performance degrades as the dynamics change. Model-based planning methods, on the other hand, learn and adapt quickly but require prohibitive levels of computation. Under a constrained computational budget, the agent must allocate its resources wisely, which requires it to understand both its own performance and the current state of the environment: when it knows that its mastery of control under the current dynamics is poor, it should dedicate more time to planning. We present a new algorithm, Adaptive Online Planning (AOP), that achieves strong performance in this setting by combining model-based planning with model-free learning. By measuring the performance of the planner and the uncertainty of the model-free components, AOP calls upon more extensive planning only when necessary, which reduces computation time. We show that AOP deals gracefully with novel situations, adapting behaviors and policies effectively in the face of unpredictable changes in the world (challenges that a continual learning agent naturally faces over an extended lifetime), even when traditional reinforcement learning methods fail.
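The idea of planning more only when the agent's learned components look unreliable can be illustrated with a small sketch. This is not the authors' implementation: the function name `choose_planning_horizon`, the use of an ensemble of value estimates as the uncertainty signal, the Bellman-error input, and every threshold and horizon value are hypothetical choices made for illustration only.

```python
import statistics


def choose_planning_horizon(value_estimates, bellman_error,
                            std_threshold=0.5, error_threshold=0.1,
                            short_horizon=1, long_horizon=20):
    """Pick a planning horizon from two signals (all defaults are hypothetical).

    value_estimates: predictions of the current state's value from several
        independently trained value functions (an assumed ensemble).
    bellman_error: an assumed scalar measure of how inconsistent the learned
        value is with recently observed returns.
    """
    # Disagreement across the ensemble serves as the uncertainty estimate.
    uncertainty = statistics.pstdev(value_estimates)

    # Plan extensively only when the model-free components look unreliable.
    if uncertainty > std_threshold or abs(bellman_error) > error_threshold:
        return long_horizon
    return short_horizon


# Confident and consistent estimates: a short horizon is enough.
print(choose_planning_horizon([1.0, 1.0, 1.0], bellman_error=0.0))
# Disagreeing estimates: fall back to extensive planning.
print(choose_planning_horizon([0.0, 2.0, 4.0], bellman_error=0.0))
```

A hard threshold keeps the sketch short; a real agent could just as well scale the horizon smoothly with the measured uncertainty.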


