Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning

11/13/2020
by Jiajun Fan, et al.

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods, from AlphaGo to MuZero, have enjoyed huge success in discrete domains such as chess and Go. Unfortunately, in real-world applications such as robot control and inverted-pendulum balancing, where the action space is normally continuous, those tree-based planning techniques struggle. To address these limitations, we present a novel model-based reinforcement learning framework called Critic PI2, which combines the benefits of trajectory optimization, deep actor-critic learning, and model-based reinforcement learning. Our method is evaluated on inverted pendulum models, with applicability to many continuous control systems. Extensive experiments demonstrate that Critic PI2 achieves a new state of the art in a range of challenging continuous domains. Furthermore, we show that planning with a critic significantly increases sample efficiency and real-time performance. Our work opens a new direction toward learning the components of a model-based planning system and how to use them.
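The abstract does not spell out the planning procedure, so the following is only a minimal sketch of the general idea of path-integral planning with a critic, not the authors' algorithm. It assumes a generic PI2-style update: sample noisy action sequences around a mean plan, roll them out through a model, score each rollout by its summed reward plus a critic estimate of the final state, and move the mean toward the exponentially weighted average of the samples. All names here (`pi2_plan`, `toy_model`, `toy_reward`, `toy_critic`) and all parameter values are hypothetical, and the model and critic are hand-written toy functions rather than learned networks.

```python
import math
import random


def pi2_plan(state, model, reward, critic, horizon=5, n_samples=256,
             noise_std=0.5, temperature=1.0, n_iters=3, seed=0):
    """Return a mean action sequence refined by PI2-style weighted averaging."""
    rng = random.Random(seed)
    mean = [0.0] * horizon  # initial plan: do nothing
    for _ in range(n_iters):
        samples, returns = [], []
        for _ in range(n_samples):
            # Perturb the current plan with Gaussian noise, clipped to [-1, 1].
            actions = [max(-1.0, min(1.0, m + rng.gauss(0.0, noise_std)))
                       for m in mean]
            s, total = state, 0.0
            for a in actions:
                s = model(s, a)          # one-step model prediction
                total += reward(s, a)    # accumulate predicted reward
            total += critic(s)           # critic stands in for the return beyond the horizon
            samples.append(actions)
            returns.append(total)
        # Exponential (softmax) weights; subtract the max for numerical stability.
        best = max(returns)
        weights = [math.exp((r - best) / temperature) for r in returns]
        z = sum(weights)
        # New plan is the weighted average of the sampled action sequences.
        mean = [sum(w * acts[t] for w, acts in zip(weights, samples)) / z
                for t in range(horizon)]
    return mean


# Hypothetical 1-D toy problem: drive a point at x = 1.0 toward the origin.
def toy_model(x, a):
    return x + 0.1 * a


def toy_reward(x, a):
    return -x * x


def toy_critic(x):
    return -x * x


plan = pi2_plan(1.0, toy_model, toy_reward, toy_critic)
```

In this toy setting the planned actions push the state toward the origin. The critic term lets a short horizon account for reward beyond the last simulated step, which is one plausible reading of why planning with a critic could help sample efficiency and real-time performance.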

