Continuous Control for Searching and Planning with a Learned Model

06/12/2020
by Xuxi Yang, et al.

Decision-making agents with planning capabilities have achieved great success in challenging domains such as chess, shogi, and Go. To generalize this planning ability to tasks where the environment dynamics are not available to the agent, researchers proposed the MuZero algorithm, which learns a dynamics model through interaction with the environment. In this paper, we provide a method, together with the necessary theoretical results, to extend MuZero to environments with continuous action spaces. Through numerical results on two relatively low-dimensional MuJoCo environments, we show that the proposed algorithm outperforms soft actor-critic (SAC), a state-of-the-art model-free deep reinforcement learning algorithm.

research
11/13/2020

Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning

Constructing agents with planning capabilities has long been one of the ...
research
12/06/2017

A Novel Model for Arbitration between Planning and Habitual Control Systems

It is well established that humans decision making and instrumental cont...
research
09/09/2015

Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the cont...
research
06/30/2023

λ-AC: Learning latent decision-aware models for reinforcement learning in continuous state-spaces

The idea of decision-aware model learning, that models should be accurat...
research
06/16/2021

Solving Continuous Control with Episodic Memory

Episodic memory lets reinforcement learning algorithms remember and expl...
research
12/01/2014

Game-theoretical control with continuous action sets

Motivated by the recent applications of game-theoretical learning techni...
research
02/18/2021

Learning Memory-Dependent Continuous Control from Demonstrations

Efficient exploration has presented a long-standing challenge in reinfor...