
Continuous control with deep reinforcement learning

by Timothy P. Lillicrap, et al.

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
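The abstract names the ingredients (an actor-critic, model-free method built on the deterministic policy gradient, borrowing ideas from Deep Q-Learning) without giving the update rules. As a rough illustration, the sketch below shows three pieces commonly associated with this family of algorithms: a bootstrapped critic target, a slowly tracking target network, and the chain-rule form of the deterministic policy gradient. All function names, the scalar-action simplification, and the default values of `gamma` and `tau` are assumptions made for this example, not details taken from the text above.

```python
# Illustrative sketch only: names, the scalar-action simplification, and the
# default constants are assumptions, not details stated in the abstract.

def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: r + gamma * Q'(s', mu'(s')), or just r at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.001):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


def actor_gradient(dq_da, dmu_dtheta):
    """Deterministic policy gradient for a scalar action.

    dq_da      : dQ/da evaluated at a = mu(s)
    dmu_dtheta : list of d mu(s) / d theta_i, one entry per actor parameter
    Returns the chain-rule product dQ/da * dmu/dtheta_i for each parameter.
    """
    return [dq_da * g for g in dmu_dtheta]


if __name__ == "__main__":
    # One made-up transition: reward 1.0, target critic estimate 2.0 at the next state.
    y = td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5)
    print("critic target:", y)
    print("target params:", soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    print("actor gradient:", actor_gradient(2.0, [1.0, -0.5]))
```

In a full agent these quantities would come from neural networks and minibatches drawn from stored experience; plain floats are used here only to keep the arithmetic visible.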


Deep Reinforcement Learning for Autonomous Driving

Reinforcement learning has steadily improved and outperforms humans in lot...

Continuous Control for Searching and Planning with a Learned Model

Decision-making agents with planning capabilities have achieved huge suc...

End-to-End Race Driving with Deep Reinforcement Learning

We present research using the latest reinforcement learning algorithm fo...

Efficiently Learning Small Policies for Locomotion and Manipulation

Neural control of memory-constrained, agile robots requires small, yet h...

A Convergent and Efficient Deep Q Network Algorithm

Despite the empirical success of the deep Q network (DQN) reinforcement ...

Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

Policy optimization methods have shown great promise in solving complex ...

QMDP-Net: Deep Learning for Planning under Partial Observability

This paper introduces the QMDP-net, a neural network architecture for pl...

Code Repositories


Reimplementation of DDPG (Continuous Control with Deep Reinforcement Learning) based on OpenAI Gym + TensorFlow



Continuous control with deep reinforcement learning - Deep Deterministic Policy Gradient (DDPG) algorithm implemented in OpenAI Gym environments



Deterministic Policy Gradient using torch7



Using deep reinforcement learning (DDPG & A3C) to solve Acrobot



DDPG on OpenAI Gym Pendulum
