Continuous control with deep reinforcement learning

09/09/2015
by Timothy P. Lillicrap, et al.

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
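To make the actor-critic idea above concrete, here is a minimal sketch of a deterministic policy gradient update on a toy problem. It is not the authors' implementation: the abstract describes deep networks on simulated physics tasks, whereas this sketch assumes a linear actor mu(s) = theta * s, a quadratic critic, and a hypothetical one-step task whose reward is -(a - 0.5 * s)^2. Because the task lasts one step, the critic regresses directly onto the reward, with no bootstrapped target. All function names, learning rates, and bounds are illustrative assumptions.

```python
import random


def features(s, a):
    """Quadratic critic features, so Q(s, a) = w . [s*s, s*a, a*a]."""
    return [s * s, s * a, a * a]


def q_value(w, s, a):
    """Critic estimate of the value of taking action a in state s."""
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))


def critic_step(w, s, a, r, lr):
    """Move Q(s, a) toward the observed reward (one-step task, no bootstrapping)."""
    err = r - q_value(w, s, a)
    return [wi + lr * err * fi for wi, fi in zip(w, features(s, a))]


def actor_step(theta, w, s, lr):
    """Deterministic policy gradient step for the linear actor mu(s) = theta * s."""
    a = theta * s                      # the actor's own action, without noise
    dq_da = w[1] * s + 2.0 * w[2] * a  # gradient of the critic w.r.t. the action
    dmu_dtheta = s                     # gradient of the actor w.r.t. its parameter
    return theta + lr * dq_da * dmu_dtheta


def train(steps=20000, seed=0, lr_critic=0.05, lr_actor=0.05):
    """Run the toy actor-critic loop and return the learned (theta, w)."""
    rng = random.Random(seed)
    theta, w = 0.0, [0.0, 0.0, 0.0]
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)                  # hypothetical 1-D state
        noise = rng.uniform(-1.0, 1.0)              # exploration noise on the action
        a = max(-2.0, min(2.0, theta * s + noise))  # bounded continuous action
        r = -(a - 0.5 * s) ** 2                     # toy reward, best action is 0.5 * s
        w = critic_step(w, s, a, r, lr_critic)
        theta = actor_step(theta, w, s, lr_actor)
    return theta, w
```

Calling `train()` returns the actor parameter and the critic weights. On this toy task the actor parameter should approach 0.5, the slope of the best action, because the actor only ever follows the critic's action gradient and never inspects the reward function itself.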

research
11/28/2018

Deep Reinforcement Learning for Autonomous Driving

Reinforcement learning has steadily improved and outperform human in lot...
research
06/12/2020

Continuous Control for Searching and Planning with a Learned Model

Decision-making agents with planning capabilities have achieved huge suc...
research
07/06/2018

End-to-End Race Driving with Deep Reinforcement Learning

We present research using the latest reinforcement learning algorithm fo...
research
09/30/2022

Efficiently Learning Small Policies for Locomotion and Manipulation

Neural control of memory-constrained, agile robots requires small, yet h...
research
06/29/2021

A Convergent and Efficient Deep Q Network Algorithm

Despite the empirical success of the deep Q network (DQN) reinforcement ...
research
11/21/2017

Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

Policy optimization methods have shown great promise in solving complex ...
research
03/20/2017

QMDP-Net: Deep Learning for Planning under Partial Observability

This paper introduces the QMDP-net, a neural network architecture for pl...
