Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem

10/03/2018
by Savinay Nagendra, et al.

Designing optimal controllers remains challenging as systems grow more complex and inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from interaction with the environment and provide an optimal control strategy. In this paper, RL is explored in the context of controlling the benchmark cart-pole dynamical system with no prior knowledge of its dynamics. RL algorithms such as temporal-difference learning, policy-gradient actor-critic, and value function approximation are compared in this context with the standard LQR solution. Further, we propose a novel approach to integrate RL and swing-up controllers.
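As an illustration of the temporal-difference family compared in the paper, the sketch below runs tabular Q-learning on a minimal cart-pole simulator. This is not the paper's implementation: the dynamics use the standard Barto-Sutton cart-pole parameters (cart mass 1.0 kg, pole mass 0.1 kg, half-length 0.5 m, force 10 N, time step 0.02 s), and the state discretization, bin counts, and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Standard cart-pole constants (assumed; the paper's exact setup is not given here).
GRAVITY, M_CART, M_POLE, LENGTH = 9.8, 1.0, 0.1, 0.5
FORCE_MAG, TAU = 10.0, 0.02

def step(state, action):
    """One Euler integration step; action is 0 (push left) or 1 (push right)."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    total_mass = M_CART + M_POLE
    pole_mass_length = M_POLE * LENGTH
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    temp = (force + pole_mass_length * theta_dot**2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        LENGTH * (4.0 / 3.0 - M_POLE * cos_t**2 / total_mass))
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass
    state = np.array([x + TAU * x_dot, x_dot + TAU * x_acc,
                      theta + TAU * theta_dot, theta_dot + TAU * theta_acc])
    # Episode ends when the cart leaves the track or the pole falls past 12 degrees.
    done = abs(state[0]) > 2.4 or abs(state[2]) > 12 * np.pi / 180
    return state, 1.0, done  # reward of 1 for every step the pole stays up

def discretize(state, bins):
    """Map the continuous 4-D state to a tuple of bin indices (bounds assumed)."""
    bounds = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]
    return tuple(int(np.digitize(s, np.linspace(lo, hi, bins - 1)))
                 for s, (lo, hi) in zip(state, bounds))

def q_learning(episodes=200, bins=6, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns Q and episode returns."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((bins,) * 4 + (2,))
    returns = []
    for _ in range(episodes):
        state, total = np.zeros(4), 0.0
        s = discretize(state, bins)
        for _ in range(500):
            a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
            state, r, done = step(state, a)
            s2 = discretize(state, bins)
            # Temporal-difference update toward the bootstrapped one-step target.
            target = r + (0.0 if done else gamma * np.max(Q[s2]))
            Q[s + (a,)] += alpha * (target - Q[s + (a,)])
            s, total = s2, total + r
            if done:
                break
        returns.append(total)
    return Q, returns
```

The same simulator could back the other methods the paper compares: an actor-critic would replace the Q-table with separate policy and value estimates, while an LQR controller would act on the dynamics linearized about the upright equilibrium.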
