Physical Derivatives: Computing policy gradients by physical forward-propagation

01/15/2022
by Arash Mehrjou, et al.

Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamics model can be prohibitively expensive. Learning the dynamics model of a system can reduce the cost of learning the policy, but an inaccurate model introduces bias. We propose a middle ground in which, instead of the transition model, we learn the sensitivity of the trajectories to perturbations of the policy parameters. This allows us to predict the local behavior of the physical system around a set of nominal policies without knowing the actual model. We evaluate our method in extensive experiments on a custom-built physical robot and show that the approach is feasible in practice. We also investigate the challenges that arise when applying the method to physical systems and propose a solution to each of them.
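
To make the sensitivity idea in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a one-parameter policy and a toy scalar system, and it swaps in plain central finite differences as the sensitivity estimator. The function names (`rollout`, `estimate_sensitivity`, `predict`) and the dynamics constants are hypothetical. The learner only calls `rollout`, which stands in for running the physical system, and never reads the dynamics directly.

```python
def rollout(theta, x0=1.0, steps=10):
    """Stand-in for one run of the physical system under policy u = -theta * x.

    The constants a and b are hypothetical dynamics; the learner treats this
    function as a black box and never inspects them.
    """
    a, b = 0.9, 0.5
    xs = [x0]
    for _ in range(steps):
        u = -theta * xs[-1]           # policy with a single parameter theta
        xs.append(a * xs[-1] + b * u)
    return xs


def estimate_sensitivity(theta, delta=1e-3):
    """Estimate d x_t / d theta at every time step by central differences."""
    plus = rollout(theta + delta)
    minus = rollout(theta - delta)
    return [(p - m) / (2 * delta) for p, m in zip(plus, minus)]


def predict(theta_nominal, theta_new):
    """First-order prediction of the trajectory for a nearby policy."""
    nominal = rollout(theta_nominal)
    sens = estimate_sensitivity(theta_nominal)
    d_theta = theta_new - theta_nominal
    return [x + s * d_theta for x, s in zip(nominal, sens)]


theta0 = 0.4                          # nominal policy parameter (assumed)
sensitivity = estimate_sensitivity(theta0)
predicted = predict(theta0, 0.42)     # predicted from perturbed rollouts only
actual = rollout(0.42)                # ground truth, used only for comparison
max_error = max(abs(p - a) for p, a in zip(predicted, actual))
print(f"largest prediction error over the trajectory: {max_error:.2e}")
```

The prediction is only local: it is a first-order expansion around the nominal policy, so the error grows as `theta_new` moves away from `theta0`. On a real robot, repeated rollouts are noisy, so a plain finite difference like this one would need averaging or a more robust estimator.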

research
08/30/2022

Model-Based Reinforcement Learning with SINDy

We draw on the latest advancements in the physics community to propose a...
research
07/19/2013

Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation

The goal of reinforcement learning (RL) is to let an agent learn an opti...
research
03/25/2021

Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning

Substantial advancements to model-based reinforcement learning algorithm...
research
07/09/2022

Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning

In this work, we propose a method to generate reduced-order model refere...
research
10/15/2021

On-Policy Model Errors in Reinforcement Learning

Model-free reinforcement learning algorithms can compute policy gradient...
research
07/06/2018

A survey on policy search algorithms for learning robot controllers in a handful of trials

Most policy search algorithms require thousands of training episodes to ...
research
03/14/2021

Progressive residual learning for single image dehazing

The recent physical model-free dehazing methods have achieved state-of-t...
