The Differentiable Cross-Entropy Method

09/27/2019
by Brandon Amos, et al.

We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous, parameterized objective function and introduce a differentiable variant (DCEM) that lets us differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show on the simulated cheetah and walker tasks that we can embed their optimal action sequences with DCEM and then use policy optimization to fine-tune components of the controller, as a step towards combining model-based and model-free RL.
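
For readers unfamiliar with CEM, the sketch below shows the standard, non-differentiable method that the abstract starts from: sample candidates from a Gaussian, keep the lowest-cost "elite" samples, and refit the Gaussian to them. It is a minimal illustration and not the paper's DCEM. The names `cem_minimize` and `objective`, the parameter `theta`, the diagonal Gaussian sampler, and all hyperparameter values are assumptions made for this example. The hard elite selection via `argsort` is what makes the output non-differentiable with respect to the objective's parameters.

```python
import numpy as np


def cem_minimize(f, dim, n_iter=30, n_samples=100, n_elite=10, seed=0):
    """Vanilla CEM with a diagonal Gaussian sampler (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)      # assumed initial mean
    sigma = np.ones(dim)    # assumed initial standard deviation
    for _ in range(n_iter):
        # Sample candidates from the current sampling distribution.
        x = mu + sigma * rng.standard_normal((n_samples, dim))
        # Evaluate the objective and keep the n_elite lowest-cost samples.
        costs = np.array([f(xi) for xi in x])
        elite = x[np.argsort(costs)[:n_elite]]
        # Refit the Gaussian to the elite set; the small floor keeps sigma positive.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-6
    return mu


def objective(x, theta=0.5):
    # Hypothetical parameterized, non-convex objective; theta plays the role
    # of the objective function's parameters mentioned in the abstract.
    u = x - theta
    return np.sum(u ** 2) + 0.5 * np.sum(np.sin(5.0 * u) ** 2)


x_hat = cem_minimize(objective, dim=2)
print("CEM estimate:", x_hat)
```

Because the objective has several local minima, the estimate can depend on the seed and on the initial sampling distribution.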

research
06/22/2020

Non-convex Optimization via Adaptive Stochastic Search for End-to-End Learning and Control

In this work we propose the use of adaptive stochastic search as a build...
research
06/09/2020

Variational Model-based Policy Optimization

Model-based reinforcement learning (RL) algorithms allow us to combine m...
research
02/24/2021

Safe Learning-based Gradient-free Model Predictive Control Based on Cross-entropy Method

In this paper, a safe and learning-based control framework for model pre...
research
05/27/2021

Enhancing the performance of a bistable energy harvesting device via the cross-entropy method

This work deals with the solution of a non-convex optimization problem t...
research
09/18/2020

Cross-Entropy Method Variants for Optimization

The cross-entropy (CE) method is a popular stochastic method for optimiz...
research
12/14/2021

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

Current state-of-the-art model-based reinforcement learning algorithms u...
research
03/16/2017

End-to-End Learning for Structured Prediction Energy Networks

Structured Prediction Energy Networks (SPENs) are a simple, yet expressi...
