Maximum Entropy Differential Dynamic Programming

10/13/2021
by Oswin So, et al.

In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming (DDP) algorithm and derive two variants, one using a unimodal and one using a multimodal value function parameterization. By combining the maximum entropy Bellman equations with a particular approximation of the cost function, we obtain a new formulation of DDP that can escape local minima by exploring with a multimodal policy. To demonstrate the efficacy of the proposed algorithm, we provide experimental results on four systems, using tasks whose cost functions have multiple local minima, and compare against vanilla DDP. Furthermore, we discuss connections with previous work on the linearly solvable stochastic control framework and its extensions in relation to compositionality.
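
As a rough illustration of the maximum entropy Bellman backup mentioned in the abstract, the sketch below computes the standard soft-min value and the resulting Boltzmann policy for a finite set of candidate controls. It is a toy under stated assumptions, not the authors' algorithm: the paper works with continuous controls and DDP-style approximations, whereas here the controls are discretized, and the function name `soft_min_backup`, the cost values, and the temperature `alpha` are all hypothetical.

```python
import math


def soft_min_backup(q_values, alpha):
    """Maximum entropy (soft-min) backup over a finite set of controls.

    q_values: cost-to-go Q(x, u) for each candidate control u (illustrative inputs).
    alpha: temperature weighting the entropy term; must be positive.

    Returns (value, policy), where
      value  = -alpha * log(sum_u exp(-Q(x, u) / alpha))
      policy = exp(-Q(x, u) / alpha) normalized over u.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    q_min = min(q_values)  # shift by the minimum for numerical stability
    weights = [math.exp(-(q - q_min) / alpha) for q in q_values]
    total = sum(weights)
    value = q_min - alpha * math.log(total)
    policy = [w / total for w in weights]
    return value, policy


# Two equally good controls and one poor one (hypothetical costs):
# the policy keeps probability mass on both good controls.
value, policy = soft_min_backup([1.0, 1.0, 5.0], alpha=0.5)
```

With two equally cheap controls, the policy splits its mass evenly between them instead of committing to one, which is the kind of multimodality the abstract relies on for exploration. As `alpha` shrinks toward zero, the soft-min value approaches the ordinary minimum cost.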


research
01/30/2022

Multimodal Maximum Entropy Dynamic Games

Environments with multi-agent interactions often result in a rich set of mo...
research
01/18/2023

DDPEN: Trajectory Optimisation With Sub Goal Generation Model

Differential dynamic programming (DDP) is a widely used and powerful tra...
research
05/14/2018

Maximum Entropy Interval Aggregations

Given a probability distribution p = (p_1, ..., p_n) and an integer 1≤ ...
research
04/07/2022

Parameterized Differential Dynamic Programming

Differential Dynamic Programming (DDP) is an efficient trajectory optimi...
research
09/14/2021

Performance of a Markovian neural network versus dynamic programming on a fishing control problem

Fishing quotas are unpleasant but efficient to control the productivity ...
research
08/29/2018

Victory Probability in the Fire Emblem Arena

We demonstrate how to efficiently compute the probability of victory in ...
research
12/02/2021

A Practical Dynamic Programming Approach to Datalog Provenance Computation

We establish a translation between a formalism for dynamic programming o...
