Riemannian Proximal Policy Optimization

05/19/2020
by Shijun Wang, et al.

In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence for solving Markov decision process (MDP) problems. To model policy functions in MDPs, we employ a Gaussian mixture model (GMM) and formulate the resulting policy search as a nonconvex optimization problem on the Riemannian manifold of positive semidefinite matrices. For two given policy functions, we also provide a lower bound on the policy improvement, derived from bounds on the Wasserstein distance between GMMs. Preliminary experiments show the efficacy of the proposed Riemannian proximal policy optimization algorithm.
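As a concrete illustration of the Wasserstein machinery mentioned in the abstract (this is a minimal sketch, not the authors' implementation), the code below computes the closed-form 2-Wasserstein distance between two Gaussians and the standard component-level optimal-transport bound between two GMMs. The function names, the use of SciPy's linear-programming solver, and the component parameters are illustrative choices made here, not details taken from the paper.

```python
# Sketch only: closed-form W2 between Gaussians and a component-level
# transport bound between GMMs, as commonly used in the literature.
import numpy as np
from scipy.linalg import sqrtm
from scipy.optimize import linprog

def gaussian_w2_sq(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    S2_half = sqrtm(S2)
    cross = sqrtm(S2_half @ S1 @ S2_half)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * np.real(cross)))

def gmm_w2_sq_upper_bound(weights1, means1, covs1, weights2, means2, covs2):
    """Upper bound on the squared W2 distance between two GMMs via a
    discrete optimal-transport problem over mixture components."""
    n, m = len(weights1), len(weights2)
    # Pairwise squared W2 costs between Gaussian components.
    C = np.array([[gaussian_w2_sq(means1[i], covs1[i], means2[j], covs2[j])
                   for j in range(m)] for i in range(n)])
    # Transport LP: minimize <C, P> subject to row sums = weights1, column sums = weights2.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([weights1, weights2])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun
```

A transport plan restricted to mixture components induces a valid coupling between the two mixtures, so the quantity returned by `gmm_w2_sq_upper_bound` upper-bounds the true squared 2-Wasserstein distance between the GMMs; bounds of this kind are what the abstract refers to when lower-bounding the improvement between two GMM policies.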

Related research

05/19/2020  A Riemannian Primal-dual Algorithm Based on Proximal Operator and its Application in Metric Learning
In this paper, we consider optimizing a smooth, convex, lower semicontin...

12/12/2019  Provably Efficient Exploration in Policy Optimization
While policy-based reinforcement learning (RL) achieves tremendous succe...

10/28/2021  Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes
In applications of offline reinforcement learning to observational data,...

07/05/2020  Mission schedule of agile satellites based on Proximal Policy Optimization Algorithm
Mission schedule of satellites is an important part of space operation n...

07/02/2022  Geometric Learning of Hidden Markov Models via a Method of Moments Algorithm
We present a novel algorithm for learning the parameters of hidden Marko...

05/21/2018  A universal framework for learning based on the elliptical mixture model (EMM)
An increasing prominence of unbalanced and noisy data highlights the imp...

05/17/2023  Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies
Robots often rely on a repertoire of previously-learned motion policies ...
