Related research:

- Wasserstein Reinforcement Learning: We propose behavior-driven optimization via Wasserstein distances (WDs) ...
- Stochastically Dominant Distributional Reinforcement Learning: We describe a new approach for mitigating risk in the Reinforcement Learning ...
- Efficient Wasserstein Natural Gradients for Reinforcement Learning: A novel optimization approach is proposed for application to policy gradient ...
- Gradient Flows in Dataset Space: The current practice in machine learning is traditionally model-centric ...
- The Importance of Pessimism in Fixed-Dataset Policy Optimization: We study worst-case guarantees on the expected return of fixed-dataset policy ...
- On Wasserstein Reinforcement Learning and the Fokker-Planck equation: Policy gradient methods often achieve better performance when the change ...
- Regularization Matters in Policy Optimization: Deep Reinforcement Learning (Deep RL) has been receiving increasingly more ...
Policy Optimization as Wasserstein Gradient Flows
Policy optimization is a core component of reinforcement learning (RL): most existing RL methods directly optimize the parameters of a policy by maximizing the expected total reward or a surrogate of it. Though these methods often achieve encouraging empirical success, the mathematical principles underlying optimization over policy distributions remain unclear. We place policy optimization in the space of probability measures and interpret it as a Wasserstein gradient flow. In this space, under suitable conditions, policy optimization becomes a convex problem of distribution optimization. To make the optimization feasible, we develop efficient algorithms that numerically solve the corresponding discrete gradient flows. Our technique applies to several RL settings and is related to many state-of-the-art policy-optimization algorithms. Empirical results verify the effectiveness of our framework, which often achieves better performance than related algorithms.
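The "discrete gradient flows" mentioned in the abstract presumably refer to the standard Jordan-Kinderlehrer-Otto (JKO) time discretization of a Wasserstein gradient flow, in which each step solves a proximal problem in the 2-Wasserstein metric:

    mu_{k+1} = argmin_mu [ F(mu) + 1/(2*tau) * W_2(mu, mu_k)^2 ]

Here F is an energy functional on policy distributions (e.g., negative expected reward plus a regularizer) and tau is the step size. As an illustrative sketch only, not the paper's actual algorithm, the Python snippet below approximates one JKO step with a particle ensemble, crudely replacing the W_2 term by the squared distance of each particle to its own previous position; the names jko_step, grad_F, tau, inner_steps, and lr are all hypothetical.

    import numpy as np

    def jko_step(particles, grad_F, tau=0.5, inner_steps=100, lr=0.01):
        # One approximate JKO update:
        #   mu_{k+1} = argmin_mu  F(mu) + 1/(2*tau) * W_2(mu, mu_k)^2
        # The W_2 term is approximated by coupling each particle to its
        # own previous position (an assumption, not an exact OT plan).
        prev = particles.copy()
        x = particles.copy()
        for _ in range(inner_steps):
            # functional gradient of F at the particles, plus the proximal pull
            g = grad_F(x) + (x - prev) / tau
            x = x - lr * g
        return x

    # Toy usage: flow toward the minimizer of F(mu) = E_mu[||x||^2 / 2],
    # whose particle-wise gradient is simply x itself.
    rng = np.random.default_rng(0)
    x = rng.normal(loc=5.0, scale=2.0, size=(256, 2))
    for _ in range(20):
        x = jko_step(x, grad_F=lambda p: p)
    print(x.mean(axis=0))  # the ensemble mean drifts toward the origin

Under this crude coupling the inner problem decouples per particle, which is what keeps the sketch cheap; a faithful implementation would instead estimate the optimal transport cost (e.g., with entropic regularization), since particle couplings generally change along the flow.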