
Taylor Expansion Policy Optimization

03/13/2020
by Yunhao Tang, et al.

In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
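To make the "first-order special case" concrete, here is a minimal sketch (not the paper's implementation; all function names, the toy MDP, and its numbers are illustrative assumptions) of the first-order Taylor expansion of the return J(pi) around a behavior policy mu in a tabular MDP. By the policy gradient theorem, that expansion is the familiar TRPO-style surrogate L(pi) = J(mu) + sum_s d^mu(s) sum_a (pi - mu)(a|s) Q^mu(s,a), where d^mu is the (unnormalized) discounted state occupancy.

```python
import numpy as np

def mdp_quantities(P, R, pi, gamma, d0):
    """Exact values and discounted occupancy for a tabular MDP.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    pi: (S, A) policy, d0: (S,) initial state distribution.
    """
    S = P.shape[0]
    P_pi = np.einsum('sa,sat->st', pi, P)   # state-to-state kernel under pi
    r_pi = (pi * R).sum(axis=1)             # expected one-step reward per state
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    Q = R + gamma * np.einsum('sat,t->sa', P, V)
    # Unnormalized discounted occupancy solves d = d0 + gamma * P_pi^T d.
    d = np.linalg.solve(np.eye(S) - gamma * P_pi.T, d0)
    J = d0 @ V                              # expected discounted return
    return Q, d, J

def first_order_surrogate(P, R, mu, pi, gamma, d0):
    """First-order Taylor expansion of J(pi) around the behavior policy mu."""
    Q_mu, d_mu, J_mu = mdp_quantities(P, R, mu, gamma, d0)
    return J_mu + (d_mu[:, None] * (pi - mu) * Q_mu).sum()

# Toy 2-state, 2-action MDP (numbers chosen arbitrarily for illustration).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0], [0.5, 2.0]])
d0 = np.array([1.0, 0.0])
gamma = 0.9
mu = np.full((2, 2), 0.5)
D = np.array([[1.0, -1.0], [-1.0, 1.0]])    # rows sum to 0: pi stays a policy

for eps in (0.01, 0.02):
    pi = mu + eps * D
    _, _, J_pi = mdp_quantities(P, R, pi, gamma, d0)
    L_pi = first_order_surrogate(P, R, mu, pi, gamma, d0)
    print(f"eps={eps}: |J(pi) - L(pi)| = {abs(J_pi - L_pi):.2e}")
```

Because L matches J to first order, the approximation error above shrinks roughly quadratically as eps is halved; higher-order expansions (the paper's generalization) would tighten it further.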
