Taylor Expansion Policy Optimization

03/13/2020
by Yunhao Tang, et al.

In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications that improve the performance of several state-of-the-art distributed algorithms.
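
To make the "first-order special case" concrete, here is a minimal sketch in standard notation (the notation is ours, not verbatim from the paper): write J(pi) for the discounted return, d_mu for the normalized discounted state-visitation distribution of a behavior policy mu, and A_mu for its advantage function. The performance-difference lemma, followed by a first-order approximation that replaces d_pi with d_mu, recovers the familiar TRPO-style surrogate:

  % Performance-difference lemma (exact):
  J(\pi) - J(\mu)
    = \frac{1}{1-\gamma}\,
      \mathbb{E}_{s \sim d_\pi}\, \mathbb{E}_{a \sim \pi(\cdot \mid s)}
      \big[ A_\mu(s, a) \big]
  % First-order approximation: replace d_\pi with d_\mu and
  % importance-weight the actions of the behavior policy \mu.
    \approx \frac{1}{1-\gamma}\,
      \mathbb{E}_{s \sim d_\mu}\, \mathbb{E}_{a \sim \mu(\cdot \mid s)}
      \Big[ \tfrac{\pi(a \mid s)}{\mu(a \mid s)}\, A_\mu(s, a) \Big]

Higher-order correction terms in the expansion of d_pi around d_mu yield higher-order objectives, and a trust-region constraint on the gap between pi and mu is what keeps the truncation error of such an expansion under control.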

Related research

10/26/2021 - Hinge Policy Optimization: Rethinking Policy Improvement and Reinterpreting PPO
Policy optimization is a fundamental principle for designing reinforcement...

03/30/2022 - Marginalized Operators for Off-policy Reinforcement Learning
In this work, we propose marginalized operators, a new class of off-policy...

06/11/2021 - Taylor Expansion of Discount Factors
In practical reinforcement learning (RL), the discount factor used for e...

04/02/2018 - Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments
In the NIPS 2017 Learning to Run challenge, participants were tasked with...

07/25/2023 - Offline Reinforcement Learning with On-Policy Q-Function Regularization
The core challenge of offline reinforcement learning (RL) is dealing with...

08/27/2023 - Distributional Off-Policy Evaluation for Slate Recommendations
Recommendation strategies are typically evaluated by using previously logged...

06/13/2012 - New Techniques for Algorithm Portfolio Design
We present and evaluate new techniques for designing algorithm portfolios...
