Novel Policy Seeking with Constrained Optimization

05/21/2020
by Hao Sun, et al.

In this work, we address the problem of learning novel policies in reinforcement learning tasks. Instead of following the multi-objective framework used in previous methods, we recast the problem as one of constrained optimization. We first introduce a new metric for evaluating the difference between policies, and then design two practical novel-policy-seeking methods from this perspective: the Constrained Task Novel Bisector (CTNB) and Interior Policy Differentiation (IPD), which correspond to the feasible direction method and the interior point method from constrained optimization, respectively. Experimental comparisons on the MuJoCo control suite show that our methods achieve substantial improvements over previous novelty-seeking methods in terms of both novelty and primal task performance.
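
To make the contrast between the two framings concrete, the following is a rough sketch in generic textbook notation. Every symbol here (J, D, \lambda, r_0, \mu) is an assumption introduced for illustration and is not taken from the paper; in particular, the last line is the standard log-barrier form of an interior point method and may differ from how IPD is actually realized.

```latex
% Illustrative notation (assumed, not taken from the paper):
%   J(\theta)            expected return of policy \pi_\theta on the primal task
%   D(\theta, \theta_j)  difference between \pi_\theta and an existing policy \pi_{\theta_j}
%   \lambda, r_0, \mu    trade-off weight, novelty threshold, barrier coefficient

% Multi-objective form: task return and novelty are mixed by a weight.
\max_{\theta}\; (1-\lambda)\, J(\theta) + \lambda \sum_{j} D(\theta, \theta_j),
\qquad \lambda \in [0, 1]

% Constrained form: novelty is a requirement, not a competing objective.
\max_{\theta}\; J(\theta)
\quad \text{s.t.} \quad D(\theta, \theta_j) \ge r_0 \;\; \text{for all } j

% Textbook log-barrier (interior point) surrogate of the constrained form.
\max_{\theta}\; J(\theta) + \mu \sum_{j} \log\bigl( D(\theta, \theta_j) - r_0 \bigr),
\qquad \mu > 0
```

Under these assumptions, the constrained form removes the need to tune a trade-off weight: once the novelty threshold is met, the optimizer is free to pursue task return alone.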


research
06/19/2019

QXplore: Q-learning Exploration by Maximizing Temporal Difference Error

A major challenge in reinforcement learning for continuous state-action ...
research
04/20/2022

Memory-Constrained Policy Optimization

We introduce a new constrained optimization method for policy gradient r...
research
01/28/2022

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning (RL) aims to learn policies that satisfy cer...
research
02/28/2016

Investigating practical linear temporal difference learning

Off-policy reinforcement learning has many applications including: learn...
research
06/14/2021

Variational Policy Search using Sparse Gaussian Process Priors for Learning Multimodal Optimal Actions

Policy search reinforcement learning has been drawing much attention as ...
research
12/30/2022

Superiorization: The asymmetric roles of feasibility-seeking and objective function reduction

The superiorization methodology can be thought of as lying conceptually ...
research
02/20/2019

World Discovery Models

As humans we are driven by a strong desire for seeking novelty in our wo...
