Multi-Objective Evolution for Generalizable Policy Gradient Algorithms

04/08/2022
by Juan Jose Garau Luis, et al.
Performance, generalizability, and stability are three Reinforcement Learning (RL) challenges that often arise together in practical applications. Still, state-of-the-art RL algorithms fall short when addressing multiple RL objectives simultaneously, and current human-driven design practices might not be well suited for multi-objective RL. In this paper we present MetaPG, an evolutionary method that discovers new RL algorithms represented as graphs, following a multi-objective search criterion in which different RL objectives are encoded in separate fitness scores. Our findings show that, when using a graph-based implementation of Soft Actor-Critic (SAC) to initialize the population, our method is able to find new algorithms that improve upon SAC's performance and generalizability by 3% and 17%, respectively, and reduce instability by up to 65%. In addition, we analyze the graph structure of the best algorithms in the population and offer an interpretation of specific elements that help trade performance for generalizability and vice versa. We validate our findings in three different continuous control tasks: RWRL Cartpole, RWRL Walker, and Gym Pendulum.
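To make the idea of "separate fitness scores" concrete, the following is a minimal sketch of Pareto-dominance filtering, one common way to compare candidates in a multi-objective evolutionary search. The abstract does not state which selection procedure MetaPG uses, so this is an illustrative assumption rather than the authors' method. The function names (`dominates`, `pareto_front`) and the example scores are hypothetical, and every objective is assumed to be scaled so that higher is better.

```python
from typing import List, Sequence, Tuple

# One score per objective, e.g. (performance, generalizability, stability).
# Assumption: all objectives are oriented so that higher is better.
Fitness = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: List[Fitness]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical fitness scores for four candidate algorithms (made-up numbers).
example_scores: List[Fitness] = [
    (0.90, 0.50, 0.70),
    (0.80, 0.80, 0.60),
    (0.70, 0.40, 0.50),  # worse than the first candidate on every objective
    (0.85, 0.60, 0.90),
]

if __name__ == "__main__":
    print(pareto_front(example_scores))
```

Candidates left on the front represent different trade-offs between the objectives, which is why a multi-objective search returns a set of algorithms rather than a single winner.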

research
03/27/2019

Generalized Off-Policy Actor-Critic

We propose a new objective, the counterfactual objective, unifying exist...
research
12/06/2022

Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots

Many real-world continuous control problems are in the dilemma of weighi...
research
02/08/2023

A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis

Many sequential decision-making problems need optimization of different ...
research
02/22/2022

Behaviour-Diverse Automatic Penetration Testing: A Curiosity-Driven Multi-Objective Deep Reinforcement Learning Approach

Penetration Testing plays a critical role in evaluating the security of ...
research
07/24/2020

Evolve To Control: Evolution-based Soft Actor-Critic for Scalable Reinforcement Learning

Advances in Reinforcement Learning (RL) have successfully tackled sample...
research
05/24/2019

Rethinking Expected Cumulative Reward Formalism of Reinforcement Learning: A Micro-Objective Perspective

The standard reinforcement learning (RL) formulation considers the expec...
research
03/02/2023

Reinforcement Learning Guided Multi-Objective Exam Paper Generation

To reduce the repetitive and complex work of instructors, exam paper gen...
