Optimality-based Analysis of XCSF Compaction in Discrete Reinforcement Learning

09/03/2020
by Jordan T. Bishop, et al.

Learning classifier systems (LCSs) are population-based predictive systems that were originally envisioned as agents acting in reinforcement learning (RL) environments. These systems can suffer from population bloat, which makes them amenable to compaction techniques that try to strike a balance between population size and performance. A well-studied LCS architecture is XCSF, which in the RL setting acts as a Q-function approximator. We apply XCSF to both a deterministic and a stochastic variant of the FrozenLake8x8 environment from OpenAI Gym, and compare its performance, in terms of function approximation error and policy accuracy, to the optimal Q-functions and policies produced by solving the environments via dynamic programming. We then introduce a novel compaction algorithm, Greedy Niche Mass Compaction (GNMC), and study its operation on XCSF's trained populations. Results show that, given a suitable parametrisation, GNMC preserves or even slightly improves function approximation error while yielding a significant reduction in population size. Policy accuracy is also reasonably well preserved, and we link this metric to the steps-to-goal metric commonly used in maze-like environments, illustrating how the two metrics are complementary rather than competitive.
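
The evaluation idea described above (solve the environment by dynamic programming, then score an approximate Q-function against the optimal one) can be illustrated with a minimal sketch. This is not the authors' code: the 4x4 map, the discount factor `GAMMA`, the deterministic `step` function, and the exact definitions of `mean_abs_error` and `policy_accuracy` are all assumptions chosen for illustration, and the paper's own environment is the 8x8 FrozenLake from OpenAI Gym.

```python
from itertools import product

# Hypothetical 4x4 map in the FrozenLake style (not the paper's 8x8 map):
# S = start, F = frozen, H = hole (terminal), G = goal (terminal, reward 1).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
# Actions as (d_row, d_col): 0 = left, 1 = down, 2 = right, 3 = up.
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]
GAMMA = 0.95  # assumed discount factor


def step(state, action):
    """Deterministic transition: returns (next_state, reward, terminal)."""
    n = len(GRID)
    dr, dc = ACTIONS[action]
    nr = min(max(state[0] + dr, 0), n - 1)  # moves off the grid are clipped
    nc = min(max(state[1] + dc, 0), n - 1)
    cell = GRID[nr][nc]
    return (nr, nc), (1.0 if cell == "G" else 0.0), cell in "HG"


def solve_q(tol=1e-10):
    """Value iteration on Q over all non-terminal states."""
    n = len(GRID)
    states = [s for s in product(range(n), repeat=2)
              if GRID[s[0]][s[1]] not in "HG"]
    q = {(s, a): 0.0 for s in states for a in range(4)}
    while True:
        delta = 0.0
        for s in states:
            for a in range(4):
                s2, reward, terminal = step(s, a)
                future = 0.0 if terminal else max(q[(s2, b)] for b in range(4))
                new = reward + GAMMA * future
                delta = max(delta, abs(new - q[(s, a)]))
                q[(s, a)] = new
        if delta < tol:
            return states, q


def mean_abs_error(q_hat, q_star):
    """Assumed error metric: mean absolute difference over (state, action)."""
    return sum(abs(q_hat[k] - q_star[k]) for k in q_star) / len(q_star)


def policy_accuracy(q_hat, q_star, states, eps=1e-9):
    """Assumed metric: fraction of states whose greedy action under q_hat
    is one of the optimal actions under q_star."""
    hits = 0
    for s in states:
        best = max(q_star[(s, a)] for a in range(4))
        greedy = max(range(4), key=lambda a: q_hat[(s, a)])
        hits += q_star[(s, greedy)] >= best - eps
    return hits / len(states)


states, q_star = solve_q()
# Stand-in for a learned approximator: the optimal Q shrunk by 10%.
q_hat = {k: 0.9 * v for k, v in q_star.items()}
print(mean_abs_error(q_hat, q_star), policy_accuracy(q_hat, q_star, states))
```

In this sketch the shrunken `q_hat` has non-zero approximation error but still induces an optimal greedy policy, which shows in miniature why the two metrics measure different things.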

