Interpretable Policies for Reinforcement Learning by Genetic Programming

12/12/2017
by Daniel Hein, et al.

The search for interpretable reinforcement learning policies is of high academic and industrial interest. Especially for industrial systems, domain experts are more likely to deploy autonomously learned controllers if these are understandable and convenient to evaluate. Basic algebraic equations are expected to meet these requirements, as long as their complexity is suitably restricted. Here we introduce the genetic programming for reinforcement learning (GPRL) approach, which is based on model-based batch reinforcement learning and genetic programming and autonomously learns policy equations from pre-existing default state-action trajectory samples. GPRL is compared to a straightforward method that uses genetic programming for symbolic regression, yielding policies that imitate an existing well-performing but non-interpretable policy. Experiments on three reinforcement learning benchmarks (mountain car, cart-pole balancing, and the industrial benchmark) demonstrate the superiority of GPRL over the symbolic regression method. GPRL is capable of producing well-performing interpretable reinforcement learning policies from pre-existing default trajectory data.
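The abstract does not spell out how a candidate policy equation is scored, so the following is only a rough sketch of the model-based idea: each candidate equation is rolled out on a transition model and ranked by its discounted return. Everything named here is an assumption for illustration, not the authors' implementation: `toy_model` is a hand-written stand-in (loosely following textbook mountain-car dynamics) for a model that would be learned from batch data, the two candidate equations are hand-picked rather than evolved, and `HORIZON`, `GAMMA`, and `START_STATES` are arbitrary choices. The genetic programming search itself is not shown.

```python
import math

HORIZON = 300  # rollout length (assumed value)
GAMMA = 0.99   # discount factor (assumed value)

# Hypothetical start states: (position, velocity)
START_STATES = [(-0.6, 0.0), (-0.5, 0.0), (-0.4, 0.0)]


def toy_model(state, action):
    """Hypothetical stand-in for a transition model learned from batch data.

    Loosely follows textbook mountain-car dynamics; returns
    (next_state, reward, done).
    """
    position, velocity = state
    action = max(-1.0, min(1.0, action))  # clip the action to [-1, 1]
    velocity += 0.001 * action - 0.0025 * math.cos(3.0 * position)
    velocity = max(-0.07, min(0.07, velocity))
    position = max(-1.2, min(0.6, position + velocity))
    if position <= -1.2 and velocity < 0.0:
        velocity = 0.0  # the left wall stops the car
    done = position >= 0.5
    reward = 0.0 if done else -1.0
    return (position, velocity), reward, done


# Two hand-written candidate policy equations (illustrative only).
def always_right(position, velocity):
    return 1.0


def push_along_velocity(position, velocity):
    return 1.0 if velocity >= 0.0 else -1.0


def fitness(policy):
    """Average discounted return of `policy` over model rollouts."""
    total = 0.0
    for state in START_STATES:
        discount = 1.0
        for _ in range(HORIZON):
            state, reward, done = toy_model(state, policy(*state))
            total += discount * reward
            discount *= GAMMA
            if done:
                break
    return total / len(START_STATES)


candidates = {
    "always_right": always_right,
    "push_along_velocity": push_along_velocity,
}
best_name = max(candidates, key=lambda name: fitness(candidates[name]))
print(best_name, fitness(candidates[best_name]))
```

In a genetic programming setting, a score of this kind would serve as the fitness that drives selection among evolved expression trees, typically alongside some limit on equation complexity.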

research
04/29/2018

Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming

Autonomously training interpretable control strategies, called policies,...
research
07/20/2020

Interpretable Control by Reinforcement Learning

In this paper, three recently introduced reinforcement learning (RL) met...
research
08/30/2021

Trustworthy AI for Process Automation on a Chylla-Haase Polymerization Reactor

In this paper, genetic programming reinforcement learning (GPRL) is util...
research
04/06/2018

Programmatically Interpretable Reinforcement Learning

We study the problem of generating interpretable and verifiable policies...
research
05/20/2017

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

The Particle Swarm Optimization Policy (PSO-P) has been recently introdu...
research
08/26/2022

Symbolic Explanation of Affinity-Based Reinforcement Learning Agents with Markov Models

The proliferation of artificial intelligence is increasingly dependent o...
research
12/30/2022

Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

Learning efficient and interpretable policies has been a challenging tas...
