Optimal Decision Tree Policies for Markov Decision Processes

01/30/2023
by   Daniël Vos, et al.

Interpretability of reinforcement learning policies is essential for many real-world tasks, but learning such interpretable policies is a hard problem. Rule-based policies such as decision trees and rule lists are particularly difficult to optimize because they are non-differentiable. While existing techniques can learn verifiable decision tree policies, there is no guarantee that the learners generate a decision tree that performs optimally. In this work, we study the optimization of size-limited decision trees for Markov Decision Processes (MDPs) and propose OMDTs: Optimal MDP Decision Trees. Given a user-defined size limit and MDP formulation, OMDT directly maximizes the expected discounted return for the decision tree using Mixed-Integer Linear Programming. By training optimal decision tree policies for different MDPs, we empirically study the optimality gap for existing imitation learning techniques and find that they perform sub-optimally. We show that this is due to an inherent shortcoming of imitation learning, namely that complex policies cannot be represented using size-limited trees. In such cases, it is better to directly optimize the tree for expected return. While there is generally a trade-off between the performance and interpretability of machine learning models, we find that OMDTs limited to a depth of 3 often perform close to the optimal limit.
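
To make the idea of optimizing a size-limited tree directly for expected discounted return concrete, here is a minimal sketch in Python. It is not the Mixed-Integer Linear Programming formulation of the paper: it swaps in brute-force enumeration of every depth-1 tree on a tiny, made-up chain MDP, and compares the best tree with the best unrestricted deterministic policy. The MDP, the discount factor, the slip probability, and all function names are assumptions chosen for illustration only.

```python
from itertools import product

GAMMA = 0.9            # discount factor (assumed for this toy example)
N_STATES = 5           # each state is described by one integer feature x in 0..4
ACTIONS = (0, 1)       # 0 = move left, 1 = move right
SLIP = 0.1             # probability that a move goes the opposite way (assumed)

def transitions(x, a):
    """Return [(probability, next_state, reward)] for a hypothetical chain MDP."""
    def move(direction):
        nxt = min(max(x + direction, 0), N_STATES - 1)
        # Reward 1 for landing on (or staying in) the right-most state.
        return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)
    intended = 1 if a == 1 else -1
    return [(1 - SLIP, *move(intended)), (SLIP, *move(-intended))]

def evaluate(policy, sweeps=500):
    """Iterative policy evaluation; returns the mean value over a uniform start state."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        v = [sum(p * (r + GAMMA * v[s2]) for p, s2, r in transitions(s, policy[s]))
             for s in range(N_STATES)]
    return sum(v) / N_STATES

def tree_policy(threshold, left_action, right_action):
    """Depth-1 tree: 'if x <= threshold then left_action else right_action'."""
    return [left_action if x <= threshold else right_action for x in range(N_STATES)]

def best_depth1_tree():
    """Exhaustively search all depth-1 trees for the highest expected return."""
    candidates = product(range(N_STATES - 1), ACTIONS, ACTIONS)
    return max((evaluate(tree_policy(t, a, b)), (t, a, b)) for t, a, b in candidates)

def best_unrestricted_policy():
    """Exhaustively search all deterministic policies (2^5 here) as a reference."""
    return max((evaluate(list(p)), p) for p in product(ACTIONS, repeat=N_STATES))

tree_return, tree = best_depth1_tree()
optimal_return, _ = best_unrestricted_policy()
print(f"best depth-1 tree {tree}: {tree_return:.3f}; "
      f"unrestricted optimum: {optimal_return:.3f}")
```

Enumeration is only feasible because this example has five states and two actions. The difference between the two printed returns is the optimality gap of the size-limited tree on this toy problem; every depth-1 tree is itself a deterministic policy, so the tree's return can never exceed the unrestricted one.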


research
02/25/2021

Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods

Current work in explainable reinforcement learning generally produces po...
research
06/25/2019

SOS: Safe, Optimal and Small Strategies for Hybrid Markov Decision Processes

For hybrid Markov decision processes, UPPAAL Stratego can compute strate...
research
10/21/2021

Interpretable Machine Learning for Resource Allocation with Application to Ventilator Triage

Rationing of healthcare resources is a challenging decision that policy ...
research
06/19/2019

Strategy Representation by Decision Trees with Linear Classifiers

Graph games and Markov decision processes (MDPs) are standard models in ...
research
05/04/2019

Optimal Resampling for Learning Small Models

Models often need to be constrained to a certain size for them to be con...
research
01/05/2016

Optimally Pruning Decision Tree Ensembles With Feature Cost

We consider the problem of learning decision rules for prediction with f...
research
08/29/2023

Probabilistic Dataset Reconstruction from Interpretable Models

Interpretability is often pointed out as a key requirement for trustwort...
