Interpretable Random Forests via Rule Extraction

04/29/2020
by Clément Bénard, et al.

We introduce SIRUS (Stable and Interpretable RUle Set) for regression, a stable rule learning algorithm that takes the form of a short and simple list of rules. State-of-the-art learning algorithms are often referred to as "black boxes" because of the high number of operations involved in their prediction process. Although these algorithms predict accurately, their lack of interpretability may be highly restrictive for applications with critical decisions at stake. On the other hand, algorithms with a simple structure (typically decision trees, rule algorithms, or sparse linear models) are well known for their instability. This undesirable feature makes the conclusions of the data analysis unreliable and turns out to be a strong operational limitation. This motivates the design of SIRUS, which combines a simple structure with remarkably stable behavior when data is perturbed. The algorithm is based on random forests, whose predictive accuracy is preserved. We demonstrate the efficiency of the method both empirically (through experiments) and theoretically (with a proof of its asymptotic stability). Our R/C++ software implementation sirus is available from CRAN.
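
To make the two properties named above concrete, the following is a minimal Python sketch of a short rule list used for regression, together with a way to quantify how much two rule lists overlap. It is an illustration only and not the sirus implementation. The rule form (a single threshold on one feature), the plain averaging of rule outputs, the overlap measure (a Dice-Sorensen index over rule conditions), and all names (`Rule`, `predict_rule_list`, `rule_set_overlap`) are assumptions introduced here and are not stated in the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: 'if x[feature] < threshold then a else b'."""
    feature: int
    threshold: float
    value_if_below: float
    value_otherwise: float

    def predict(self, x: Sequence[float]) -> float:
        # Return the output attached to the side of the split that x falls on.
        if x[self.feature] < self.threshold:
            return self.value_if_below
        return self.value_otherwise


def predict_rule_list(rules: Sequence[Rule], x: Sequence[float]) -> float:
    """Aggregate a rule list by plain averaging (an assumed aggregation)."""
    if not rules:
        raise ValueError("the rule list is empty")
    return sum(rule.predict(x) for rule in rules) / len(rules)


def rule_set_overlap(rules_a: Sequence[Rule], rules_b: Sequence[Rule]) -> float:
    """Dice-Sorensen index between the conditions of two rule lists.

    Two rules count as identical when they split the same feature at the
    same threshold; their output values are ignored. A value of 1.0 means
    both lists contain exactly the same conditions.
    """
    keys_a = {(r.feature, r.threshold) for r in rules_a}
    keys_b = {(r.feature, r.threshold) for r in rules_b}
    if not keys_a and not keys_b:
        return 1.0
    return 2 * len(keys_a & keys_b) / (len(keys_a) + len(keys_b))
```

Under these assumptions, a prediction can be read directly off the rules that fire for a given input. The overlap between rule lists fitted on two perturbed samples gives one simple way to express stability as a number between 0 and 1.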

research
08/19/2019

SIRUS: making random forests interpretable

State-of-the-art learning algorithms, such as random forests or neural n...
research
03/04/2021

Learning Accurate and Interpretable Decision Rule Sets from Neural Networks

This paper proposes a new paradigm for learning a set of independent log...
research
08/11/2022

RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data

Background: Understanding the relationship between the Omics and the phe...
research
05/12/2014

Consistency of random forests

Random forests are a learning algorithm proposed by Breiman [Mach. Learn...
research
02/08/2022

Is interpolation benign for random forests?

Statistical wisdom suggests that very complex models, interpolating trai...
research
10/23/2020

An Analysis of LIME for Text Data

Text data are increasingly handled in an automated fashion by machine le...
research
10/16/2018

Refining interaction search through signed iterative Random Forests

Advances in supervised learning have enabled accurate prediction in biol...
