GAM(L)A: An econometric model for interpretable Machine Learning

03/17/2022
by Emmanuel Flachaire, et al.

Despite their high predictive performance, random forest and gradient boosting are often regarded as black boxes, or uninterpretable models, which has raised concerns among practitioners and regulators. As an alternative, we propose in this paper to use partially linear models, which are inherently interpretable. Specifically, this article introduces GAM-lasso (GAMLA) and GAM-autometrics (GAMA), denoted GAM(L)A for short. GAM(L)A combines parametric and non-parametric functions, to capture accurately the linear and non-linear relationships between the dependent and explanatory variables, with a variable selection procedure that controls overfitting. Estimation relies on a two-step procedure that builds on the double residual method. We illustrate the predictive performance and interpretability of GAM(L)A on a regression problem and a classification problem. The results show that GAM(L)A outperforms parametric models augmented by quadratic, cubic and interaction effects. Moreover, the results suggest that the performance of GAM(L)A is not significantly different from that of random forest and gradient boosting.
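To make the double residual idea concrete, here is a minimal sketch in Python of the classical estimator for a partially linear model y = Xβ + g(z) + ε. It is not the GAM(L)A procedure itself: the function names `kernel_smooth` and `double_residual_beta`, the Gaussian Nadaraya-Watson smoother, the bandwidth, and the simulated data are all assumptions chosen for illustration, and the lasso or autometrics selection step is left out.

```python
import numpy as np


def kernel_smooth(z, target, bandwidth=0.2):
    """Nadaraya-Watson estimate of E[target | z] at the sample points.

    Hypothetical helper: a Gaussian-kernel smoother stands in for
    whatever non-parametric fit one prefers.
    """
    diffs = (z[:, None] - z[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights @ target


def double_residual_beta(y, X, z, bandwidth=0.2):
    """Estimate beta in y = X @ beta + g(z) + noise.

    Step 1: remove the part of y and of each column of X explained by z.
    Step 2: run ordinary least squares on the two sets of residuals.
    """
    y_res = y - kernel_smooth(z, y, bandwidth)
    X_res = X - kernel_smooth(z, X, bandwidth)
    beta, *_ = np.linalg.lstsq(X_res, y_res, rcond=None)
    return beta


# Simulated data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 500
z = rng.uniform(-2.0, 2.0, n)
X = np.column_stack([z + rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([1.5, -0.7])
y = X @ true_beta + np.sin(2.0 * z) + 0.1 * rng.normal(size=n)

beta_hat = double_residual_beta(y, X, z)
print("estimated beta:", beta_hat)
```

Once the linear coefficients are estimated, the non-linear component can be recovered by smoothing `y - X @ beta_hat` against `z`, which is what keeps both parts of the model readable separately.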


research
07/28/2020

Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms

Supervised Machine Learning (SML) algorithms, such as Gradient Boosting,...
research
05/11/2020

Interpretable random forest models through forward variable selection

Random forest is a popular prediction approach for handling high dimensi...
research
09/29/2020

Selective Cascade of Residual ExtraTrees

We propose a novel tree-based ensemble method named Selective Cascade of...
research
05/04/2023

Using interpretable boosting algorithms for modeling environmental and agricultural data

We describe how interpretable boosting algorithms based on ridge-regular...
research
06/02/2018

Locally Interpretable Models and Effects based on Supervised Partitioning (LIME-SUP)

Supervised Machine Learning (SML) algorithms such as Gradient Boosting, ...
research
01/29/2017

Random Forest regression for manifold-valued responses

An increasing array of biomedical and computer vision applications requi...
research
09/12/2021

Automatic Componentwise Boosting: An Interpretable AutoML System

In practice, machine learning (ML) workflows require various different s...
