How Much Can We See? A Note on Quantifying Explainability of Machine Learning Models

by Gero Szepannek et al.

One of the most popular approaches to understanding feature effects of modern black box machine learning models is the partial dependence plot (PDP). These plots are easy to understand but only able to visualize low order dependencies. The paper is about the question 'How much can we see?': a framework is developed to quantify the explainability of arbitrary machine learning models, i.e. up to what degree the visualization as given by a PDP is able to explain the predictions of the model. The result allows for a judgement whether an attempt to explain a black box model is sufficient or not.





1 Introduction

In the recent past a considerable number of automatic machine learning frameworks such as H2O, auto-sklearn (Feurer et al., 2015) or mlr3 (Bischl et al., 2016) have been developed and made publicly available, thus simplifying the creation of complex machine learning models. On the other hand, advances in hardware technology allow these models to become more and more complex, with huge numbers of parameters, as in deep learning models (cf. e.g. LeCun et al., 2015). Properly parameterized, modern ML algorithms are often of superior predictive accuracy.

The popularity of modern ML algorithms is based on the fact that they are very flexible with regard to the detection of complex nonlinear, high dimensional, multivariate dependencies without the need for an explicit specification of the type of functional relationship of the dependency. As a consequence the resulting models are often said to be of black box nature, which has led to an increasing need for tools for their interpretation.

Depending on the context, there are different requirements for explainability (cf. e.g. Biecek, 2018; Szepannek and Aschenbruck, 2019), given by different targets of explanation such as explanations of predictions for individual observations (Ribeiro et al., 2016; Štrumbelj and Kononenko, 2014; Lundberg and Lee, 2017; Staniak and Biecek, 2018), the importance of features (Breiman, 2001; Casalicchio et al., 2018) and feature effects (Friedman, 2001; Apley, 2016; Goldstein et al., 2015).

This paper concentrates on the latter: feature effects investigate the dependency of the predictions of a model on one (or several) predictors. Molnar et al. (2019) work out that superior performance comes along with the ability to model nonlinear high order dependencies which are naturally hard to understand for humans. As a remedy, criteria are developed in order to quantify the interpretability of a model and in consequence to allow for multi-objective optimization of the model selection process with respect to both predictive performance and interpretability.

The approach in this paper is somewhat different: starting with any model (which is often the best one in terms of predictive accuracy), one of the most popular approaches to understanding feature effects are partial dependence plots (PDP), which are introduced in Section 2. Partial dependence plots are easy to understand but only able to visualize low order dependencies. The question that is asked in this paper is "How much can we see?": in Section 3 a framework is developed to quantify the explainability of a model, i.e. up to what degree the visualization as given by a PDP is able to explain a model. This allows us to judge whether an attempt to explain the predictions of a model is sufficient or not. In Section 4 the approach is demonstrated on two examples using both artificial as well as real-world data, and finally a summary and an outlook are given in Section 5.

2 Partial Dependence

Partial dependence plots (PDP; Friedman, 2001) are a model-agnostic approach to understanding feature effects and are applicable to arbitrary models, here denoted by $\hat{f}$. The vector of predictor variables $x$ is further subdivided into two subsets: $x_S$ and its complement $x_C$. The partial dependence function is given by

$$f_S(x_S) = E_{X_C}\left[\hat{f}(x_S, X_C)\right],$$

i.e. it computes the average prediction given that the variable subset takes the values $x_S$. In practice, the partial dependence curve is estimated by

$$\hat{f}_S(x_S) = \frac{1}{n} \sum_{i=1}^{n} \hat{f}\left(x_S, x_C^{(i)}\right).$$

Note that for $S$ containing all predictors the partial dependence function corresponds to $\hat{f}$, and in the extreme, for the empty variable subset $S = \emptyset$, i.e. $x_C = x$, this will end up in

$$f_\emptyset = E_X\left[\hat{f}(X)\right],$$

which is independent of $x$ and corresponds to the constant average prediction of the model, estimated by

$$\hat{f}_\emptyset = \frac{1}{n} \sum_{i=1}^{n} \hat{f}\left(x^{(i)}\right).$$


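The empirical estimator above can be sketched in a few lines of model-agnostic Python. This is a minimal illustration, not code from the paper; the helper name, the toy interaction model and the simulated data are all assumptions made for the sketch:

```python
import numpy as np

def partial_dependence(model, X, S, grid_point):
    """Empirical partial dependence of `model` at `grid_point` for the
    feature subset S: fix the columns in S to the grid point, keep the
    complement columns at their observed values, and average the predictions."""
    X_mod = X.copy()
    X_mod[:, S] = grid_point          # fix x_S, keep x_C at observed values
    return model(X_mod).mean()        # (1/n) * sum_i f_hat(x_S, x_C^(i))

# toy "black box": an interaction model f(x) = x1 * x2
model = lambda X: X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(1000, 2))

# PDP of x1 at x1 = 0.5: for f(x) = x1 * x2 this is 0.5 * E[x2], roughly 0.25
pd_val = partial_dependence(model, X, S=[0], grid_point=0.5)
```

For this toy model the empirical value agrees with the analytic partial dependence $0.5 \cdot E[x_2]$, which illustrates why the estimator averages over the observed values of the complement variables rather than over a product grid.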
3 Explainability

Figure 1: PDP for a single variable (left) and match of the partial dependence function and the predicted values (right).

In the rest of the paper a measure is defined in order to quantify up to what degree this visualization as given by a PDP is able to explain a model. As an introductory example consider simulated data of two independent random variables $x_1$ and $x_2$ and a dependent variable $y$ according to the linear data generating process

$$y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

with a standard normally distributed error term $\varepsilon$ (note that the error term could also be omitted here), such that $y$ depends linearly on $x_1$ and $x_2$. Afterwards a default random forest model (using both variables $x_1$ and $x_2$ and the R package randomForest; Liaw and Wiener, 2002) is computed. Figure 1 (left) shows the corresponding partial dependence plot for one of the variables together with the predictions for all observations. It can be recognized that – of course – the PDP does not exactly match the predictions. In Figure 1 (right) the x-axis is changed: here, the predictions of the model (x-axis) are plotted against their corresponding values of the partial dependence function (y-axis). The better the PDP represents the model, the closer the points should lie to the diagonal.

Figure 2: Match of partial dependence function and predicted values for the first example (left). The plot on the right illustrates that for a 2D-PDP using all input variables a perfect match is obtained.

A first step towards defining explainability consists in answering the question: how close is what I see to the true predictions of the model? For this reason, a starting point for further analysis is given by computing the differences between the partial dependence function and the model's predictions. A natural approach to quantifying these differences is given by computing the expected squared difference:

$$ASE = E\left[\left(\hat{f}(X) - f_S(X_S)\right)^2\right],$$

which can be empirically estimated by:

$$\widehat{ASE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{f}\left(x^{(i)}\right) - \hat{f}_S\left(x_S^{(i)}\right)\right)^2.$$

Remarkably, the ASE does not calculate the error between the model's predictions and the observations but between the partial dependence function and the model's predictions here. Further, in order to benchmark the ASE of a partial dependence function, it can be compared to the $ASE_0$ of the naive constant average prediction $\hat{f}_\emptyset$:

$$ASE_0 = E\left[\left(\hat{f}(X) - f_\emptyset\right)^2\right]$$

and its empirical estimate:

$$\widehat{ASE}_0 = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{f}\left(x^{(i)}\right) - \hat{f}_\emptyset\right)^2.$$

Finally one can relate both $ASE$ and $ASE_0$ and define the explainability of any black box model by a partial dependence function as the ratio

$$1 - \frac{ASE}{ASE_0},$$

similar to the common $R^2$ goodness-of-fit statistic. An explainability close to 1 means that a model is well represented by a PDP; the smaller it is, the less of the model's predictions is explained.
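Given the model's predictions and the corresponding partial dependence values at the observations, the measure reduces to a few lines of code. A minimal sketch (the function name is illustrative, not from the paper):

```python
import numpy as np

def explainability(preds, pd_values):
    """1 - ASE/ASE0: ASE compares the PDP values to the model's predictions
    (not to the observed targets); ASE0 benchmarks against the constant
    average prediction."""
    preds = np.asarray(preds, dtype=float)
    pd_values = np.asarray(pd_values, dtype=float)
    ase = np.mean((preds - pd_values) ** 2)
    ase0 = np.mean((preds - preds.mean()) ** 2)
    return 1.0 - ase / ase0

# a PDP that reproduces the predictions exactly explains the model perfectly
preds = np.array([1.0, 2.0, 3.0, 4.0])
assert explainability(preds, preds) == 1.0

# the naive constant average prediction has explainability 0
assert explainability(preds, np.full(4, preds.mean())) == 0.0
```

The two boundary cases mirror the definition: a perfect match gives $ASE = 0$ and hence explainability 1, while the constant average prediction gives $ASE = ASE_0$ and hence explainability 0.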

4 Examples

Starting again with the introductory example from the previous section: from the data generation, the choice of the coefficients results in a higher variation of the predictions with regard to one of the two variables. Accordingly, it can be expected that the partial dependence function of this variable is closer to the model's predictions than that of the other one (cf. Figure 2, left) and thus has a higher explainability. Computing both explainabilities confirms this. For a two-dimensional PDP the partial dependence function corresponds to the true predictions, resulting in an explainability of 1, i.e. the model is perfectly explained by the partial dependence curve (Figure 2, right).
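The behaviour of this example can be reproduced numerically. The sketch below replaces the fitted random forest by the linear function itself as the "model" (the ASE framework applies to any prediction function); the coefficients 1 and 4 and all names are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.uniform(0, 1, size=(n, 2))

# stand-in "black box": a linear function with a larger weight on x2
model = lambda Z: Z[:, 0] + 4.0 * Z[:, 1]
preds = model(X)

def pd_at_observations(model, X, S):
    """Evaluate the empirical partial dependence function for subset S
    at every observation's own x_S values."""
    out = np.empty(len(X))
    for j in range(len(X)):
        X_mod = X.copy()
        X_mod[:, S] = X[j, S]         # fix x_S, average over observed x_C
        out[j] = model(X_mod).mean()
    return out

def explainability(preds, pd_values):
    ase = np.mean((preds - pd_values) ** 2)      # PDP vs predictions
    ase0 = np.mean((preds - preds.mean()) ** 2)  # constant benchmark
    return 1.0 - ase / ase0

# roughly 0.06 and 0.94 here (up to sampling noise): the variable with the
# larger coefficient dominates the variation of the predictions
expl_x1 = explainability(preds, pd_at_observations(model, X, [0]))
expl_x2 = explainability(preds, pd_at_observations(model, X, [1]))

# the 2D-PDP over both inputs reproduces the predictions: explainability 1
expl_2d = explainability(preds, pd_at_observations(model, X, [0, 1]))
```

The design choice of evaluating the partial dependence function at the observations themselves is what makes the comparison in Figure 1 (right) possible: both axes then contain one value per observation.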

Figure 3: Most explainable PDP for a random forest model on the Boston housing data (left) as well as the match of predictions and PDP (right).
Variable   Explainability
lstat         0.512
rm            0.410
lon           0.085
nox           0.056
ptratio       0.056
indus         0.046
tax           0.030
crim          0.025
age           0.018
b             0.012
chas          0.004
zn            0.002
lat           0.001
rad          -0.002
dis          -0.004
Table 1: Explainability of 1D PDPs for a random forest model of the Boston housing data based on different variables.

As a real-world example the popular Boston housing data set (Dua and Graff, 2017) is used, which has also been used by other authors (cf. e.g. Greenwell, 2017) in order to illustrate partial dependence plots. Again, a default random forest model has been built as in the example before. Figure 3 (left) shows the PDP for the variable LSTAT. The corresponding explainability of 0.512 identifies this PDP as the most useful one (cf. Table 1).

Nonetheless, from the explainabilities of all single variables' partial dependence functions it is also obvious that considering single PDPs alone is not sufficient to understand the behaviour of the model in this case.

Taking a closer look at the partial dependence function vs. the predicted values on the data set (Figure 3, left) shows further that, e.g., for large values of the variable LSTAT the partial dependence function appears to systematically overestimate the predictions for this example. This means that large values of LSTAT tend to fall together with values in (interact with) other variables which decrease the predictions relative to the distribution of the data within these variables (cf. also Apley, 2016).

Investigation of the explainabilities may reveal that the true predictions still differ a lot from what we can see in a partial dependence plot. Comparison of Figure 3 (right) and Figure 4 (right) illustrates that the two-dimensional PDP of the two most explainable variables LSTAT and RM (Figure 4, left) is much more explainable here. Note that although it is in principle possible to compute partial dependence for variable subsets of any dimension, its visualization is restricted to at most two dimensions, which is no limit for the proposed visualization of the match between the partial dependence function and the predictions as the one in Figure 4 (right).

Figure 4: Two dimensional PDP for the variables LSTAT and RM of a random forest on the Boston housing data (left) and the corresponding match of PDP and predictions (right).

5 Summary

Partial dependence plots, as one of the most common tools to explain feature effects of black box machine learning models, are investigated with regard to the extent to which they are able to explain a model's predictions.

Using the differences between the predictions of the model and their corresponding values of a partial dependence function, a framework has been developed to quantify how well a PDP is able to explain the underlying model. The result, in terms of the measure of explainability, allows one to assess whether the explanation of a black box model is sufficient or not.

Two simple examples have been presented in order to illustrate the concept of explainability. It can be seen that looking at PDPs is not necessarily sufficient to understand a model's behaviour. As an open issue it has to be noted that although PDP visualizations are restricted to at most two dimensions, the models in general use more than two variables. Of course, analysts are able to look at several PDPs at the same time, but to our knowledge no literature is available on how far humans are able to combine the information of more than two PDPs in order to get a clearer picture of a model's behaviour, which could be a topic of future research.


  • D. Apley (2016) Visualizing the effects of predictor variables in black box supervised learning models. External Links: Link Cited by: §1, §4.
  • P. Biecek (2018) DALEX: explainers for complex predictive models in R. Journal of Machine Learning Research 19 (84), pp. 1–5. Cited by: §1.
  • B. Bischl, M. Lang, L. Kotthoff, J. Schiffner, J. Richter, E. Studerus, G. Casalicchio, and Z. M. Jones (2016) mlr: machine learning in R. Journal of Machine Learning Research 17 (170), pp. 1–5. Cited by: §1.
  • L. Breiman (2001) Random forests. Machine Learning 45 (1), pp. 5–32. Cited by: §1.
  • G. Casalicchio, C. Molnar, and B. Bischl (2018) Visualizing the feature importance for black box models. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 655–670. Cited by: §1.
  • D. Dua and C. Graff (2017) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. External Links: Link Cited by: §4.
  • M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, and F. Hutter (2015) Efficient and robust automated machine learning. In Advances in Neural Information Processing Systems 28, C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Eds.), pp. 2962–2970. Cited by: §1.
  • J. Friedman (2001) Greedy function approximation: a gradient boosting machine. Annals of Statistics 29, pp. 1189–1232. Cited by: §1, §2.
  • A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin (2015) Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 24 (1), pp. 44–65. External Links: Document Cited by: §1.
  • B. Greenwell (2017) pdp: An R Package for Constructing Partial Dependence Plots. The R Journal 9 (1), pp. 421–436. External Links: Document Cited by: §4.
  • Y. LeCun, Y. Bengio, and G. Hinton (2015) Deep learning. Nature 521, pp. 436–444. Cited by: §1.
  • A. Liaw and M. Wiener (2002) Classification and regression by randomForest. R News 2 (3), pp. 18–22. Cited by: §3.
  • S. M. Lundberg and S. Lee (2017) A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pp. 4765–4774. Cited by: §1.
  • C. Molnar, G. Casalicchio, and B. Bischl (2019) Quantifiying interpretability of arbitrary machine learning models through functional decomposition. External Links: Link Cited by: §1.
  • M. T. Ribeiro, S. Singh, and C. Guestrin (2016) "Why should I trust you?": explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144. External Links: Document Cited by: §1.
  • M. Staniak and P. Biecek (2018) Explanations of Model Predictions with live and breakDown Packages. The R Journal 10 (2), pp. 395–409. External Links: Document, Link Cited by: §1.
  • E. Štrumbelj and I. Kononenko (2014) Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems 41 (3), pp. 647–665. External Links: Document Cited by: §1.
  • G. Szepannek and R. Aschenbruck (2019) Predicting ebay prices: selecting and interpreting machine learning models – results of the AG DANK 2018 data science competition. Archives of Data Science A, submitted. Cited by: §1.