Explaining black box decisions by Shapley cohort refinement

11/01/2019, by Masayoshi Mase, et al.

We introduce a variable importance measure to explain the importance of individual variables to a decision made by a black box function. Our measure is based on the Shapley value from cooperative game theory. Measures of variable importance usually work by changing the value of one or more variables with the others held fixed and then recomputing the function of interest. That approach is problematic because it can create very unrealistic combinations of predictors that never appear in practice or that were never present when the prediction function was being created. Our cohort refinement Shapley approach measures variable importance without using any data points that were not actually observed.







1 Introduction

Black box prediction models used in statistics, machine learning and artificial intelligence have been able to make increasingly accurate predictions, but it remains hard to understand those predictions. See for example,

Štrumbelj and Kononenko (2010, 2014), Ribeiro et al. (2016), Sundararajan and Najmi (2019) and the book of Molnar (2018).

Part of understanding predictions is understanding which variables are important. A variable could be important because changing it makes a causal difference, because changing it makes a large change to our predictions, or because leaving it out of a model reduces that model’s prediction accuracy (Jiang and Owen, 2003). Importance by one of these criteria need not imply importance by another, though additional assumptions may allow a causal implication to be made from one of the other measures (Pearl, 2009; Zhao and Hastie, 2019). We could be interested in variables that are important overall or in variables that explain one single prediction, such as why a given person was or was not approved for a loan, or why a given patient was or was not placed in an intensive care unit. We use the term impact for the quantitative change in a prediction that can be attributed to a variable. This impact can be positive or negative. Importance is then about the absolute value of the impact being large or relatively large.

In this paper, we measure the impact of individual predictor variables used by a model, in order to explain why a given prediction was made. Because we are explaining a given prediction from a given model, we do not address whether that prediction had a sound causal basis. Sound or unsound, we want to understand why it occurred, and that understanding might even lead us to conclude that a model is unsound. We also do not consider what the effect of retraining with a different set of predictors would have been, because those differently trained models were not the ones that made the decision.

To fix ideas, we suppose that the decision for a target subject t was based on a vector x_t of d different predictor variables after training on a data set from n subjects. We will speak of predictors rather than features because features can be constructed as transformations of one or more predictors and our main interest is providing an explanation in terms of the originally measured quantities. While it may be reasonable to make separate attributions for, say, a predictor and a transformation of it, we leave that for later work. The predictors may be real or categorical among other possibilities. The prediction for subject t is f(x_t) for a function f of a potentially quite complicated form. In the case of a loan, f(x) might be a binary variable with f(x) = 1 indicating that the loan should be made to a subject with predictor vector x, and f(x) = 0 otherwise. Or it could be an estimate of the probability that the loan will be repaid, or an estimate of expected return to the lender for making this loan, taking account of administrative costs, default possibilities, the outlook for interest rates, and so on.

Even with the entire function f at our disposal in software, it can still be a challenge to quantify a variable’s impact. There may be numerous combinations of counterfactual predictors that could have changed the prediction. The problem of computing the importance of inputs to a function comes up frequently in global sensitivity analysis (Saltelli et al., 2008). There, pick-freeze methods that change some but not all components of x and track how f(x) changes are the norm (Sobol’, 1993; Gamboa et al., 2016). One usually assumes that the input variables are statistically independent of each other, and even then the problem is challenging. Black box prediction functions are usually fit to predictors related by complicated dependence patterns, and then predictor independence is extremely unrealistic. Changing some predictors independently of others can lead to predictor combinations far from anything that has been seen in the training data (e.g., a home with many rooms but few square feet) or even impossible combinations (e.g., birth date after graduation date). Those cases are not ones where we can expect the fitted model to be valuable, causing us to doubt that they belong in the explanation.

Our approach does not use any variable combinations that never arose in the sample. Instead, for each predictor, every subject in the data set is either similar to the target subject or not similar. Ways to define similarity are discussed below. Given d predictors, there are 2^d different sets of predictors on which subjects can be similar to the target. We form 2^d different cohorts of subjects, each consisting of subjects similar to the target on a subset of predictors, without regard to whether they are also similar on any of the other predictors. At one extreme is the set of all predictors, and a cohort that is similar to the target in every way. At the other extreme, the empty predictor set yields the set of all subjects.

We can refine the grand cohort of all subjects towards the target subject by removing subjects that mismatch the target on one or more predictors. The predictors that change the cohort mean the most when we restrict to similar subjects are the ones that we take to be the most important in explaining why the target subject’s prediction is different from that of the other subjects.

We will define the impact of a variable through the Shapley value. Shapley value has been used in model explanation for machine learning (Štrumbelj and Kononenko, 2010, 2014; Lundberg and Lee, 2017; Sundararajan and Najmi, 2019) and for computer experiments (Owen, 2014; Song et al., 2016; Owen and Prieur, 2017). See Sundararajan and Najmi (2019) for a survey. We will present Shapley value before defining our measures. We call this approach cohort refinement Shapley, or cohort Shapley (CS) for short.

The closest method to our proposal is baseline Shapley from Sundararajan and Najmi (2019). Baseline Shapley compares the prediction for a target subject with predictors x_t to the prediction for a baseline predictor vector x_b. There is not necessarily a subject whose predictor vector is x_b. We can change some of the baseline predictors, replacing them by the corresponding values from the target subject x_t, and record how f changes. Baseline Shapley can construct and use improbable or even impossible combinations of predictors, as the authors note, while cohort Shapley does not.

Sundararajan and Najmi (2019) mention a second problem with baseline Shapley. It arises when two predictors are highly correlated. Consider an extreme case where x_ij = x_ik for all subjects i and two different predictors j and k. The prediction function might use these two predictors equally, or it might make an arbitrary choice to use one and completely ignore the other, or the precise combination could be in between these extremes in some very complicated way. The importance of predictors j and k from baseline Shapley will then depend on those choices because they affect the value that f will take on a hypothetical point where x_j ≠ x_k holds. For CS, let us assume that for any subject i, and two equivalent predictors j and k, we will have x_ij similar to x_tj if and only if x_ik is similar to x_tk. In that case we will find that predictors j and k get equal Shapley values, even if the model ignores one of them.

This paper is organized as follows. Section 2 gives our notation, and reviews Shapley value, the functional ANOVA decomposition, and the anchored decomposition. Section 3 defines similarity and similarity-based cohorts, with a small example to illustrate those sets. Section 4 presents cohort Shapley importance measures. Section 5 describes a game theoretic way to aggregate impacts over a set of target subjects, such as all subjects in the data set. Section 6 shows cohort Shapley on some real data sets. Section 7 discusses strengths and weaknesses of cohort Shapley and also how it addresses a different goal than baseline Shapley does.

2 Notation and background

The predictor vector for subject i is x_i = (x_i1, …, x_id), where d is the number of predictors in the model. Each x_ij belongs to a set X_j which may consist of real or binary variables or some other types. There is a black box function f that is used to predict an outcome y for a subject with predictor vector x. We write y_i = f(x_i) and ȳ = (1/n) Σ_{i=1}^n y_i. There is a target subject t and we would like an explanation about which predictors are the most important determinants of y_t = f(x_t). We assume that t is in the set of n subjects with available data, although this subject might not have been used in training the model.

The set {1, 2, …, d} is denoted by 1:d. We will need to manipulate subsets of 1:d. For u ⊆ 1:d we let |u| be its cardinality. The complementary set 1:d \ u is denoted by −u, especially in subscripts. Sometimes a point must be created by combining parts of two other points. The point y = x_u:z_{−u} has y_j = x_j for j ∈ u and y_j = z_j for j ∉ u. Furthermore, we sometimes use j and −j in place of the more cumbersome {j} and −{j}. For instance, x_{−j}:z_j is what we get by replacing the j’th input to x by z_j.

2.1 Shapley value

Shapley value (Shapley, 1952) is used in game theory to define a fair allocation of rewards to a team that has cooperated to produce something of value. Suppose that a team of d people produce a value val(1:d), and that we have at our disposal the value val(u) that would have been produced by the team u, for all teams u ⊆ 1:d, including val(∅) = 0. Let φ_j be the reward for player j.

Shapley introduced quite reasonable criteria:

  1. Efficiency: Σ_{j=1}^d φ_j = val(1:d).

  2. Symmetry: If val(u + {i}) = val(u + {j}) for all u with i, j ∉ u, then φ_i = φ_j.

  3. Dummy: if val(u + {j}) = val(u) for all u ⊆ −{j}, then φ_j = 0.

  4. Additivity: if val and val′ lead to values φ_j and φ′_j, then the game producing val + val′ has values φ_j + φ′_j.

He found that the unique valuation that satisfies all four of these criteria is

  φ_j = (1/d) Σ_{u ⊆ −{j}} C(d−1, |u|)^{−1} ( val(u + {j}) − val(u) ),   (1)

where C(d−1, |u|) is a binomial coefficient.
Formula (1) is not very intuitive. Another way to explain Shapley value is as follows. We could build a team from ∅ to 1:d in d steps, adding one member at a time. There are d! different orders in which to add team members. The Shapley value φ_j is the increase in value coming from the addition of member j, averaged over those d! different orders.
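As a concrete illustration (ours, not from the paper), the defining formula can be evaluated by brute force over subsets; a minimal Python sketch with a hypothetical two-player game:

```python
from itertools import combinations
from math import comb

def shapley(val, d):
    # phi_j = (1/d) * sum over u not containing j of
    #         (val(u + {j}) - val(u)) / C(d-1, |u|)
    phi = []
    for j in range(d):
        others = [k for k in range(d) if k != j]
        total = 0.0
        for m in range(d):
            for u in combinations(others, m):
                u = frozenset(u)
                total += (val(u | {j}) - val(u)) / comb(d - 1, m)
        phi.append(total / d)
    return phi

# A hypothetical two-player game with val(empty) = 0.
game = {frozenset(): 0.0, frozenset({0}): 1.0,
        frozenset({1}): 3.0, frozenset({0, 1}): 6.0}
print(shapley(game.__getitem__, 2))  # [2.0, 4.0]
```

By efficiency, the two rewards sum to val({0, 1}) = 6.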

2.2 Function decompositions

Function decompositions, also called high dimensional model representations (HDMR), write a function of d inputs as a sum of 2^d functions, each of which depends only on one of the 2^d subsets of inputs. Because f and x have other uses in this paper we present the decomposition for a function g of z = (z_1, …, z_d). In these decompositions we write

  g(z) = Σ_{u ⊆ 1:d} g_u(z),

where g_u depends on z only through z_u. Many such decompositions are possible (Kuo et al., 2010).

The best known decomposition is the analysis of variance (ANOVA) decomposition. It applies to random z with independent components z_j. If E(g(z)^2) < ∞, then we write

  g_u(z) = E( g(z) − Σ_{v ⊊ u} g_v(z) | z_u )

for non-empty u, with g_∅(z) = E(g(z)). The effects are mutually orthogonal in that for subsets u ≠ v, we have E( g_u(z) g_v(z) ) = 0. Letting σ²_u = Var( g_u(z) ), it follows from orthogonality that

  Var( g(z) ) = Σ_{u ≠ ∅} σ²_u.

We can recover effects from conditional expectations, via inclusion-exclusion,

  g_u(z) = Σ_{v ⊆ u} (−1)^{|u−v|} E( g(z) | z_v ).   (2)
See Owen (2013) for history and derivations of this functional ANOVA.

We will need the anchored decomposition, which goes back at least to Sobol’ (1969). It is also called cut-HDMR (Aliş and Rabitz, 2001) in chemistry, and finite differences-HDMR in global sensitivity analysis (Sobol’, 2003). We begin by picking a reference point c = (c_1, …, c_d) called the anchor, with c_j ∈ X_j for j = 1, …, d. The anchored decomposition is

  g_u(z) = g( z_u : c_{−u} ) − Σ_{v ⊊ u} g_v(z).

We have replaced averaging over z_{−u} by plugging in the anchor value via c_{−u}. If j ∈ u and z_j = c_j, then g_u(z) = 0. We do not need independence of the z_j, or even randomness for them, and we do not need mean squares. What we need is that when g(z) is defined, so is g( z_v : c_{−v} ) for any v ⊆ 1:d.

The main effect in an anchored decomposition is g_{{j}}(z) = g( z_j : c_{−j} ) − g(c) and the two factor term for indices j ≠ k is

  g_{{j,k}}(z) = g( z_{{j,k}} : c_{−{j,k}} ) − g_{{j}}(z) − g_{{k}}(z) − g(c).

For instance if d = 3 and u = {1, 2}, then

  g_u(z) = g(z_1, z_2, c_3) − g(z_1, c_2, c_3) − g(c_1, z_2, c_3) + g(c_1, c_2, c_3).

The version of (2) for the anchored decomposition is

  g_u(z) = Σ_{v ⊆ u} (−1)^{|u−v|} g( z_v : c_{−v} ),

as shown by Kuo et al. (2010).

2.3 Shapley for function decompositions

To get a Shapley value for predictor variables, we must first define the value produced by a subset of them. The approach of Štrumbelj and Kononenko (2010) begins with a vector x of independent random predictors from some distribution. They used independent predictors uniformly distributed over finite discrete sets but they could as well be countable or continuous and non-uniform, so long as they are independent. For a target subject t, let f(x_t) be the prediction for that subject. They define the value of the predictor set u by

  val(u) = E( f( x_{t,u} : x_{−u} ) ) − E( f(x) ),

with expectations taken under that distribution. In words, val(u) is the expected change in our predictions at a random point x that comes from specifying that x_j = x_{tj} for j ∈ u, while leaving x_j random for j ∉ u.

In their formulation, the total value to be explained is

  val(1:d) = f(x_t) − E( f(x) ),

the extent to which f(x_t) differs from a hypothetical average prediction over independent predictors. The subset u explains val(u), and from that they derive Shapley value. They define quantities Δ_u via Δ_∅ = 0 and

  Δ_u = Σ_{v ⊆ u} (−1)^{|u−v|} val(v).

They prove that the Shapley value for predictor j is

  φ_j = Σ_{u: j ∈ u} Δ_u / |u|.

We give a proof of this in Section 4, different from theirs, making use of the anchored decomposition. While their val(u) is defined via expectations of independent random variables, their Shapley value comes via the anchored decomposition applied to those expectations.

A second approach to Shapley value for the ANOVA is to define the value of the set u to be the variance explained by those predictors, val(u) = Σ_{v ⊆ u, v ≠ ∅} σ²_v. With this definition, the Shapley value for z_j is

  φ_j = Σ_{u: j ∈ u} σ²_u / |u|.

See Owen (2014). For Shapley value based on variance explained by dependent inputs, see Song et al. (2016) and Owen and Prieur (2017).

3 Similarity-based cohorts

A cohort is a set of subjects. For the target subject t, we will define a suite of cohorts consisting of subjects similar to t in various ways. The subject t will be in all of those cohorts. First we describe similarity.

3.1 Similarity

For each predictor j, we define a target-specific similarity function s_{tj}(·) taking the values 0 and 1. If s_{tj}(x_{ij}) = 1, then subject i is considered to be similar to subject t as measured by predictor j. Otherwise s_{tj}(x_{ij}) = 0 means that subject i is dissimilar to subject t for predictor j. The simplest similarity is identity:

  s_{tj}(x_{ij}) = 1 if x_{ij} = x_{tj}, and 0 otherwise,

which is reasonable for binary predictors or those taking a small number of levels. For real-valued predictors, there may be no i ≠ t with x_{ij} = x_{tj} and then we might instead take

  s_{tj}(x_{ij}) = 1 if |x_{ij} − x_{tj}| ≤ δ_j, and 0 otherwise,

where subject matter experts have chosen δ_j ≥ 0. Taking δ_j = 0 recovers the identity measure of similarity. The two similarity measures above generate an equivalence relation on X_j, if δ_j does not depend on t. In general, we do not need similarity to be an equivalence. For instance, we do not need symmetry, and would not necessarily have that if we used relative distance to define similarity, via

  s_{tj}(x_{ij}) = 1 if |x_{ij} − x_{tj}| ≤ δ_j |x_{tj}|, and 0 otherwise.
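To make the three kinds of similarity rule concrete, here is a small Python sketch (our own illustration; the tolerance values and function names are hypothetical):

```python
def identity_sim(x_t, x_i):
    # exact match: reasonable for binary or few-level predictors
    return 1 if x_i == x_t else 0

def absolute_sim(delta):
    # similar when within a fixed tolerance delta chosen by domain experts
    return lambda x_t, x_i: 1 if abs(x_i - x_t) <= delta else 0

def relative_sim(delta):
    # tolerance scales with the target's value, so similarity need not be symmetric
    return lambda x_t, x_i: 1 if abs(x_i - x_t) <= delta * abs(x_t) else 0

sim = relative_sim(0.1)
print(sim(111.0, 100.0), sim(100.0, 111.0))  # 1 0: not symmetric in t and i
```

The last line shows how the relative rule fails symmetry: 100 is within 10% of the target 111, but 111 is not within 10% of the target 100.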

3.2 Cohorts of

We use similarity to define our 2^d cohorts of subjects. Let

  C_{t,u} = { i : s_{tj}(x_{ij}) = 1 for all j ∈ u },

with C_{t,∅} = {1, 2, …, n} by convention. Then C_{t,u} is the cohort of subjects that are similar to the target subject for all predictors j ∈ u but not necessarily similar for any predictors j ∉ u. These cohorts are never empty, because we always have t ∈ C_{t,u}. We write |C_{t,u}| for the cardinality of the cohort.

Tables 1 and 2 show the cohort structure for a toy dataset with n = 8 subjects, three predictors and a target subject t = 8. As the cardinality of u increases, the cohort focusses in on the target subject. The subjects listed there could be generalized to groups of subjects who either match or don’t match a target subject from group 8 for those three predictors. Then the cohorts would be unions of those groups. Even if one or more of those groups were empty, none of the cohorts would be empty, again due to subject t.

1 0 0 0
2 0 0 1
3 0 1 0
4 0 1 1
5 1 0 0
6 1 0 1
7 1 1 0
8 1 1 1
Table 1: A toy data set of n = 8 subjects. For each of d = 3 predictors, a 1 indicates that a subject is similar to the target subject on that predictor.
Set u      Cohort C_{t,u}
∅          {1, 2, 3, 4, 5, 6, 7, 8}
{1}        {5, 6, 7, 8}
{2}        {3, 4, 7, 8}
{3}        {2, 4, 6, 8}
{1, 2}     {7, 8}
{1, 3}     {6, 8}
{2, 3}     {4, 8}
{1, 2, 3}  {8}
Table 2: The cohorts corresponding to sets u of predictors shown in Table 1. To belong to the cohort for set u, a subject must be similar to the target subject for all predictors j ∈ u.
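The cohorts can be derived mechanically from the similarity indicators of Table 1; a short Python sketch:

```python
from itertools import combinations

# Similarity indicators from Table 1: subject -> (s_1, s_2, s_3) vs the target.
S = {1: (0, 0, 0), 2: (0, 0, 1), 3: (0, 1, 0), 4: (0, 1, 1),
     5: (1, 0, 0), 6: (1, 0, 1), 7: (1, 1, 0), 8: (1, 1, 1)}

def cohort(u):
    # subjects similar to the target on every predictor in u (1-based labels)
    return sorted(i for i, s in S.items() if all(s[j - 1] == 1 for j in u))

for m in range(4):
    for u in combinations((1, 2, 3), m):
        print(set(u) or '∅', cohort(u))
```

The loop prints each predictor set alongside its cohort, from the grand cohort of all eight subjects down to the singleton containing the target.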

Given a set u of cohorts, we define cohort averages

  ȳ_{t,u} = (1/|C_{t,u}|) Σ_{i ∈ C_{t,u}} y_i.

Then the value of set u is

  val(u) = ȳ_{t,u} − ȳ_{t,∅} = ȳ_{t,u} − ȳ,

where ȳ = (1/n) Σ_{i=1}^n y_i. The last equality follows because the cohort with u = ∅ is the whole data set. The total value to be explained is

  val(1:d) = ȳ_{t,1:d} − ȳ.

It may well happen that C_{t,1:d} is the singleton {t}. In that case the total value to be explained is y_t − ȳ. In this and other settings some of the ȳ_{t,u} may be the average of a very small number of subjects’ predictions, and potentially poorly determined. We return to this point in Section 7.
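Putting the pieces together, cohort Shapley for one target needs only the cohort means and the Shapley formula; a self-contained Python sketch on a tiny invented data set (four subjects, two predictors, the fourth subject is the target):

```python
from itertools import combinations
from math import comb

# Invented predictions y_i and a binary similarity matrix vs the target.
y = [10.0, 12.0, 14.0, 20.0]
S = [(1, 0), (0, 1), (0, 0), (1, 1)]   # subject 4 (index 3) is the target
d = 2
ybar = sum(y) / len(y)

def cohort_mean(u):
    members = [i for i in range(len(y)) if all(S[i][j] for j in u)]
    return sum(y[i] for i in members) / len(members)

def val(u):
    return cohort_mean(u) - ybar       # val(empty) = 0: C_empty is everyone

phi = []
for j in range(d):
    others = [k for k in range(d) if k != j]
    tot = 0.0
    for m in range(d):
        for u in combinations(others, m):
            tot += (val(set(u) | {j}) - val(set(u))) / comb(d - 1, m)
    phi.append(tot / d)

print(phi, sum(phi), val(set(range(d))))  # [2.5, 3.5] 6.0 6.0
```

The two per-predictor impacts sum to the total value ȳ_{t,1:d} − ȳ, as efficiency requires.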

4 Importance measures

For Shapley value, every variable is either ‘in or out’, and so binary variables underlie the approach. Here we compute Shapley values based on function decompositions of a function defined on {0,1}^d. The values of that function might themselves be expectations, like the cohort mean ȳ_{t,u} in cohort Shapley or the quantity val(u) in the approach of Štrumbelj and Kononenko (2010), but for our purposes here they are just numbers.

When the target point x_t changes, then the Shapley value changes too. Sundararajan and Najmi (2019) consider the effects of continuously varying the target point and describe some invariance and monotonicity properties. For any fixed target and baseline, baseline Shapley is defined in terms of the binary variables we consider here.

We use e_j to represent the binary vector of length d with a one in position j and zeroes elsewhere. This is the j’th standard basis vector. We then generalize it to e_u = Σ_{j ∈ u} e_j for u ⊆ 1:d. An arbitrary point in {0,1}^d is denoted by z.

Let g be a function on {0,1}^d. In our applications, the total value to be explained is g(e_{1:d}) − g(e_∅), with e_{1:d} corresponding to matching the target in all ways and e_∅ corresponding to no matches at all. The value contributed by u is val(u) = g(e_u) − g(e_∅).

4.1 Shapley value via anchored decomposition on {0,1}^d

Because we use the anchored decomposition for functions on {0,1}^d instead of the ANOVA, we do not need to define a distribution for z. The anchored decomposition on {0,1}^d with anchor e_∅ has a simple structure.

Lemma 1.

For integer d ≥ 1, let g on {0,1}^d have the anchored decomposition with anchor e_∅. Then

  g_u(z) = 1{ z_j = 1 for all j ∈ u } Δ_u,

where Δ_u = Σ_{v ⊆ u} (−1)^{|u−v|} g(e_v).


The inclusion-exclusion formula for the binary anchored decomposition is

  g_u(z) = Σ_{v ⊆ u} (−1)^{|u−v|} g( z_v : 0_{−v} ).

Suppose that z_j = 0 for some j ∈ u. Then, splitting up the alternating sum,

  g_u(z) = Σ_{v ⊆ u−{j}} (−1)^{|u−v|} ( g( z_v : 0_{−v} ) − g( z_{v+{j}} : 0_{−v−{j}} ) ) = 0,

because z_v : 0_{−v} and z_{v+{j}} : 0_{−v−{j}} are the same point when z_j = 0. It follows that g_u(z) = 0 if z_j = 1 for all j ∈ u does not hold.

Now suppose that z_j = 1 for all j ∈ u. First g_u(z) = g_u(e_u) because g_u only depends on z through z_u. From z_j = 1 for j ∈ u we have z_v : 0_{−v} = e_v for all v ⊆ u. Then g_u(z) = Σ_{v ⊆ u} (−1)^{|u−v|} g(e_v) = Δ_u, completing the proof. ∎
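Lemma 1 can be checked numerically: with anchor 0, the quantity Δ_u = Σ_{v⊆u} (−1)^{|u−v|} g(e_v) is the only nonzero value the term g_u takes, and g is recovered at any corner z by summing Δ_u over the subsets u on which z is all ones. A Python sketch with an arbitrary invented g on {0,1}^3:

```python
from itertools import combinations, product

d = 3
# an arbitrary function g on the corners of {0,1}^3
vals = {z: 1 + 2*z[0] + 3*z[1]*z[2] + z[0]*z[1]*z[2] for z in product((0, 1), repeat=d)}
g = vals.__getitem__

def e(u):
    # binary indicator vector of the subset u
    return tuple(1 if j in u else 0 for j in range(d))

subsets = [frozenset(u) for m in range(d + 1) for u in combinations(range(d), m)]

# Delta_u: alternating sum of g over corners e_v with v a subset of u
delta = {u: sum((-1) ** (len(u) - len(v)) * g(e(v)) for v in subsets if v <= u)
         for u in subsets}

# Lemma 1: g_u(z) equals Delta_u when z is all ones on u, else 0, so
# summing the nonzero terms reconstructs g at every corner z.
for z in product((0, 1), repeat=d):
    support = frozenset(j for j in range(d) if z[j] == 1)
    assert g(z) == sum(delta[u] for u in subsets if u <= support)
print("Lemma 1 reconstruction holds at all corners")
```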

Now we find the Shapley value for a function on {0,1}^d in an anchored decomposition. Štrumbelj and Kononenko (2010) proved this earlier using different methods.

Theorem 1.

Let g on {0,1}^d have the anchored decomposition with anchor e_∅ and terms g_u for u ⊆ 1:d. Let the set u contribute value val(u) = g(e_u) − g(e_∅). Then the total value is val(1:d) = g(e_{1:d}) − g(e_∅), and the Shapley value for variable j is

  φ_j = Σ_{u: j ∈ u} Δ_u / |u| = Σ_{u: j ∈ u} (1/|u|) Σ_{v ⊆ u} (−1)^{|u−v|} g(e_v).   (5)


For j ∈ u, Δ_u = g_u(e_u), and so the two expressions for φ_j in (5) are equal. From the definition of Shapley value,

  φ_j = (1/d) Σ_{v ⊆ −{j}} C(d−1, |v|)^{−1} ( val(v + {j}) − val(v) ).

By Lemma 1,

  val(v + {j}) − val(v) = g(e_{v+{j}}) − g(e_v) = Σ_{u ⊆ v+{j}, j ∈ u} g_u(e_u).   (7)

The cardinality of v for which (7) is nonzero ranges from |u| − 1 to d − 1 and so

  φ_j = (1/d) Σ_{u: j ∈ u} g_u(e_u) Σ_{m=|u|−1}^{d−1} C(d−1, m)^{−1} C(d−|u|, m−|u|+1),

because v contains u − {j} and m − |u| + 1 additional indices from −u. Simplifying,

  Σ_{m=|u|−1}^{d−1} C(d−1, m)^{−1} C(d−|u|, m−|u|+1) = d/|u|

by the “hockey-stick identity”. Therefore

  φ_j = Σ_{u: j ∈ u} g_u(e_u)/|u| = Σ_{u: j ∈ u} Δ_u/|u|. ∎

The right hand side of (5) appears like it might not sum to val(1:d). To verify that it does, write

  Σ_{j=1}^d φ_j = Σ_{j=1}^d Σ_{u: j ∈ u} Δ_u/|u| = Σ_{u ≠ ∅} Δ_u = g(e_{1:d}) − g(e_∅) = val(1:d).
The proof in Štrumbelj and Kononenko (2010) proceeds by substituting the inclusion-exclusion identity into the first expression for in (5) and then showing that it is equal to the definition of Shapley value. They also need to explain some of their steps in prose and the version above provides a more ‘mechanical’ alternative approach.

For d = 2 we get, using inclusion-exclusion,

  φ_1 = g(e_1) − g(e_∅) + (1/2)( g(e_{1,2}) − g(e_1) − g(e_2) + g(e_∅) ).

This is the Shapley value for any numbers on the corners of {0,1}^2 provided that the empty set gets the value val(∅) = 0.

The expression in (4.1) is easier to interpret. It yields

  φ_j = (1/d) Σ_{m=0}^{d−1} C(d−1, m)^{−1} Σ_{v ⊆ −{j}, |v| = m} ( val(v + {j}) − val(v) ).

Here val(v + {j}) − val(v) is the difference that refining on variable j makes when we have already refined on the variable set v. Now φ_j is the average over cardinalities |v| of the average of all such differences. The contribution from v = ∅ is given the same weight as the average of all contributions with |v| = m for any larger m.

5 Aggregation

Given a set of per-subject Shapley values, we can explore them graphically and numerically to extract insights. One important task is to compare importance of predictors in aggregate over a set of subjects. While that can be done in numerous ways with summary statistics, such as average absolute Shapley value, we would prefer to derive an aggregate measure from a game so that the aggregate measure satisfies the four Shapley criteria from Section 2.1.

Because Shapley value is additive over games, we could simply sum the per-subject Shapley values, that we now denote by φ_{t,j}. That would produce an unfortunate cancellation that we seek to avoid. To see the cancellation, suppose that predictor j takes one of two values and that one of them generally leads to a larger prediction. We will then tend to get positive cohort Shapley values for target subjects with that value of the predictor and negative ones otherwise. These effects will tend to cancel in Σ_t φ_{t,j}, obscuring the impact of predictor j.

To avoid this cancellation we let

  σ_t = sign( y_t − ȳ )

and define the value of set u to be

  val(u) = Σ_{t=1}^n σ_t ( ȳ_{t,u} − ȳ ).

This corresponds to a game with total value

  val(1:d) = Σ_{t=1}^n σ_t ( ȳ_{t,1:d} − ȳ ).

When each C_{t,1:d} = {t}, then the total value to be explained is Σ_t |y_t − ȳ|. Then, for large y_t we explain y_t − ȳ, while for small y_t we explain ȳ − y_t, so that in either case we are explaining |y_t − ȳ|. When y_t = ȳ, then we are explaining an effect of 0. That may still involve offsetting positive and negative effects, and not knowing a good sign to attribute to them we count them as zero. The aggregate Shapley value of variable j is then

  φ_j = Σ_{t=1}^n σ_t φ_{t,j},   (8)

by the additivity property.

This signed aggregation is not limited to cohort Shapley. It could also be applied to baseline Shapley. For baseline Shapley, we could aggregate over targets and/or baselines.
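A sketch of the signed aggregation in Python (invented numbers): two subjects whose per-subject Shapley values are equal and opposite cancel in a naive sum but not in the signed aggregate:

```python
def aggregate(per_subject_phi, preds):
    # flip each subject's Shapley values by sign(y_t - ybar) before summing
    ybar = sum(preds) / len(preds)
    d = len(per_subject_phi[0])
    agg = [0.0] * d
    for phi_t, y_t in zip(per_subject_phi, preds):
        s = (y_t > ybar) - (y_t < ybar)
        for j in range(d):
            agg[j] += s * phi_t[j]
    return agg

# Two subjects with equal and opposite per-subject Shapley values.
phis = [[1.0, 0.5], [-1.0, -0.5]]
preds = [3.0, 1.0]                      # mean prediction is 2.0
naive = [a + b for a, b in zip(*phis)]
print(naive, aggregate(phis, preds))    # [0.0, 0.0] [2.0, 1.0]
```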

6 Examples

In this section we include some numerical examples of cohort Shapley. Section 6.1 computes CS for passengers for predicted probability of survival on the Titanic. It also computes some aggregate cohort Shapley values there. Section 6.2 computes CS for the Boston housing data and includes a comparison to baseline Shapley.

6.1 Titanic data

Here we consider a subset of the Titanic passenger dataset containing 887 individuals with complete records. This data has been used by Kaggle (see https://www.kaggle.com/c/titanic/data

) to illustrate machine learning. As the function of interest, we construct a logistic regression model which predicts ‘Survival’ based on the predictors ‘Pclass’, ‘Sex’, ‘Age’, ‘Siblings.Spouses.Aboard’, ‘Parents.Children.Aboard’, and ‘Fare’. Our model outputs an estimated probability of survival. To calculate the cohort Shapley values, we define similarity as an exact match for the discrete predictors ‘Pclass’, ‘Sex’, ‘Siblings.Spouses.Aboard’, and ‘Parents.Children.Aboard’ and as a distance less than 1/20 of the variable range for the continuous predictors ‘Age’ and ‘Fare’.
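The similarity rule just described (exact match for discrete predictors, 1/20 of the range for continuous ones) might be coded as follows; the age values are invented for illustration:

```python
def make_similarity(values, continuous, frac=1/20):
    # exact match for discrete predictors; a range-based band for continuous ones
    if not continuous:
        return lambda a, b: a == b
    delta = frac * (max(values) - min(values))
    return lambda a, b: abs(a - b) <= delta

ages = [2, 20, 26, 30, 38, 62]            # invented 'Age' column
sim_age = make_similarity(ages, continuous=True)
sim_pclass = make_similarity([1, 2, 3], continuous=False)
print(sim_age(26, 28), sim_age(26, 33), sim_pclass(1, 1))  # True False True
```

With these toy ages the band is (62 − 2)/20 = 3 years, so 28 is similar to a target of 26 but 33 is not.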

Figure 1: Cohort Shapley values stacked vertically for all passengers, ordered by estimated survival probability. The black overlay is the centered prediction y_t − ȳ for each passenger.

Figure 1 shows the cohort Shapley values for each predictor stacked vertically for every individual. The individuals are ordered by their predicted survival probability. Starting at zero, we plot a blue bar up or down according to the cohort Shapley value for the sex variable. Then comes a yellow bar for Pclass and so on.

A visual inspection of Figure 1 reveals clusters of individuals with similar Shapley values for which we could potentially develop a narrative. As just one example, we see passengers with indices between roughly 325 and 500 who have negative Shapley values for ‘Sex’ but positive Shapley values for ‘Pclass’ while their predicted value is below the mean. Many of these passengers are men who are not in the lowest class.

We also report some aggregate cohort Shapley values in Table 3 given by equation (8). We see that ‘Sex’ has a substantially larger aggregate impact than the other predictors. We can further dig into subgroups to see how the impact varies with the covariates. For example, ‘Sex’ has a far greater impact for women where being female predicts a higher survival rate than for men where being male predicts a lower survival rate. Similarly we see a disparity in the aggregate impact for ‘Pclass’ between 1st and 3rd class women as both groups on average are more likely to survive than the average passenger, and being in 1st class contributes to that positive residual while being in 3rd class detracts.

Average Aggregate Cohort Shapley
Logistic Regression Model True Model
Pred. All M F F & 1st F & 3rd All
Pclass 0.04 0.05 0.02 0.13 0.07 0.05
Sex 0.17 0.12 0.27 0.24 0.27 0.16
Age 0.02 0.02 0.02 0.02 0.02 0.05
Sib.Sp 0.01 0.01 0.02 0.02 0.02 0.03
Par.Ch 0.01 0.01 0.02 0.02 0.02 0.02
Fare 0.02 0.02 0.02 0.11 0.02 0.03
Avg 0.39 0.19 0.74 0.92 0.60 0.39
Table 3: Aggregate cohort Shapley per person among various groups for the logistic regression model and the true value model. 1st and 3rd refer to values of ‘Pclass’ and the final row is the mean fitted value within the group.

In the final column of Table 3, we consider the same data, but instead of using fitted values from logistic regression as our function of interest, we use the actual survival outcomes as our black box. We calculate the cohort Shapley values using the same characterization of similarity to obtain comparable aggregate cohort Shapley values. Though the impact of ‘Age’ is slightly higher for this model, overall we see very similar values. To some degree this is to be expected when the model fits the training data well.

6.2 Boston housing dataset

The Boston housing dataset has 506 data points with 13 predictors and the median house value as a response (Harrison and Rubinfeld, 1978)

. Each data point corresponds to a vicinity in the Boston area. We fit a regression model to predict the house price from the predictors using XGBoost

(Chen and Guestrin, 2016).

This dataset is of interest to us because it includes some striking examples of dependence in the predictors. For instance, the variables ‘CRIM’ (a measure of per capita crime) and ‘ZN’ (the proportion of lots zoned over 25,000 square feet) can be either near zero or large, but none of the 506 data points have both of them large and similar phenomena are in other scatterplots that we show below.

We will compute baseline Shapley and cohort Shapley for one target point. That one is the 205’th case in the sklearn python library and also in the mlbench R package. It is unique in its value of ‘RM’. This target was chosen to be one for which some synthetic points in baseline Shapley would be far from any real data, but we did not optimize any criterion measuring that distance, and many of the other 506 points share that property. For cohort Shapley, we consider predictor values to be similar if their distance is less than 1/10 of the difference between the 95’th and 5’th percentiles of the predictor distribution.
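The percentile-based band can be sketched with the Python standard library; `statistics.quantiles` with n=20 gives cut points at 5%, 10%, …, 95% (the column below is a toy stand-in, not the Boston data):

```python
import statistics

def band_delta(values, frac=1/10):
    # frac times the gap between the 95th and 5th percentiles
    q = statistics.quantiles(values, n=20)   # 19 cut points: 5%, 10%, ..., 95%
    return frac * (q[-1] - q[0])

column = list(range(101))                    # toy predictor column, 0..100
delta = band_delta(column)
is_similar = lambda a, b: abs(a - b) <= delta
print(round(delta, 2), is_similar(50, 58), is_similar(50, 61))
```

Using the interpercentile range rather than the full range makes the band robust to a few extreme predictor values.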

Figure 2 shows two scatterplots of the Boston housing data. It marks the target and baseline points, depicts the cohort boundaries and it shows housing value in gray scale. The baseline point is the average of the predictor vectors over the sample, and it is not any individual subject’s point partly because it averages some integer valued predictors. Here, the predicted house prices are 28.38 for the subject and 13.43 for the baseline. The figure also shows some of the synthetic points used by baseline Shapley. Some of those points are far from any real data points even in these two dimensional projections. There is a risk that the model fits for such points are not well determined.

Figure 2: Two scatterplots of the Boston housing data. The target point is a blue X. The baseline is a red X. Synthetic points used by baseline Shapley are orange X’s. Dashed blue lines delineate the cohorts we used.

Figure 3 shows baseline Shapley values for this target subject. We see that ‘CRIM’, ‘RM’, and ‘LSTAT’ have very large impact and the other variables do not. Figure 4 shows cohort Shapley values for this same subject. For cohort Shapley, the most impactful predictors are ‘RM’, ‘ZN’ and ‘LSTAT’ followed by a very gradual decline.

Figure 3: Baseline Shapley values for subject 205 of the Boston housing data.

In baseline Shapley ‘CRIM’ was the most important variable, while in cohort Shapley it is one of the least important variables. We think that the explanation lies in the way that baseline Shapley uses the data values at the upper orange cross in the top plot of Figure 2. The predicted price for a house at the synthetic point given by the upper orange cross is 14.17, which is much smaller than that of the subject, and even quite close to the baseline mean. This leads to the impact of ‘CRIM’ being very high. Data like that synthetic point were not present in the training set and so that value represents an extrapolation where we do not expect a good prediction. We believe that an unreliable prediction there gave the extreme baseline Shapley value that we see for ‘CRIM’.

Figure 4: Cohort Shapley values for subject 205 of the Boston housing data.

Related to the prior point, refining the cohort on ‘RM’ reduces its cardinality much more than refining the cohort on ‘CRIM’ does. Because cohort Shapley uses averages of actual subject values, refining the target on ‘CRIM’ removes fewer subjects and in this case makes a lesser change.

The lower panel in Figure 2 serves to illustrate the effect of dependent predictors on cohort Shapley value. The model for price hardly uses ‘ZN’, if at all, and the baseline Shapley value for it is zero. Baseline Shapley attributes a large impact to ‘LSTAT’ and nearly none to ‘ZN’. For either of those predictors, the cohort mean is higher than the global average, and both ‘LSTAT’ and ‘ZN’ have high impact in cohort Shapley.

We can explain the difference as follows. As ‘ZN’ increases, the range of ‘LSTAT’ values narrows, primarily by the largest ‘LSTAT’ values decreasing as ‘ZN’ increases. Refining on ‘ZN’ has the side effect of lowering ‘LSTAT’. Even if ‘ZN’ itself is not in the model, the cohort Shapley value captures this effect. Baseline Shapley cannot impute a nonzero impact for a variable that the model does not use. We say more about this in Section 7.


7 Discussion

Cohort Shapley resolves two conceptual problems in baseline Shapley and many other methods. First, it does not use any impossible or even unseen predictor combinations. Second, if two predictors are identical then it is perfectly predictable that their importances will be equal rather than subject to details of which black box model was chosen or what random seed if any was used in fitting that model.

Baseline Shapley and cohort Shapley have different counterfactuals and they address different problems. In baseline Shapley, we start with predictors at a baseline level and consider the effects of moving them one at a time towards the target point. In cohort Shapley, we start behind a ‘veil of ignorance’ knowing only that the subject was in the data set and then reveal information about the predictors one at a time to focus on subjects more like the target. Baseline Shapley helps us understand what a target subject might have done differently while cohort Shapley helps us understand which predictors influenced the comparison of the target subject to the other subjects.

If a predictor is never used by the black box f, then it will have a baseline Shapley value of zero. If that variable is correlated with a predictor that is used, then it may still get nonzero impact in cohort Shapley. Knowing the value of the unused predictor tells us something about the prediction. For example, in some medical settings, we might find that the race of a patient has a cohort Shapley impact on their treatment even if their race was not used when f was constructed.

Cohort Shapley requires 2^d quantities when there are d predictor variables, and so for large d it could be infeasible to compute exactly. This is common to many but not all Shapley methods. For instance, Lundberg and Lee (2017) note that in tree structured algorithms many fewer than 2^d combinations may be required.

Cohort Shapley requires user input to define similarity. This is a strength and a weakness. It is a weakness because it places a burden on the user, while at the same time a strength in cases where the user has domain knowledge about what makes feature levels similar. There is a related literature on how finely a continuous variable should be broken into categories. Gelman and Park (2009) suggest as few as three levels for the related problem of choosing a discretization prior to fitting a model. We have used more levels but this remains an area for future research.
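For a continuous predictor, one hedged way to supply the required similarity relation is to bin the variable into a few levels and call two values similar when they fall in the same bin. The three-level tertile rule below is our illustration, in the spirit of (though not identical to) the splits studied by Gelman and Park (2009); the function name is hypothetical.

```python
# Discretize a continuous predictor into three levels at the tertiles;
# two values are then 'similar' when they share a level.
def tertile_level(x, sorted_values):
    n = len(sorted_values)
    lo, hi = sorted_values[n // 3], sorted_values[(2 * n) // 3]
    return 0 if x < lo else (1 if x < hi else 2)

values = sorted([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
# Tertile cutpoints here are 4.0 and 7.0
levels = [tertile_level(v, values) for v in [2.0, 5.0, 8.0]]
print(levels)  # [0, 1, 2]
```

Finer discretizations give smaller, more homogeneous cohorts at the cost of noisier cohort means, which is the trade-off behind the open question above.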

Cohort Shapley depends on the average predicted value in some potentially small cohorts, perhaps even the singleton cohort containing only the target subject. Those predictions are normally made based on all observations and subject to regularization. As a result, the cohort average, or even the target's own prediction, does not necessarily have a large variance. If, however, the target is in an unusual part of the input space, then the prediction function there might be poorly determined, due either to bias or to variance. In such cases, we might see unusual importance scores. Those scores retain their interpretation as explanations for why the target's prediction differs from the average prediction, and if they are clearly unreasonable, then they serve to reveal problems in the prediction function.


Acknowledgments

We thank Masashi Egi of Hitachi for valuable comments. This work was supported by the U.S. National Science Foundation under grant IIS-1837931.


References

  • Aliş and Rabitz (2001) Aliş, Ö. F. and Rabitz, H. (2001). Efficient implementation of high dimensional model representations. Journal of Mathematical Chemistry, 29(2):127–142.
  • Chen and Guestrin (2016) Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, pages 785–794, New York, NY, USA. ACM.
  • Gamboa et al. (2016) Gamboa, F., Janon, A., Klein, T., Lagnoux, A., and Prieur, C. (2016). Statistical inference for Sobol’ pick-freeze Monte Carlo method. Statistics, 50(4):881–902.
  • Gelman and Park (2009) Gelman, A. and Park, D. K. (2009). Splitting a predictor at the upper quarter or third and the lower quarter or third. The American Statistician, 63(1):1–8.
  • Harrison and Rubinfeld (1978) Harrison, D. and Rubinfeld, D. L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5:81–102.
  • Jiang and Owen (2003) Jiang, T. and Owen, A. B. (2003). Quasi-regression with shrinkage. Mathematics and Computers in Simulation, 62(3-6):231–241.
  • Kuo et al. (2010) Kuo, F., Sloan, I., Wasilkowski, G., and Woźniakowski, H. (2010). On decompositions of multivariate functions. Mathematics of computation, 79(270):953–966.
  • Lundberg and Lee (2017) Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4765–4774.
  • Molnar (2018) Molnar, C. (2018). Interpretable machine learning: A Guide for Making Black Box Models Explainable. Leanpub.
  • Owen (2013) Owen, A. B. (2013). Variance components and generalized Sobol’ indices. SIAM/ASA Journal on Uncertainty Quantification, 1(1):19–41.
  • Owen (2014) Owen, A. B. (2014). Sobol’ indices and Shapley value. SIAM/ASA Journal on Uncertainty Quantification, 2(1):245–251.
  • Owen and Prieur (2017) Owen, A. B. and Prieur, C. (2017). On Shapley value for measuring importance of dependent inputs. SIAM/ASA Journal on Uncertainty Quantification, 5(1):986–1002.
  • Pearl (2009) Pearl, J. (2009). Causal inference in statistics: An overview. Statistics surveys, 3:96–146.
  • Ribeiro et al. (2016) Ribeiro, M. T., Singh, S., and Guestrin, C. (2016). “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, New York. ACM.
  • Saltelli et al. (2008) Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., and Tarantola, S. (2008). Global Sensitivity Analysis. The Primer. John Wiley & Sons, Ltd, New York.
  • Shapley (1952) Shapley, L. S. (1952). A value for n-person games. Technical report, DTIC Document.
  • Sobol’ (1969) Sobol’, I. M. (1969). Multidimensional Quadrature Formulas and Haar Functions. Nauka, Moscow. (In Russian).
  • Sobol’ (1993) Sobol’, I. M. (1993). Sensitivity estimates for nonlinear mathematical models. Mathematical Modeling and Computational Experiment, 1:407–414.
  • Sobol’ (2003) Sobol’, I. M. (2003). Theorems and examples on high dimensional model representation. Reliability Engineering & System Safety, 79(2):187–193.
  • Song et al. (2016) Song, E., Nelson, B. L., and Staum, J. (2016). Shapley effects for global sensitivity analysis: Theory and computation. SIAM/ASA Journal on Uncertainty Quantification, 4(1):1060–1083.
  • Štrumbelj and Kononenko (2010) Štrumbelj, E. and Kononenko, I. (2010). An efficient explanation of individual classifications using game theory. Journal of machine learning research, 11:1–18.
  • Štrumbelj and Kononenko (2014) Štrumbelj, E. and Kononenko, I. (2014). Explaining prediction models and individual predictions with feature contributions. Knowledge and information systems, 41(3):647–665.
  • Sundararajan and Najmi (2019) Sundararajan, M. and Najmi, A. (2019). The many Shapley values for model explanation. In Proceedings of the ACM Conference ’17, New York. ACM.
  • Zhao and Hastie (2019) Zhao, Q. and Hastie, T. (2019). Causal interpretations of black-box models. Journal of Business & Economic Statistics. To appear.