Human-Centered Explainable AI (XAI): From Algorithms to User Experiences

10/20/2021
by Q. Vera Liao, et al.
IBM, Microsoft

As a technical sub-field of artificial intelligence (AI), explainable AI (XAI) has produced a vast collection of algorithms, providing a toolbox for researchers and practitioners to build XAI applications. With the rich application opportunities, explainability has moved beyond a demand by data scientists or researchers to comprehend the models they are developing, to become an essential requirement for people to trust and adopt AI deployed in numerous domains. However, explainability is an inherently human-centric property and the field is starting to embrace human-centered approaches. Human-computer interaction (HCI) research and user experience (UX) design in this area are becoming increasingly important. In this chapter, we begin with a high-level overview of the technical landscape of XAI algorithms, then selectively survey our own and other recent HCI works that take human-centered approaches to design, evaluate, and provide conceptual and methodological tools for XAI. We ask the question "what are human-centered approaches doing for XAI" and highlight three roles that they play in shaping XAI technologies by helping navigate, assess and expand the XAI toolbox: to drive technical choices by users' explainability needs, to uncover pitfalls of existing XAI methods and inform new methods, and to provide conceptual frameworks for human-compatible XAI.

1. Introduction

In everyday life, people seek explanations when there is a gap of understanding. Explanations are sought for many goals that this understanding is meant to serve, such as predicting future events, diagnosing a problem, resolving cognitive dissonance, assigning blame, and rationalizing one’s action. In interactions with computing technologies, an appropriate understanding of how the system works, often referred to as the user’s “mental model” (Norman, 2013), is the foundation for users to correctly anticipate system behaviors and interact effectively. A user’s understanding is constantly being shaped by what they see and experience with the system, and can be refined by direct explanations of how the system works.

With the increasing adoption of AI technologies, especially popular inscrutable “opaque-box” machine learning (ML) models such as neural network models, understanding becomes increasingly difficult. Meanwhile, the need for stakeholders to understand AI is heightened by the uncertain nature of ML systems and the hazardous consequences they can possibly cause as AI is now frequently deployed in high-stakes domains such as healthcare, finance, transportation, and even criminal justice. Some are concerned that this challenge of understanding will become the bottleneck for people to trust and adopt AI technologies. Others have warned that a lack of human scrutiny will inevitably lead to failures in usability, reliability, safety, fairness, and other moral crises of AI.

It is with this overwhelming challenge of modern AI that the term explainable AI (XAI) has made its way into numerous academic works, industry efforts, as well as public policy and regulatory requirements. For example, the European Union General Data Protection Regulation (GDPR) now requires that “meaningful information about the logic involved” must be provided to people who are affected by automated decision-making systems. However, despite a vast collection of XAI algorithms produced by the AI research community and a recent emergence of off-the-shelf toolkits (e.g., (50; Arya et al., 2020; 42; 71; 75)) for AI developers to incorporate state-of-the-art XAI techniques in their own models, successful examples of XAI are still relatively scarce in real-world AI applications.

Developing XAI applications is challenging because explainability, or the effectiveness of explanation, is not intrinsic to the model but lies in the perception and reception of the person receiving the explanation. Making a model completely transparent, down to its nuts and bolts, does not guarantee that the person at the receiving end can make sense of all the information, or will not be overwhelmed by it. What makes an explanation good—providing appropriate information that can be understood and utilized—is contingent on the receiver’s current knowledge and their goal for receiving the explanation, among other human factors.

Therefore, developing XAI applications requires human-centered approaches that center the technical choices around people’s explainability needs, and define success by human experience and well-being. It also means that XAI presents as much of a design challenge as an algorithmic challenge. Hence there are rich opportunities for HCI researchers and design practitioners to contribute insights, solutions, and methods to make AI more explainable. A research community of human-centered XAI (Ehsan et al., 2021c; Ehsan and Riedl, 2020; Wang et al., 2019) has emerged, bringing in cognitive, sociotechnical, and design perspectives, among others. We hope this chapter serves as a call to engagement in this interdisciplinary endeavor by presenting a selected overview of recent AI and HCI work on the topic of XAI. While the growing collection of XAI algorithms offers a rich toolbox for researchers and practitioners to build XAI applications, we highlight three ways that human-centered approaches can help navigate, assess, and expand this toolbox:

  • There is no one-fits-all solution in the growing collection of XAI techniques. The technical choice should be driven by users’ explainability needs, for which HCI and user research can offer insights to characterize the space and methodological tools (Section 3).

  • Empirical studies with real users can reveal pitfalls of existing XAI methods. Overcoming these pitfalls requires both design efforts to fill the gaps and reflectively challenging fundamental assumptions in techno-centric views (Section 4).

  • Theories of human cognition and behaviors can offer conceptual tools to inspire new computational and design frameworks for XAI. However, this is still a nascent area and relevant theories in social science, behavioral science, and information science are yet to be explored (Section 5).

Before getting to these points, we will start with a brief overview of the technical landscape of XAI to ground our discussions (Section 2). For interested readers, we suggest several recent papers that provided deeper technical surveys (Guidotti et al., 2019; Adadi and Berrada, 2018; Arrieta et al., 2020; Carvalho et al., 2019).

2. What is explainable AI and what are the techniques?

The definitions of explainability and related terms such as transparency, interpretability, intelligibility, and comprehensibility are in a bit of flux. Scholars sometimes disagree on their scopes and how these terminologies intersect. However, XAI work often shares a common goal of making AI understandable by people. Adopting this pragmatic, human-centered definition in this chapter, we consider XAI as broadly encompassing all technical means to this end of understanding, including direct interpretability, generating an explanation or justification, providing transparency information, etc. (and avoid the philosophical question “what is or is not an explanation” altogether (Páez, 2019)). We note a distinction between a narrow scope of XAI focusing on explaining the model processes or internals, versus a broad scope that covers all explanatory information about the model, including the training data, performance, uncertainty, and so on (Liao et al., 2020; Vaughan and Wallach, 2020). Since our focus is on XAI applications, we believe a broad view is necessary, as users are often interested in a holistic understanding of the system. However, technical challenges commonly lie in the inscrutability of the model internals, so the XAI techniques we review in this section are within the narrower scope of XAI.

It is worth mentioning that while the majority of XAI work focuses on ML models, and so does this chapter, there are emerging areas of other types of XAI including explainable planning, multi-agent systems, robotics, etc. In fact, the term XAI was coined half a century ago in the context of expert systems. We are currently in its second wave spurred by the popularity of ML.

At a high level, XAI techniques fall into two camps (Lipton, 2018; Guidotti et al., 2019): 1) choosing a directly interpretable model, such as simpler models like decision trees, rule-based models, and linear regression; 2) choosing an “opaque-box” model such as deep neural networks and large tree ensembles, and then using a post-hoc technique to generate explanations. The choice between the two is sometimes discussed under the term “performance-interpretability tradeoff”, as complex opaque-box models tend to perform better in many tasks. However, this tradeoff does not always hold. Research has shown that in many contexts, especially with well-structured datasets and meaningful features, directly interpretable models can reach performance comparable to opaque-box models (Rudin, 2019). Moreover, an active research area of XAI focuses on developing new algorithms that possess both performance advantages and interpretability properties. For example, decision sets (Lakkaraju et al., 2016), generalized linear rule models (Wei et al., 2019), GA2Ms (Caruana et al., 2015), and CoFrNets (Puri et al., 2021) are recent algorithms that have more advanced computational properties than simple rule-based or linear models, but whose behaviors are still represented in meaningful rules or coefficients that can be understood relatively easily.

However, opaque-box models are often chosen in practice because of their performance advantage for a given dataset, a lower requirement for human effort (e.g., feature engineering), or the availability of off-the-shelf solutions. In these cases, one will have to use a post-hoc XAI technique. Based on their purpose, Guidotti et al. (Guidotti et al., 2019) categorize post-hoc XAI techniques into global explanations of the overall logic of the model, local explanations of a particular prediction, and counterfactual inspection that supports understanding how the model would behave with alternative inputs. Within these categories, XAI techniques commonly generate either feature-based explanations to elucidate the model internals, or example-based explanations to support case-based reasoning.

Note that the three categories also apply to directly interpretable models. For example, a shallow decision tree can be presented directly as a global explanation, highlighted along a particular path to locally explain a prediction, or traced along alternative paths to perform counterfactual inspection. It is, however, much less straightforward with opaque-box models, which require separate post-hoc techniques, as the examples below illustrate.

Examples of global explanation. Since it is impossible to understand the complex internals of an opaque-box model, the goal of global explanation is to provide an approximate overview of how the model behaves. This is often done by training a simple, directly interpretable model such as a decision tree, rule set, or regression with the same training data, and optimizing the simple model to behave more like the original model. For example, a technique called distillation changes the learning objective of the interpretable model to match the original model’s predictions (Tan et al., 2018). SRatio reweighs the training data based on the original model’s predictions and then re-trains the interpretable model (Dhurandhar et al., 2020). With these approaches, depending on the choice of the approximating model, global explanations can take the form of a decision tree, a set of rules the model follows, or feature weights.
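
To make the surrogate idea concrete, below is a minimal sketch of a global surrogate explanation using scikit-learn: a shallow decision tree is trained to mimic an opaque-box model's predictions, a simple form of the distillation idea described above. The dataset, model choices, and tree depth are illustrative assumptions, not the specific algorithms cited here.

```python
# A minimal sketch of a global "surrogate" explanation: train a shallow
# decision tree on the opaque-box model's predictions so the tree
# approximates the model's behavior rather than the task itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# The "opaque-box" model we want to explain (illustrative choice).
opaque_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Fit the surrogate to the opaque model's predictions, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, opaque_model.predict(X))

# "Fidelity": how closely the surrogate mimics the opaque model.
fidelity = surrogate.score(X, opaque_model.predict(X))
print(f"Surrogate fidelity: {fidelity:.2f}")

# The tree itself serves as an approximate global explanation of the model logic.
print(export_text(surrogate, feature_names=list(X.columns)))
```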

Examples of local explanation. To explain a prediction made on an instance, a number of algorithms can be used to estimate the importance of each feature of this instance for the model’s prediction. For example, LIME (local interpretable model-agnostic explanations) (Ribeiro et al., 2016) starts by adding a small amount of noise to the instance to create a set of neighboring instances, then fits a simple linear model to them that mimics the original model’s behavior in the local region. The linear model’s weights can then be used as the feature importance to explain the prediction. Another popular algorithm, SHAP (Shapley additive explanations) (Lundberg and Lee, 2017), defines feature importance based on Shapley values, inspired by cooperative game theory, to assign credit to each feature. Feature-importance explanations can be shown to users by visualizing the importance, or simply describing the most important features for the prediction. To explain deep neural networks, many other algorithms can be used to identify important parts of input features based on gradients (Selvaraju et al., 2017), propagation (Bach et al., 2015), occlusion (Li et al., 2016), etc. They are sometimes referred to as saliency methods and, when applied to image data, generate saliency maps. Example-based methods are useful to explain a prediction as well. For example, with some notion of similarity, finding similar instances in the training data with the same predicted outcome can be used to justify the prediction (Kim et al., 2016; Gurumoorthy et al., 2019).
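
As an illustration of the intuition behind LIME described above, here is a minimal, hand-rolled sketch of a local surrogate explanation using only NumPy and scikit-learn (it is not the official lime package, and the perturbation scheme, proximity kernel, and parameters are simplifying assumptions):

```python
# A minimal LIME-style local explanation: perturb the instance, query the
# opaque model, and fit a weighted linear surrogate in the local region.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

data = load_breast_cancer()
X, y, feature_names = data.data, data.target, data.feature_names
model = GradientBoostingClassifier(random_state=0).fit(X, y)

def explain_locally(model, x, n_samples=2000, noise_scale=0.3, seed=0):
    """Return local feature importances from a weighted linear surrogate."""
    rng = np.random.default_rng(seed)
    # 1. Create neighbors by adding small Gaussian noise around the instance.
    neighbors = x + rng.normal(0.0, noise_scale * X.std(axis=0), size=(n_samples, x.size))
    # 2. Query the opaque model on the neighborhood.
    preds = model.predict_proba(neighbors)[:, 1]
    # 3. Weight neighbors by proximity to x, then fit a linear surrogate.
    distances = np.linalg.norm((neighbors - x) / X.std(axis=0), axis=1)
    weights = np.exp(-(distances ** 2) / 2.0)
    surrogate = Ridge(alpha=1.0).fit(neighbors, preds, sample_weight=weights)
    # The surrogate's coefficients serve as local feature importances.
    return surrogate.coef_

importances = explain_locally(model, X[0])
top = np.argsort(np.abs(importances))[::-1][:5]
for i in top:
    print(f"{feature_names[i]}: {importances[i]:+.3f}")
```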

Examples of counterfactual inspection. Different from local explanations that describe the model’s prediction process for a given instance, counterfactual explanations–“counter to the facts”–are sought when people are interested in how the model would behave when the current input changes. In other words, people are interested in the “why not a different prediction” or “how to change to get a different prediction” questions rather than a descriptive “why” question. Such explanations are especially sought when people seek recourse from a current, often undesirable, prediction, such as ways to improve a patient’s predicted high risk of disease. Several algorithms can be used to generate counterfactual explanations by identifying changes, often with some notion of minimum changes, needed for an instance to receive a different prediction (Dhurandhar et al., 2018a; Looveren and Klaise, 2021). They are sometimes referred to as contrastive explanations for a counterfactual outcome (differentiated from other kinds of counterfactuals such as counterfactual causes). Example-based methods can also be used to generate counterfactual examples–instances with minimum difference from the original one but with a different outcome (Mothilal et al., 2019; Wachter et al., 2017). In other situations, people may want to zoom in on a specific feature and explore how its changes impact the model’s prediction, i.e., asking a “what if” question. For this purpose, feature inspection techniques such as the partial dependence plot (PDP) (Hastie et al., 2009) and individual conditional expectation (ICE) (Goldstein et al., 2015) can be used.
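
To make the "what if" and counterfactual ideas concrete, the sketch below varies a single feature of one instance over a grid (ICE-style) and then naively searches that grid for the smallest change that flips the prediction. Real counterfactual methods such as CEM or DiCE search over many features with sparsity and actionability constraints; everything here is an illustrative simplification.

```python
# A minimal "what if" inspection plus a naive one-feature counterfactual search.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

data = load_breast_cancer()
X, y, feature_names = data.data, data.target, data.feature_names
model = GradientBoostingClassifier(random_state=0).fit(X, y)

def what_if(model, x, feature_idx, grid):
    """Predicted probability as one feature is varied over `grid`."""
    variants = np.tile(x, (len(grid), 1))
    variants[:, feature_idx] = grid          # overwrite only the inspected feature
    return model.predict_proba(variants)[:, 1]

x = X[0]
f = 0                                        # feature to inspect (illustrative)
grid = np.linspace(X[:, f].min(), X[:, f].max(), 20)
for value, p in zip(grid, what_if(model, x, f, grid)):
    print(f"{feature_names[f]} = {value:8.2f} -> P(positive) = {p:.2f}")

# Naive counterfactual search: smallest change to this one feature that flips
# the prediction (real methods optimize over many features with constraints).
original = model.predict(x.reshape(1, -1))[0]
flips = []
for v in grid:
    x_cf = x.copy()
    x_cf[f] = v
    if model.predict(x_cf.reshape(1, -1))[0] != original:
        flips.append(v)
if flips:
    best = min(flips, key=lambda v: abs(v - x[f]))
    print(f"Smallest flipping change: {feature_names[f]} -> {best:.2f}")
else:
    print("No flip found by varying this feature alone.")
```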

Most post-hoc techniques make some approximations. Distillation and LIME approximate the complex model’s behaviors with a simpler model’s. PDP leaves out interactions between features. Example-based methods explain with samples from the data. There is a long-standing debate regarding the potential risk of using approximate post-hoc techniques to explain a model instead of using a directly interpretable model, as approximation will inevitably leave out some corner cases or even be unfaithful to what the original model computes (Rudin, 2019).

However, in addition to the practical reasons mentioned earlier to opt for an opaque-box model, there is a pragmatic argument to be made about the diverse communication devices people use to reach “sufficient understanding” to achieve a given objective. For example, if one is to make a precise diagnosis of a problem, they may need explanations that describe a causal chain; whereas if one’s goal is to predict future events, following approximate rules or case-based reasoning could be sufficient and less cognitively demanding. One can also argue that when the model and the person have different epistemic access, approximation can be seen as a form of translation necessary to bridge the two. There is an emerging area of XAI research on generating human-consumable explanations with the supervision of human explanations (Hind et al., 2019; Ehsan et al., 2019; Kim et al., 2018), which essentially translates model reasoning into meaningful human explanations applied to the same prediction. This kind of explanation is a complete approximation, but it could be especially useful for lay people who have difficulty understanding how ML models work but want to get a sense of the reasonability of a prediction. We further highlight this objective- and user-dependent nature of the choice of explanation methods in the next sections.

That being said, developers of AI have a responsibility to understand, mitigate, and transparently communicate the limitations of approximate explanations to stakeholders. For example, an explainability metric known as faithfulness can be used to detect faulty post-hoc explanations (Alvarez-Melis and Jaakkola, 2018). This is an actively researched topic and there is still a lack of principled approaches to identify and communicate the limitations of post-hoc explanations.
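
As one concrete (and much simplified) reading of such a faithfulness check, the sketch below ablates features one at a time and correlates the resulting prediction changes with the importances assigned by a local explanation. It reuses `model`, `X`, and `importances` from the LIME-style sketch above; the ablation-by-mean strategy and the use of Pearson correlation are assumptions for illustration, not the metric as defined in the cited work.

```python
# A simplified faithfulness-style check for a feature-importance explanation.
import numpy as np

def faithfulness_check(model, x, importances, baseline):
    """Correlate assigned importances with prediction changes under ablation."""
    p0 = model.predict_proba(x.reshape(1, -1))[0, 1]
    drops = []
    for i in range(x.size):
        x_ablate = x.copy()
        x_ablate[i] = baseline[i]            # "remove" feature i by replacing it
        drops.append(p0 - model.predict_proba(x_ablate.reshape(1, -1))[0, 1])
    # A faithful explanation should assign high importance to features whose
    # ablation changes the prediction the most.
    return np.corrcoef(np.abs(importances), np.abs(drops))[0, 1]

score = faithfulness_check(model, X[0], importances, baseline=X.mean(axis=0))
print(f"Faithfulness (correlation): {score:.2f}")
```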

3. Diverse explainability needs of AI stakeholders

It is easy to see that there is no “one-fits-all” solution in this vast, and still rapidly growing, collection of XAI algorithms, and that the choice should be based on target users’ explainability needs. The challenge here is twofold: First, users of XAI are far from a uniform group, and their explainability needs can vary significantly depending on their goals, backgrounds, usage contexts, etc. Second, XAI algorithms were often not developed with specific usage contexts in mind, or were developed primarily to help model developers or AI researchers inspect the model (Miller et al., 2017). Hence their appropriateness for supporting end users’ explainability needs can be unclear.

A starting point to address these challenges is to map out the design space of XAI and develop frameworks that account for people’s diverse explainability needs. Many have summarized common user groups that demand explainability and what they would use AI explanations for (Arrieta et al., 2020; Preece et al., 2018; Hind, 2019):

  • Model developers, to improve or debug the model.

  • Business owners or administrators, to assess an AI application’s capability, regulatory compliance, etc. to determine its usage.

  • Decision-makers, who are direct users of AI decision support applications, to form appropriate trust in the AI and make informed decisions.

  • Impacted groups, whose life could be impacted by the AI, to seek recourse or contest the AI.

  • Regulatory bodies, to audit for legal or ethical concerns such as fairness, safety, privacy, etc.

While useful for considering different personas interacting with XAI, this kind of categorization lacks the granularity to characterize people’s explainability needs. For example, a doctor using a patient risk-assessment AI (i.e., a decision-maker) would want an overview of the system during the onboarding stage, but delve into the AI’s reasoning for a particular patient’s risk assessment when they treat the patient. Also, people in any of these groups may want to assess model capabilities or biases at certain usage points.

In a recent HCI paper, Suresh et al. define stakeholders’ knowledge and their objectives as two components that jointly determine one’s explainability needs (Suresh et al., 2021). The authors characterize stakeholders’ knowledge as formal, instrumental, and personal knowledge, and how it manifests in the contexts of machine learning, the data domain, and the general milieu. For stakeholders’ goals and objectives, the authors propose a multi-level typology, ranging from long-term goals (building trust and understanding the model), to immediate objectives (debug and improve the model, ensure compliance with regulations, take actions based on model output, justify actions influenced by a model, understand data usage, learn about a domain, contest model decisions), to specific tasks to perform with explanations (assess the reliability of a prediction, detect mistakes, understand the information used by the model, understand feature influence, understand model strengths and limitations).

While these efforts can be seen as top-down approaches to characterize the overall space of explainability needs, a complementary approach is to follow user-centered design and start with user research to identify application- or interaction-specific explainability needs. For example, Eiband et al. proposed a participatory design method that starts by analyzing gaps between users’ current mental model and how the system should be understood (an appropriate mental model prescribed by experts), in order to identify what needs to be explained (Eiband et al., 2018).

In our own research with collaborators, we proposed to identify users’ explainability needs by eliciting the questions users ask to understand the AI (Liao et al., 2020). This notion is based on prior HCI work using prototypical questions to represent “intelligibility types” (Lim and Dey, 2009), and social science literature showing that people’s explanatory goals can be expressed in different kinds of questions (Hilton, 1990). By interviewing 20 designers, we collected common questions users ask across 16 ML applications and developed an XAI Question Bank, with more than 50 detailed user questions organized into 9 categories:

  • How (global model-wide): asking about the general logic or process the AI follows to have a global view.

  • Why (a given prediction): asking about the reason behind a specific prediction.

  • Why Not (a different prediction): asking why the prediction is different from an expected or desired outcome.

  • How to Be That (a different prediction): asking about ways to change the instance to get a different prediction. (The difference between Why Not and How to Be That can be subtle and context-dependent. Users may ask Why Not when they see an unexpected prediction and want to compare with what gets the counterfactual outcome; they may ask How to Be That when seeking recourse, in which case the explanation should focus more specifically on minimum or actionable changes they can make to the current input.)

  • How to Still Be This (the current prediction): asking what change is allowed for the instance to still get the same prediction.

  • What if: asking how the prediction changes if the input changes.

  • Performance: asking about the performance of the AI.

  • Data: asking about the training data.

  • Output: asking what can be expected or done with the AI’s output.

These questions demonstrate that XAI should be defined broadly, not limited to explaining model internals, as users are also interested in explanatory information about the performance, data, and scope of output, among other dimensions.

This XAI Question Bank maps out the space of common explainability needs and can be used as a tool to identify applicable questions in user research. In a follow-up work (Liao et al., 2021), we proposed a question-driven, user-centered design method that starts with identifying key user questions through user research, then uses them to guide the choice of XAI techniques and iterative design. To facilitate this process and foreground users’ explainability needs, we suggest reframing the technical space of XAI by the user question that each XAI method can address. For example, a feature-importance local explanation technique can answer the Why question, while a counterfactual explanation can answer the How to Be That question. We provide a suggested mapping between the question categories and example XAI methods in Table 1, focusing on post-hoc methods that are available in current open-source XAI toolkits accessible to practitioners (50; 75; 42; 71).

How (global model-wide)
    Ways to explain: Describe the general model logic as feature impact, rules, or decision trees; if the user is only interested in a high-level view, describe the top features or rules considered.
    Example XAI methods: ProfWeight (Dhurandhar et al., 2018b); Global feature importance (Wei et al., 2015; Lundberg et al., 2020); Global feature inspection plots (e.g., PDP (Hastie et al., 2009)); Tree surrogates (Craven and Shavlik, 1995)

Why (a given prediction)
    Ways to explain: Describe how features of the instance, or what key features, determine the model’s prediction of it; or describe rules that the instance fits to guarantee the prediction; or show similar examples with the same predicted outcome to justify the model’s prediction.
    Example XAI methods: LIME (Ribeiro et al., 2016); SHAP (Lundberg and Lee, 2017); LOCO (Lei et al., 2018); Anchors (Ribeiro et al., 2018); ProtoDash (Gurumoorthy et al., 2019)

Why Not (a different prediction)
    Ways to explain: Describe what features of the instance determine the current prediction and/or with what changes the instance would get the alternative prediction; or show prototypical examples that have the alternative outcome.
    Example XAI methods: CEM (Dhurandhar et al., 2018a); Counterfactuals (Looveren and Klaise, 2021); ProtoDash on the alternative prediction (Gurumoorthy et al., 2019)

How to Be That (a different prediction)
    Ways to explain: Highlight feature(s) that, if changed (increased, decreased, absent, or present), could alter the prediction to the alternative outcome with minimum effort required; or show examples with minimum differences but the alternative outcome.
    Example XAI methods: CEM (Dhurandhar et al., 2018a); Counterfactuals (Looveren and Klaise, 2021); Counterfactual instances (Wachter et al., 2017); DiCE (Mothilal et al., 2019)

How to Still Be This (the current prediction)
    Ways to explain: Describe features/feature ranges or rules that could guarantee the same prediction; or show examples that are different from the instance but still had the same outcome.
    Example XAI methods: CEM (Dhurandhar et al., 2018a); Anchors (Ribeiro et al., 2018)

What if
    Ways to explain: Show how the prediction changes corresponding to the inquired change of input.
    Example XAI methods: PDP (Hastie et al., 2009); ALE (Apley and Zhu, 2020); ICE (Goldstein et al., 2015)

Performance
    Ways to explain: Provide performance information of the model; provide uncertainty information for each prediction; describe potential strengths and limitations of the model.
    Example XAI methods: Precision, Recall, Accuracy, F1, AUC; communicate uncertainty of each prediction (Ghosh et al., 2021); see examples in FactSheets (Arnold et al., 2019) and Model Cards (Mitchell et al., 2019)

Data
    Ways to explain: Provide comprehensive information about the training data, such as the source, provenance, type, size, coverage of population, potential biases, etc.
    Example XAI methods: See examples in FactSheets (Arnold et al., 2019) and DataSheets (Gebru et al., 2018)

Output
    Ways to explain: Describe the scope of output or system functions; if applicable, suggest how the output should be used for downstream tasks or user workflow.
    Example XAI methods: See examples in FactSheets (Arnold et al., 2019) and Model Cards (Mitchell et al., 2019)

Table 1. A mapping between categories of user questions in the XAI Question Bank (Liao et al., 2020) and example XAI methods to answer these questions, with descriptions of their output under “Ways to explain”. XAI methods are selected based on what is available in current open-source XAI toolkits (50; 71; 75; 42). The last three rows (Performance, Data, Output) are broader XAI needs not limited to explaining model processes.
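
As an illustration of how such a mapping might be used in practice, the sketch below encodes Table 1 as a simple lookup from elicited user questions to candidate XAI methods. The data structure and lookup logic are our own illustrative assumptions, not part of the question-driven design method itself.

```python
# An illustrative encoding of the Table 1 mapping to help navigate the XAI
# toolbox from questions elicited in user research.
QUESTION_TO_METHODS = {
    "How":                  ["ProfWeight", "Global feature importance", "PDP", "Tree surrogates"],
    "Why":                  ["LIME", "SHAP", "LOCO", "Anchors", "ProtoDash"],
    "Why Not":              ["CEM", "Counterfactuals", "ProtoDash (alternative prediction)"],
    "How to Be That":       ["CEM", "Counterfactuals", "Counterfactual instances", "DiCE"],
    "How to Still Be This": ["CEM", "Anchors"],
    "What if":              ["PDP", "ALE", "ICE"],
    "Performance":          ["Accuracy/F1/AUC reporting", "Prediction uncertainty", "FactSheets", "Model Cards"],
    "Data":                 ["FactSheets", "DataSheets"],
    "Output":               ["FactSheets", "Model Cards"],
}

def candidate_methods(user_questions):
    """Collect candidate XAI methods for the question categories elicited in user research."""
    return {q: QUESTION_TO_METHODS.get(q, []) for q in user_questions}

print(candidate_methods(["Why", "How to Be That", "Performance"]))
```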

In short, a growing collection of XAI techniques offers a rich toolbox for researchers and practitioners to build XAI applications. Making effective and responsible choices from this toolbox should be guided by users’ explainability needs. HCI research not only offers means to understand user needs for specific applications, but also insights about real-world user needs to better frame and organize this toolbox, as well as methodological tools to help navigate it. In the next section, we discuss how HCI research can also reveal limitations of the current technical XAI toolbox.

4. Pitfalls of XAI: Minding the gaps between algorithmic explanations and actionable understanding

With so many XAI algorithms developed, one must ask: do they work? The answer is complicated because of the diverse contexts that XAI is sought for. The answer is also difficult because it requires understanding how people perceive, process, and use AI explanations. HCI research, and more broadly human-subject studies, are key to evaluating XAI in the context of use (Doshi-Velez and Kim, 2017), identifying where it falls short, and informing human-centered solutions. While many studies showed positive results that XAI techniques can improve people’s understanding of models (Lucic et al., 2020; Ribeiro et al., 2018; Hase and Bansal, 2020; Lakkaraju et al., 2017), in this section we draw attention to a few pitfalls of XAI based on recent HCI research.

We start with the position that users’ goal with XAI is not an understanding defined in a vacuum, but an actionable understanding that is sufficient to serve the objective for which they seek explanations. As discussed above, these objectives are diverse and dynamic. One common pitfall, and an overall obstacle for the current XAI field, is that, despite growing efforts, there is still a disconnect between technical XAI approaches and their effectiveness in supporting different user objectives in downstream deployment. A recent study by Buçinca et al. points out that the “proxy tasks” widely used by AI researchers to evaluate their proposed XAI techniques can be misleading (Buçinca et al., 2020). A common example of a proxy task is a “simulation test”, which asks people to predict the model’s output based on an input and explanation. Such tests, which assess people’s understanding without a specific end goal, can fail to predict the success of using XAI in the real tasks that people seek explanations for, such as debugging models (Kaur et al., 2020) or improving decision-making (Buçinca et al., 2020; Wang and Yin, 2021).

There can be a multitude of reasons for this divide between effectiveness in proxy tasks and deployment. As Buçinca et al. point out, performing proxy tasks in a controlled setting could induce a different cognitive process from a realistic setting, such as granting more attention to the explanations (Buçinca et al., 2020). Moreover, the ability to simulate a model prediction may simply not match the need for a user to perform a realistic task. For instance, in the context of decision-making, the key to success of a human-AI team is appropriate reliance–knowing when to trust the AI’s recommendation and when to be cautious. An actionable understanding for appropriate reliance requires not only knowing how the model makes predictions, but also how to judge if the reasoning is flawed. Filling this gap of understanding may necessitate a different kind of transparency. For example, recent HCI studies repeatedly found that showing uncertainty information of individual predictions is more effective than local explanations to help people achieve the objective of appropriate reliance on AI (Zhang et al., 2020; Bansal et al., 2021).

Figure 1. Model LineUpper (Narkar et al., 2021), an example XAI tool that supports ML developers in comparing multiple candidate models by comparing their feature-importance explanations at multiple levels (by selecting an instance, a region of instances, or viewing all instances from the right-side Scatterplot Matrix panel)

To close this gap between technical XAI and user experiences will require both studying user interactions with XAI in the contexts of use, and better operationalizing human-centered perspectives in algorithmic work of XAI, including developing evaluation methods that can account for more fine-grained user needs and downstream usage contexts.

For the first part, recent HCI research provides useful insights into XAI user experiences in some common usage contexts. For model development or debugging, research suggests that users often need a range of explanations at different levels of the model’s behaviors to perform comprehensive diagnosis (Hohman et al., 2019; Narkar et al., 2021; Hong et al., 2020). For example, Figure 1 shows Model LineUpper (Narkar et al., 2021), an XAI tool that we designed with our collaborators to support data scientists in comparing multiple models (in the context of choosing among candidate models generated by AutoML) by comparing their feature-importance explanations. This comparison can happen at different levels: a global view, an input region they select on the Scatterplot Matrix on the right, and down to individual instances. Decision-makers, such as users of an AI system supporting medical diagnosis (Cai et al., 2019; Xie et al., 2020; Liao et al., 2020), may need upfront global explanations about the properties of the model during the onboarding stage, but local explanations particularly when they get unexpected or suspicious model output. When it comes to auditing for model fairness or biases, our work with collaborators compared the effectiveness of four types of explanation (shown in Figure 2) and found that contrastive explanations can effectively help people identify concerns of individual fairness, where similar individuals are treated differently by the model (Dodge et al., 2019).

Figure 2. Four types of XAI features compared in Dodge et al. (Dodge et al., 2019) (with minor updates on the names of explanations from the original paper) to support people’s fairness judgment of ML models, with an ML model performing recidivism risk prediction as a use case

Empirical research also brings to light a set of pitfalls resulting from a disconnect between assumptions underlying technical approaches to XAI and people’s cognitive processes. A pitfall robustly found in recent work is that explanations can lead to unwarranted trust or confidence in the model. In a controlled experiment where an ML model was used to assist participants in predicting apartment sales prices, Poursabzi-Sangdeh et al. found that, contrary to the hypothesis, showing people an explainable model with feature importance hindered their ability to detect model mistakes (Poursabzi-Sangdeh et al., 2021). By conducting contextual inquiry with data scientists using popular XAI techniques (e.g., SHAP) during model development, Kaur et al. found that the existence of explanations could mistakenly lead to over-confidence that the model is ready for deployment (Kaur et al., 2020). In the context of a nutrition recommender, Eiband et al. showed that even placebic explanations, which did not convey useful information, invoked a similar level of trust as real explanations did (Eiband et al., 2019). In addition, there is the concern of illusory understanding, with which one subjectively over-estimates the understanding gained from XAI (Chromik et al., 2021). Explanations can also create information overload and distract people from forming a useful mental model of how a system operates (Springer and Whittaker, 2019).

These observations highlight the danger of deploying technologies without a clear understanding of how people interact with them. One way to move the field forward is to connect with theories and insights about human behaviors and cognition. For example, dual-process theories (Kahneman, 2011; Petty and Cacioppo, 1986) provide a critical lens to understand how people process XAI and inform new means to make AI more understandable and actionable to users. The central thesis of dual-process theories is that people can engage in two different systems to process information and make decisions. System 1 is intuitive thinking, often following mental shortcuts and heuristics; System 2 is analytical thinking, relying on careful reasoning of information and arguments. Because System 2 is slower and more cognitively demanding, people often resort to System 1 thinking, which, when applied inappropriately, can lead to cognitive biases and sub-optimal decisions. Through this theoretical lens, there is an increasing awareness (Buçinca et al., 2020; Wang et al., 2019; Ehsan et al., 2021b; Nourani et al., 2021; Rastogi et al., 2022) that while XAI techniques make an implicit assumption that people can and will attend to every bit of explanations, in reality people are more likely to engage in System 1 thinking.

However, it remains an open question what kinds of heuristics can be triggered by XAI under System 1 thinking. It is possible that people associate the ability to provide explanations directly with competence, and therefore form unwarranted trust and confidence. Heuristics are developed through past experiences, and can evolve as people experience new technologies or domains. Nourani et al. demonstrated that when interacting with XAI, people were vulnerable to common cognitive biases such as anchoring bias after observing model behaviors early on (Nourani et al., 2021). A recent study by Ehsan et al. uncovered diverse heuristics people follow in response to AI explanations, such as associating explanations with affirmation, diagnostic support, and social presence, and associating a specific presentation of explanations, such as numbers, with intelligence and algorithmic thinking (Ehsan et al., 2021b).

Another critical implication of dual-process theories is that people do not equally engage in System 1 or System 2 thinking in all contexts. People are generally inclined to engage in System 1 thinking when they lack either the ability or the motivation to perform analytical thinking (Petty and Cacioppo, 1986). This difference can lead to another pitfall of XAI–potential inequalities of experience, including risks of mistrust and misuse of AI. For example, a study found that AI novices, compared to experts, not only had less performance gain from XAI but were also more likely to have illusory satisfaction (Szymanski et al., 2021). Other studies suggest that in time- and cognitive-resource-constrained settings people are less able to process explanations effectively (Xie et al., 2020; Robertson et al., 2021). In our own work with collaborators (Ghai et al., 2020), we showed that adding explanations in an active learning setting (i.e., labeling instances requested by the model) decreased satisfaction for people who scored low on Need for Cognition, a personality trait reflecting one’s general motivation to engage in effortful cognitive activities.

Research has begun to address this mismatch between people’s cognitive processes and current assumptions underlying XAI. One way is to provide interventions to nudge people to engage more deeply in System 2 thinking. Buçinca et al. introduced cognitive forcing functions as design interventions for that purpose (Buçinca et al., 2021), including asking users to make decisions before seeing the AI’s recommendations, slowing down the process, and letting users choose when to see the AI recommendation. In our own work with collaborators (Rastogi et al., 2022), we saw that increasing the time for users to interact with the ML system mitigated some System 1 biases. Another path is to seek technical and design solutions that reduce the cognitive workload imposed by XAI, by reducing the quantity and improving the consumability of information. For example, studies suggest that multiple modalities (text, visual, audio, etc.) can be leveraged to aid attention to and understanding of XAI (Robertson et al., 2021; Szymanski et al., 2021). Progressive disclosure (Springer and Whittaker, 2019), starting with simplified or high-level transparency information and revealing details later or upon user requests, is another effective approach to reduce cognitive workload. Technical approaches that optimize for a balance between explanation accuracy and conciseness have also been explored (Abdul et al., 2020).

We must note that heuristics are an indispensable part of people’s decision-making process. If applied appropriately, they can aid people to make more efficient and optimal decisions. In fact, they may be key to closing the inequality gaps for people with different levels of ability or motivation to process information about AI. For example, we may envision a quality endorsement feature through some authorized third-party inspecting a model with explanations. This could allow lay people to apply a reliable “authority heuristic”. Understanding what heuristics are involved in interactions with XAI and AI in general, and how to leverage reliable heuristics to improve human-AI interaction, are important open questions for the field.

We close this section with an optimistic note that by centering our analysis on people, on how they interact with and process information about AI, and whether they can achieve their objectives, we can move away from a techno-centric focus on generating algorithmic explanations. We can begin to identify opportunities to improve user experiences in the currently under-developed space between algorithmic explanations and actionable understanding, and appreciate explainable AI as much of a design problem as a technical problem. The design solutions may be concerned with how to communicate algorithmic explanations, such as choosing the right modalities, level of abstraction, work-arounds for privacy or security constraints, and so on. They may also come in the form of interventions to influence how people process XAI, such as providing cognitive forcing functions or checklists that help people better assess information (Rieh and Danielson, 2007). Furthermore, it is necessary to fill the knowledge or information gaps for users to achieve actionable understanding beyond algorithmic explanations, such as providing necessary domain knowledge (e.g. what a feature means) and general notions of how AI works.

While we discussed the pitfalls of XAI mostly through a cognitive lens, implicit in supporting actionable understanding is a requirement to approach XAI as a sociotechnical problem (Ehsan and Riedl, 2020), especially given that consequential AI systems are often embedded in socio-organizational contexts with their own history, shared knowledge and norms. On the one hand, for XAI technology developers, to understand the “who” in XAI and articulate their needs and objectives requires situating “who” in the sociotechnical context. On the other hand, for XAI users, an actionable understanding is often a socially situated understanding, which enables them to make sense of not only the technical component but also the sociotechnical system as a whole. Motivated by this sociotechnical perspective, with collaborators we proposed the concept of social transparency–making visible the social-organizational factors that govern the use of AI systems (Ehsan et al., 2021a). Operationalized in a design framework to present past users’ interactions and reasoning with the AI (see a design in Figure 3), we demonstrated that such information could help users make more informed decisions and improve the collective experience with AI as a sociotechnical system.

Figure 3. A scenario-based design of Social Transparency in AI systems used in (Ehsan et al., 2021a). It combines technical explanations with “4W features” (What, Who, Why, and When) that reflect the historical decision trajectory of other users

5. Theory-Driven Human-Compatible XAI

Previously we gave an example of using dual-process theories to retrospectively understand how people interact with XAI. In this section we discuss another important human-centered approach to XAI: performing theoretical analyses of human explanations, as well as broader cognitive and behavioral processes, to inspire new computational and design frameworks that make XAI more human-compatible.

Such work is best represented by Miller’s seminal paper that brings insights from the social sciences about fundamental properties of human explanations to the common awareness of the AI community (Miller, 2018). By surveying a large volume of prior work on how people seek, generate, and evaluate explanations in philosophy, psychology, and cognitive science, Miller summarized four major properties of human explanations: 1) Explanations are often contrastive, sought in response to some counterfactual cases. This is because a Why question is often triggered by “abnormal or unexpected” events, and is asked not to understand the cause of an event per se, but the cause of the event relative to some other event that did not occur. In other words, the Why question is often an implicit Why Not question. 2) Explanations are selected, often in a biased manner. People rarely give the actual or complete cause of an event, but select a small number of causes based on some criteria or heuristics. 3) Explanations are social, as a transfer of knowledge, often part of a conversation or interaction, and thus presented relative to the explainer’s beliefs about the explainee’s beliefs. 4) Using probabilities or statistical information to explain is often ineffective and unsatisfying; explicitly referring to causes is more important.

Published in 2019, this work has made a significant impact on the XAI field in just two years. For instance, the point about explanations being contrastive has inspired many to work on counterfactual explanations to answer the Why Not or How to Be That questions, as we reviewed in Section 2. From a user-interaction point of view, the points about explanations being selected and social have profound implications. Miller reviewed several useful theories about how people generate and present explanations to others, which we believe can provide conceptual ground to frame XAI as an interaction problem. One of them is Malle’s theory of explanation (Malle, 2006), which breaks the generation of explanations into two distinct and co-influencing groups of psychological processes: 1) Information processes for the explainer (i.e., the AI in the case of XAI) to devise explanations, which are determined by what kind of information the explainer has access to. 2) Impression management processes that govern the social interactions with the explainee (i.e., users in the case of XAI), which are driven by the pragmatic goal of the explainer, such as transferring knowledge, generating trust in the explainee, assigning blame, etc.

While currently under-explored, framing the communication of explanations as an impression management process can inspire computational and design methods to make XAI effectively selected (and social), which can in turn mitigate cognitive load and make XAI more consumable. A useful set of resources to inform XAI work on this topic, as Miller suggested, is the cognitive processes by which people select explanations from available causes. Besides formal models of abductive reasoning, Miller also reviewed common heuristics people follow, such as abnormality (selecting the abnormal cause), intentionality (selecting intentional actions), necessity, sufficiency, and robustness (selecting causes that would hold in many situations). The choice highly depends on the explainer’s goal, which again highlights the importance of specifying the objective of explaining. Further, we point to broader social science research on impression management (Goffman and others, 1978; Leary and Kowalski, 1990), on influencing others’ perceptions by regulating information in social interactions, as well as ethics discussions around it, to draw inspiration from.

The social nature of explanation also maps to an essential requirement for interactivity in XAI applications (Krause et al., 2016). User interactions do not end at receiving an XAI output, but continue until an actionable understanding is achieved. In other words, as users’ explainability needs are expressed in questions, they will keep asking follow-up questions until satisfied, and thus engage in a back-and-forth conversation. Therefore, conversational models of explanation, as well as general principles of conversations and communication (e.g., Grice’s maxims that a speaker follows to optimize for the desired social goal (Grice, 1975); the theory of grounding in communication (Clark and Brennan, 1991)), hold promise for informing technologies and design for interactive XAI. Miller reviewed several relevant theories, including Hilton’s conversational model of explanations (Hilton, 1990), which postulates that a good explanation must be relevant to the focus of a question and presents a typology of different causal questions. Antaki and Leudar extended this model to a wider class of argumentative dialogue for the common pattern of claim-backing in explanations (Antaki and Leudar, 1992). Walton further extended this line of work into a formal dialogue model of explanation (Walton, 2004), including a set of speech act rules. These theories offer appealing grounds for building computational models, and recent XAI work has begun to explore dialogue models for interactive XAI (Madumal et al., 2019). Outside XAI, work in dialogue systems frequently builds on formal models of human conversational and social interactions (e.g., (Bickmore and Cassell, 2001)), including systems that generate explanatory dialogues (Cawsey, 1992).

Theories can also inform design frameworks that guide researchers and practitioners to investigate the design space and make design choices. For example, Wang et al. performed a comprehensive analysis of the theoretical underpinnings of human reasoning and decision-making to derive a conceptual framework that allows linking XAI methods to users’ reasoning needs (Wang et al., 2019). This framework includes four dimensions that describe a normative view of how people should reason with explanations: explanation goals, the reasoning process, causal explanation types, and elements in rational choice decisions. It also separately describes people’s natural decision-making, and the errors and limitations they are subject to, based on dual-process theories. Designers can use the framework to perform a conceptual analysis to understand, e.g., based on user research, users’ reasoning goals and potential errors, to identify what XAI methods can support their goals, or to investigate gaps in current XAI methods. The authors further provided a mapping between elements of these human-reasoning dimensions and existing XAI approaches, and guidelines on how to use XAI methods to mitigate common cognitive biases.

While hugely promising, theory-driven XAI is still a nascent area. Many areas of cognitive, social, and behavioral theories are yet to be explored. For example, if we center on users of XAI as information seekers to achieve actionable understanding, theories in information science such as models of sense-making (Dervin, 1998) and information seeking behaviors (Wilson, 1981) (how people’s information needs drive their behaviors and information use) can offer useful theoretical lenses to formalize and anticipate user behaviors. However, the major challenge lies in how to operationalize theoretical insights and formal behavioral models into computational and design frameworks, which may require, as many have already argued (Miller, 2018; Doshi-Velez and Kim, 2017; Vaughan and Wallach, 2020), collaboration across the research disciplines of AI, HCI, social sciences and more.

6. Summary

Explainable AI is one of the fastest-growing areas of AI in several directions: a rapidly expanding collection of techniques, substantial industry effort to produce open-source XAI toolkits for practitioners to use, and widespread public awareness of and interest in the topic. It is also a fast-growing area for human-centered ML, as can be seen in the proliferation of XAI research published in HCI and social science venues in recent years. Adopting human-centered approaches to XAI is inevitable given that explainability is a human-centric property and XAI must be studied as an interaction problem. However, different from some other topics in this book, HCI work on XAI currently resides in, and often needs to challenge, a techno-centric reality, given that the technical AI community has already made large strides. A research community of human-centered XAI (Ehsan et al., 2021c; Ehsan and Riedl, 2020; Wang et al., 2019) has emerged. In this chapter we provide a selected survey of work from this emerging community to encourage future research to continue bridging design practices and state-of-the-art XAI techniques, uncovering pitfalls of and challenging algorithmic assumptions, and building human-compatible XAI on theoretical grounds. We also hope these practiced approaches will inspire work to address broader challenges in human-centered ML.

Acknowledgements.
We thank Tim Miller and Upol Ehsan for their generous feedback. We are also grateful to members of the Human-AI Collaboration group and the Trustworthy AI department at the IBM Thomas J. Watson Research Center, whose work and conversations with us shaped our thinking.

References

  • A. Abdul, C. von der Weth, M. Kankanhalli, and B. Y. Lim (2020) COGAM: measuring and moderating cognitive load in machine learning model explanations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–14. Cited by: §4.
  • A. Adadi and M. Berrada (2018) Peeking inside the black-box: a survey on explainable artificial intelligence (xai). IEEE Access 6, pp. 52138–52160. Cited by: §1.
  • D. Alvarez-Melis and T. S. Jaakkola (2018) Towards robust interpretability with self-explaining neural networks. arXiv preprint arXiv:1806.07538. Cited by: §2.
  • C. Antaki and I. Leudar (1992) Explaining in conversation: towards an argument model. European Journal of Social Psychology 22 (2), pp. 181–194. Cited by: §5.
  • D. W. Apley and J. Zhu (2020) Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82 (4), pp. 1059–1086. Cited by: Table 1.
  • M. Arnold, R. K. Bellamy, M. Hind, S. Houde, S. Mehta, A. Mojsilović, R. Nair, K. N. Ramamurthy, A. Olteanu, D. Piorkowski, et al. (2019) FactSheets: increasing trust in ai services through supplier’s declarations of conformity. IBM Journal of Research and Development 63 (4/5), pp. 6–1. Cited by: Table 1.
  • A. B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, et al. (2020) Explainable artificial intelligence (xai): concepts, taxonomies, opportunities and challenges toward responsible ai. Information Fusion 58, pp. 82–115. Cited by: §1, §3.
  • V. Arya, R. K. E. Bellamy, P. Chen, A. Dhurandhar, M. Hind, S. C. Hoffman, S. Houde, Q. V. Liao, R. Luss, A. Mojsilovic, S. Mourad, P. Pedemonte, R. Raghavendra, J. Richards, P. Sattigeri, K. Shanmugam, M. Singh, K. R. Varshney, D. Wei, and Y. Zhang (2020) AI Explainability 360: an extensible toolkit for understanding data and machine learning models. J. Mach. Learn. Res. 21 (130), pp. 1–6. Cited by: §1.
  • S. Bach, A. Binder, G. Montavon, F. Klauschen, K. Müller, and W. Samek (2015) On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10 (7), pp. e0130140. Cited by: §2.
  • G. Bansal, T. Wu, J. Zhou, R. Fok, B. Nushi, E. Kamar, M. T. Ribeiro, and D. Weld (2021) Does the whole exceed its parts? the effect of ai explanations on complementary team performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–16. Cited by: §4.
  • T. Bickmore and J. Cassell (2001) Relational agents: a model and implementation of building user trust. In Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 396–403. Cited by: §5.
  • Z. Buçinca, P. Lin, K. Z. Gajos, and E. L. Glassman (2020) Proxy tasks and subjective measures can be misleading in evaluating explainable ai systems. In Proceedings of the 25th International Conference on Intelligent User Interfaces, pp. 454–464. Cited by: §4, §4, §4.
  • Z. Buçinca, M. B. Malaya, and K. Z. Gajos (2021) To trust or to think: cognitive forcing functions can reduce overreliance on ai in ai-assisted decision-making. Proceedings of the ACM on Human-Computer Interaction 5 (CSCW1), pp. 1–21. Cited by: §4.
  • C. J. Cai, S. Winter, D. Steiner, L. Wilcox, and M. Terry (2019) Hello ai: uncovering the onboarding needs of medical practitioners for human-ai collaborative decision-making. Proceedings of the ACM on Human-Computer Interaction 3 (CSCW), pp. 104. Cited by: §4.
  • R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad (2015) Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. In Proceedings of KDD, Cited by: §2.
  • D. V. Carvalho, E. M. Pereira, and J. S. Cardoso (2019) Machine learning interpretability: a survey on methods and metrics. Electronics 8 (8), pp. 832. Cited by: §1.
  • A. Cawsey (1992) Explanation and interaction: the computer generation of explanatory dialogues. MIT press. Cited by: §5.
  • M. Chromik, M. Eiband, F. Buchner, A. Krüger, and A. Butz (2021) I think i get your point, ai! the illusion of explanatory depth in explainable ai. In 26th International Conference on Intelligent User Interfaces, pp. 307–317. Cited by: §4.
  • H. H. Clark and S. E. Brennan (1991) Grounding in communication. Cited by: §5.
  • M. Craven and J. Shavlik (1995) Extracting tree-structured representations of trained networks. Advances in neural information processing systems 8, pp. 24–30. Cited by: Table 1.
  • B. Dervin (1998) Sense-making theory and practice: an overview of user interests in knowledge seeking and use. Journal of knowledge management. Cited by: §5.
  • A. Dhurandhar, P. Chen, R. Luss, C. Tu, P. Ting, K. Shanmugam, and P. Das (2018a) Explanations based on the missing: towards contrastive explanations with pertinent negatives. arXiv preprint arXiv:1802.07623. Cited by: §2, Table 1.
  • A. Dhurandhar, K. Shanmugam, R. Luss, and P. A. Olsen (2018b) Improving simple models with confidence profiles. Advances in Neural Information Processing Systems 31. Cited by: Table 1.
  • A. Dhurandhar, K. Shanmugam, and R. Luss (2020) Enhancing simple models by exploiting what they already know. In International Conference on Machine Learning, pp. 2525–2534. Cited by: §2.
  • J. Dodge, Q. V. Liao, Y. Zhang, R. K. Bellamy, and C. Dugan (2019) Explaining models: an empirical study of how explanations impact fairness judgment. In Proceedings of the 24th International Conference on Intelligent User Interfaces, pp. 275–285. Cited by: Figure 2, §4.
  • F. Doshi-Velez and B. Kim (2017) Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. Cited by: §4, §5.
  • U. Ehsan, Q. V. Liao, M. Muller, M. O. Riedl, and J. D. Weisz (2021a) Expanding explainability: towards social transparency in ai systems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–19. Cited by: Figure 3, §4.
  • U. Ehsan, S. Passi, Q. V. Liao, L. Chan, I. Lee, M. Muller, M. O. Riedl, et al. (2021b) The who in explainable ai: how ai background shapes perceptions of ai explanations. arXiv preprint arXiv:2107.13509. Cited by: §4, §4.
  • U. Ehsan and M. O. Riedl (2020) Human-centered explainable ai: towards a reflective sociotechnical approach. In International Conference on Human-Computer Interaction, pp. 449–466. Cited by: §1, §4, §6.
  • U. Ehsan, P. Tambwekar, L. Chan, B. Harrison, and M. O. Riedl (2019) Automated rationale generation: a technique for explainable ai and its effects on human perceptions. In Proceedings of the 24th International Conference on Intelligent User Interfaces, pp. 263–274. Cited by: §2.
  • U. Ehsan, P. Wintersberger, Q. V. Liao, M. Mara, M. Streit, S. Wachter, A. Riener, and M. O. Riedl (2021c) Operationalizing human-centered perspectives in explainable ai. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–6. Cited by: §1, §6.
  • M. Eiband, D. Buschek, A. Kremer, and H. Hussmann (2019) The impact of placebic explanations on trust in intelligent systems. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–6. Cited by: §4.
  • M. Eiband, H. Schneider, M. Bilandzic, J. Fazekas-Con, M. Haug, and H. Hussmann (2018) Bringing transparency design into practice. In 23rd international conference on intelligent user interfaces, pp. 211–223. Cited by: §3.
  • T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. Daumé III, and K. Crawford (2018) Datasheets for datasets. arXiv preprint arXiv:1803.09010. Cited by: Table 1.
  • B. Ghai, Q. V. Liao, Y. Zhang, R. Bellamy, and K. Mueller (2020) Explainable active learning (xal): an empirical study of how local explanations impact annotator experience. arXiv preprint arXiv:2001.09219. Cited by: §4.
  • S. Ghosh, Q. V. Liao, K. N. Ramamurthy, J. Navratil, P. Sattigeri, K. R. Varshney, and Y. Zhang (2021) Uncertainty quantification 360: a holistic toolkit for quantifying and communicating the uncertainty of ai. arXiv preprint arXiv:2106.01410. Cited by: Table 1.
  • E. Goffman (1978) The presentation of self in everyday life. Vol. 21, Harmondsworth, London. Cited by: §5.
  • A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin (2015) Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 24 (1), pp. 44–65. Cited by: §2, Table 1.
  • H. P. Grice (1975) Logic and conversation. In Speech acts, pp. 41–58. Cited by: §5.
  • R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi (2019) A survey of methods for explaining black box models. ACM computing surveys (CSUR) 51 (5), pp. 93. Cited by: §1, §2, §2.
  • K. S. Gurumoorthy, A. Dhurandhar, G. Cecchi, and C. Aggarwal (2019) Efficient data representation by selecting prototypes with importance weights. In 2019 IEEE International Conference on Data Mining (ICDM), pp. 260–269. Cited by: §2, Table 1.
  • [42] (2017) H2O.ai machine learning interpretability. Note: https://github.com/h2oai/mli-resources Cited by: §1, Table 1, §3.
  • P. Hase and M. Bansal (2020) Evaluating explainable ai: which algorithmic explanations help users predict model behavior?. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 5540–5552. Cited by: §4.
  • T. Hastie, R. Tibshirani, and J. Friedman (2009) The elements of statistical learning. Springer. Cited by: §2, Table 1.
  • D. J. Hilton (1990) Conversational processes and causal explanation. Psychological Bulletin 107 (1), pp. 65. Cited by: §3, §5.
  • M. Hind, D. Wei, M. Campbell, N. C. Codella, A. Dhurandhar, A. Mojsilović, K. Natesan Ramamurthy, and K. R. Varshney (2019) TED: teaching ai to explain its decisions. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 123–129. Cited by: §2.
  • M. Hind (2019) Explaining explainable ai. XRDS: Crossroads, The ACM Magazine for Students 25 (3), pp. 16–19. Cited by: §3.
  • F. Hohman, A. Head, R. Caruana, R. DeLine, and S. M. Drucker (2019) Gamut: a design probe to understand how data scientists understand machine learning models. In Proceedings of the 2019 CHI conference on human factors in computing systems, pp. 1–13. Cited by: §4.
  • S. R. Hong, J. Hullman, and E. Bertini (2020) Human factors in model interpretability: industry practices, challenges, and needs. Proceedings of the ACM on Human-Computer Interaction 4 (CSCW1), pp. 1–26. Cited by: §4.
  • [50] (2019) IBM aix 360. Note: aix360.mybluemix.net/ Cited by: §1, Table 1, §3.
  • D. Kahneman (2011) Thinking, fast and slow. Macmillan. Cited by: §4.
  • H. Kaur, H. Nori, S. Jenkins, R. Caruana, H. Wallach, and J. Wortman Vaughan (2020) Interpreting interpretability: understanding data scientists’ use of interpretability tools for machine learning. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–14. Cited by: §4, §4.
  • B. Kim, R. Khanna, and O. O. Koyejo (2016) Examples are not enough, learn to criticize! criticism for interpretability. In Proceedings of NIPS, Cited by: §2.
  • B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, et al. (2018) Interpretability beyond feature attribution: quantitative testing with concept activation vectors (tcav). In International conference on machine learning, pp. 2668–2677. Cited by: §2.
  • J. Krause, A. Perer, and K. Ng (2016) Interacting with predictions: visual inspection of black-box machine learning models. In Proceedings of the CHI conference on human factors in computing systems, pp. 5686–5697. Cited by: §5.
  • H. Lakkaraju, S. H. Bach, and J. Leskovec (2016) Interpretable decision sets: a joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1675–1684. Cited by: §2.
  • H. Lakkaraju, E. Kamar, R. Caruana, and J. Leskovec (2017) Interpretable & explorable approximations of black box models. arXiv preprint arXiv:1707.01154. Cited by: §4.
  • M. R. Leary and R. M. Kowalski (1990) Impression management: a literature review and two-component model.. Psychological bulletin 107 (1), pp. 34. Cited by: §5.
  • J. Lei, M. G’Sell, A. Rinaldo, R. J. Tibshirani, and L. Wasserman (2018) Distribution-free predictive inference for regression. Journal of the American Statistical Association 113 (523), pp. 1094–1111. Cited by: Table 1.
  • J. Li, W. Monroe, and D. Jurafsky (2016) Understanding neural networks through representation erasure. arXiv preprint arXiv:1612.08220. Cited by: §2.
  • Q. V. Liao, D. Gruen, and S. Miller (2020) Questioning the ai: informing design practices for explainable ai user experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–15. Cited by: §2, Table 1, §3, §4.
  • Q. V. Liao, M. Pribić, J. Han, S. Miller, and D. Sow (2021) Question-driven design process for explainable ai user experiences. arXiv preprint arXiv:2104.03483. Cited by: §3.
  • B. Y. Lim and A. K. Dey (2009) Assessing demand for intelligibility in context-aware applications. In Proceedings of the 11th international conference on Ubiquitous computing, pp. 195–204. Cited by: §3.
  • Z. C. Lipton (2018) The mythos of model interpretability. Queue 16 (3), pp. 31–57. Cited by: §2.
  • A. V. Looveren and J. Klaise (2021) Interpretable counterfactual explanations guided by prototypes. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 650–665. Cited by: §2, Table 1.
  • A. Lucic, H. Haned, and M. de Rijke (2020) Why does my model fail? contrastive local explanations for retail forecasting. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 90–98. Cited by: §4.
  • S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S. Lee (2020) From local explanations to global understanding with explainable ai for trees. Nature machine intelligence 2 (1), pp. 56–67. Cited by: Table 1.
  • S. M. Lundberg and S. Lee (2017) A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems, pp. 4768–4777. Cited by: §2, Table 1.
  • P. Madumal, T. Miller, L. Sonenberg, and F. Vetere (2019) A grounded interaction protocol for explainable artificial intelligence. arXiv preprint arXiv:1903.02409. Cited by: §5.
  • B. F. Malle (2006) How the mind explains behavior: folk explanations, meaning, and social interaction. MIT Press. Cited by: §5.
  • [71] (2019) Microsoft interpretml. Note: https://github.com/interpretml/interpret Cited by: §1, Table 1, §3.
  • T. Miller, P. Howe, and L. Sonenberg (2017) Explainable ai: beware of inmates running the asylum or: how i learnt to stop worrying and love the social and behavioural sciences. arXiv preprint arXiv:1712.00547. Cited by: §3.
  • T. Miller (2018) Explanation in artificial intelligence: insights from the social sciences. Artificial Intelligence. Cited by: §5, §5.
  • M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D. Raji, and T. Gebru (2019) Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency, pp. 220–229. Cited by: Table 1.
  • [75] (2018) Model interpretation with skater. Note: https://oracle.github.io/Skater/ Cited by: §1, Table 1, §3.
  • R. K. Mothilal, A. Sharma, and C. Tan (2019) Explaining machine learning classifiers through diverse counterfactual explanations. arXiv preprint arXiv:1905.07697. Cited by: §2, Table 1.
  • S. Narkar, Y. Zhang, Q. V. Liao, D. Wang, and J. D. Weisz (2021) Model lineupper: supporting interactive model comparison at multiple levels for automl. In 26th International Conference on Intelligent User Interfaces, pp. 170–174. Cited by: Figure 1, §4.
  • D. Norman (2013) The design of everyday things: revised and expanded edition. Basic books. Cited by: §1.
  • M. Nourani, C. Roy, J. E. Block, D. R. Honeycutt, T. Rahman, E. Ragan, and V. Gogate (2021) Anchoring bias affects mental model formation and user reliance in explainable ai systems. In 26th International Conference on Intelligent User Interfaces, pp. 340–350. Cited by: §4, §4.
  • A. Páez (2019) The pragmatic turn in explainable artificial intelligence (xai). Minds and Machines 29 (3), pp. 441–459. Cited by: §2.
  • R. E. Petty and J. T. Cacioppo (1986) The elaboration likelihood model of persuasion. In Communication and persuasion, pp. 1–24. Cited by: §4, §4.
  • F. Poursabzi-Sangdeh, D. G. Goldstein, J. M. Hofman, J. W. Wortman Vaughan, and H. Wallach (2021) Manipulating and measuring model interpretability. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–52. Cited by: §4.
  • A. Preece, D. Harborne, D. Braines, R. Tomsett, and S. Chakraborty (2018) Stakeholders in explainable ai. arXiv preprint arXiv:1810.00184. Cited by: §3.
  • I. Puri, A. Dhurandhar, T. Pedapati, K. Shanmugam, D. Wei, and K. R. Varshney (2021) CoFrNets: interpretable neural architecture inspired by continued fractions. In Advances in neural information processing systems, Cited by: §2.
  • C. Rastogi, Y. Zhang, D. Wei, K. R. Varshney, A. Dhurandhar, and R. Tomsett (2022) Deciding fast and slow: the role of cognitive biases in ai-assisted decision-making. In Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing, Cited by: §4, §4.
  • M. T. Ribeiro, S. Singh, and C. Guestrin (2016) Why should i trust you?: explaining the predictions of any classifier. In Proceedings of KDD, Cited by: §2, Table 1.
  • M. T. Ribeiro, S. Singh, and C. Guestrin (2018) Anchors: high-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: Table 1, §4.
  • S. Y. Rieh and D. R. Danielson (2007) Credibility: a multidisciplinary framework. Annual review of information science and technology 41 (1), pp. 307–364. Cited by: §4.
  • J. Robertson, A. V. Kokkinakis, J. Hook, B. Kirman, F. Block, M. F. Ursu, S. Patra, S. Demediuk, A. Drachen, and O. Olarewaju (2021) Wait, but why?: assessing behavior explanation strategies for real-time strategy games. In 26th International Conference on Intelligent User Interfaces, pp. 32–42. Cited by: §4, §4.
  • C. Rudin (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1 (5), pp. 206–215. Cited by: §2, §2.
  • R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, pp. 618–626. Cited by: §2.
  • A. Springer and S. Whittaker (2019) Progressive disclosure: empirically motivated approaches to designing effective transparency. In Proceedings of the 24th international conference on intelligent user interfaces, pp. 107–120. Cited by: §4, §4.
  • H. Suresh, S. R. Gomez, K. K. Nam, and A. Satyanarayan (2021) Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–16. Cited by: §3.
  • M. Szymanski, M. Millecamp, and K. Verbert (2021) Visual, textual or hybrid: the effect of user expertise on different explanations. In 26th International Conference on Intelligent User Interfaces, pp. 109–119. Cited by: §4, §4.
  • S. Tan, R. Caruana, G. Hooker, P. Koch, and A. Gordo (2018) Learning global additive explanations for neural nets using model distillation. arXiv preprint arXiv:1801.08640. Cited by: §2.
  • J. W. Vaughan and H. Wallach (2020) A human-centered agenda for intelligible machine learning. Machines We Trust: Getting Along with Artificial Intelligence. Cited by: §2, §5.
  • S. Wachter, B. Mittelstadt, and C. Russell (2017) Counterfactual explanations without opening the black box: automated decisions and the gdpr. Cited by: §2, Table 1.
  • D. Walton (2004) A new dialectical theory of explanation. Philosophical Explorations 7 (1), pp. 71–89. Cited by: §5.
  • D. Wang, Q. Yang, A. Abdul, and B. Y. Lim (2019) Designing theory-driven user-centric explainable ai. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 601. Cited by: §1, §4, §5, §6.
  • X. Wang and M. Yin (2021) Are explanations helpful? a comparative study of the effects of explanations in ai-assisted decision-making. In 26th International Conference on Intelligent User Interfaces, pp. 318–328. Cited by: §4.
  • D. Wei, S. Dash, T. Gao, and O. Gunluk (2019) Generalized linear rule models. In International Conference on Machine Learning, pp. 6687–6696. Cited by: §2.
  • P. Wei, Z. Lu, and J. Song (2015) Variable importance analysis: a comprehensive review. Reliability Engineering & System Safety 142, pp. 399–432. Cited by: Table 1.
  • T. D. Wilson (1981) On user studies and information needs. Journal of documentation. Cited by: §5.
  • Y. Xie, M. Chen, D. Kao, G. Gao, and X. Chen (2020) CheXplain: enabling physicians to explore and understand data-driven, ai-enabled medical imaging analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13. Cited by: §4, §4.
  • Y. Zhang, Q. V. Liao, and R. K. Bellamy (2020) Effect of confidence and explanation on accuracy and trust calibration in ai-assisted decision making. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 295–305. Cited by: §4.