The ability and need of humans to explain have been studied for centuries, initially in philosophy and more recently in all those sciences aiming at a better understanding of (human) intelligence. Measuring the degree of explainability of AI systems has become relevant in the light of research progress in the eXplainable AI (XAI) field, the proposal for an EU Regulation on Artificial Intelligence, and ongoing standardisation initiatives that will translate these technical advancements into a de facto regulatory standard for AI systems. To date, standardisation entities have published white papers and preliminary documents showing their progress. Among them we mention the European Telecommunications Standards Institute (ETSI), CEN-CENELEC, and ISO/IEC TR 24028:2020(E); the latter states that ‘[i]t is important also to consider the measurement of the quality of explanations’ and provides details on the key measurements (i.e. continuity, consistency, selectivity; paras 9.3.6, 9.3.7).
Considering that, since ISO/IEC TR 24028:2020(E), the literature has started to propose new metrics and mechanisms, with this work we study and categorise the existing approaches to quantitatively assess the quality of explainability in Machine Learning and AI. We do so through the lenses of law and philosophy, not just computer science. This last characteristic is certainly our main contribution to the literature of XAI and Law, and we believe it may encourage future research to embrace an interdisciplinary approach less timidly, for the sake of better conformity to existing (and new) regulations in the EU panorama.
This paper is structured as follows. In Sections 2 and 3 we present the research background and the methodology of this paper. Then, in Sections 4, 5 and 6, we explore the definitions and properties of explainability in philosophy and in the proposed AI Act. Finally, in Sections 7 and 8 we analyse the existing quantitative metrics of explainability, discussing our findings and future research.
2 Related Work
In the XAI literature there are many interesting surveys on explainability techniques [guidotti2018survey, adadi2018peeking, arrieta2020explainable, zhou2021evaluating], classifying algorithms along different dimensions to help researchers find the most appropriate ones for their own work. Practically all these surveys focus on classifying the mechanisms to achieve explainability rather than on how to measure its quality, and we believe our work can help with the latter.
For example, [guidotti2018survey] classify XAI methods with respect to the notion of explanation and the type of black-box system. The identified characteristics are respectively the level-of-detail of explainability (from high to low: global logic, local decision logic, model properties) and the level of interpretability of the original model. Similarly, [adadi2018peeking] study XAI considering interpretability and level-of-detail.
On the other hand, [zhou2021evaluating] focus specifically on the metrics to quantify the quality of explanation methods, classifying them according to the properties they can measure and the format of explanations (model-based, attribution-based, example-based) they support. More precisely, [zhou2021evaluating] narrow down the survey to the functionality-grounded metrics, proposing for them a new taxonomy including interpretability (in terms of clarity, broadness, and parsimony) and fidelity (as completeness, and soundness).
Among all the identified surveys, [zhou2021evaluating] is certainly the closest to our work in terms of focus. The main distinction between our work and [zhou2021evaluating] is probably our assumption that multiple definitions of explainability exist, each one possibly requiring its own type of metrics. Furthermore, unlike [zhou2021evaluating], we analyse explainability metrics on their ability to meet the requirements set by the AI Act.
3 Methodology
We performed an exploratory literature review of existing metrics to measure the explainability of AI-related explanations, together with a qualitative legal analysis of the explainability requirements to understand the alignment of the identified metrics to the expectations of the proposed AI Act.
To do so, we collected all the papers cited in [zhou2021evaluating], re-classifying them.
Then we integrated them with further works identified through an in-depth keyword-based search on Google Scholar, Scopus, and Web of Science (the main keywords we used were “degree of explainability”, “explainability metrics”, “explainability measures”, and “evaluation metrics for contrastive explanations”). On the other hand, the legal analysis was carried out on the proposed Artificial Intelligence Act. Considering the lack of case law and the paucity of studies on this novel piece of legislation, a literal assessment of its provisions has been preferred to more critical analysis based on previous enquiries.
4 Definitions of Explainability
Considering the definition of “explainability” as “the potential of information to be used for explaining”, we envisage that a proper understanding of how to measure explainability must pass through a thorough definition of what constitutes an explanation and the act of explaining.
In 1948 Hempel and Oppenheim published their “Studies in the Logic of Explanation” [hempel1948studies], giving birth to what is considered the first theory of explanation, the deductive-nomological model. Since then, many attempts have been made to amend, extend or replace this first model, which is considered fatally flawed [bromberger1966questions, salmon1984scientific]. This gave rise to several competing and more contemporary theories of explanation [mayes2005theories]: i) Causal Realism, ii) Constructive Empiricism, iii) Ordinary Language Philosophy, iv) Cognitive Science, v) Naturalism and Scientific Realism. A summary of these definitions is shown in Table 1.
Interestingly, each of these theories devises a different definition of “explanation”. If we look at their specific characteristics, we find that all but Causal Realism are pragmatic. On the other hand, Causal Realism and Constructive Empiricism are rooted in causality, while the others are not (they study the act of explaining as an iterative process involving broader forms of question answering). Nonetheless, Cognitive Science and Scientific Realism are more focused on the effects that an explanation has on the explainee (the recipient of the explanation).
Importantly, with the present letter, we assert that whenever explaining is considered to be a pragmatic act, explainability differs from explaining. In fact, pragmatism in this sense is achieved when the explanation is tailored to the specific user, so that the same explainable information can be presented and re-elaborated differently across users. It follows that for each philosophical tradition except Causal Realism, we have a definition of “explainable information” that slightly differs from that of “explanation”, as shown in Table 1.
5 Explainability Desiderata
In philosophy, the most important work about the central criteria of adequacy of explainable information is likely to be Carnap’s [leitgeb2021carnap]. Even though Carnap studies the concept of explication rather than that of explainable information, we assert that they share a common ground that makes his criteria fit both cases. In fact, explication in Carnap’s sense is the replacement of a somewhat unclear and inexact concept (the explicandum) by a new, clearer, and more exact concept called the explicatum, and that is exactly what information does when made explainable.
Carnap’s central criteria of explication adequacy are [leitgeb2021carnap]: similarity, exactness and fruitfulness (Carnap also discussed another desideratum, simplicity, but this criterion is presented as being subordinate to the others). Similarity means that the explicatum should be similar to the explicandum, in the sense that at least many of its intended uses, brought out in the clarification step, are preserved in the explicatum. Exactness means that the explication should, where possible, be embedded in some sufficiently clear and exact linguistic framework. Fruitfulness means that the explicatum should be used in a high number of other good explanations (the more, the better).
Carnap’s adequacy criteria seem to be transversal to all the identified definitions of explainability, possessing preliminary characteristics for any piece of information to be considered properly explainable. Therefore, our interpretation of Carnap’s criteria in terms of measurements is the following.
Similarity is about measuring how similar the given information is to the explanandum. This can be estimated by counting the number of relevant aspects covered by the information and the amount of detail it can provide.
Exactness is about measuring how clear the given information is, in terms of pertinence and syntax, regardless of its truth. Differently from Carnap, our understanding of exactness is broader than that of adherence to standards of formal concept formation [brun2016explication].
Fruitfulness is about measuring how much a given piece of information is going to be used in the generation of explanations. Consequently, each one of the explainability definitions may define fruitfulness differently.
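To make the counting heuristic for similarity more concrete, the following is a minimal sketch under our own illustrative assumptions; it is not one of the metrics surveyed in this paper. The names `AspectCoverage` and `similarity_score`, the aspect labels, and the choice of reporting coverage and detail as two separate numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AspectCoverage:
    """How a piece of information addresses one aspect of the explanandum."""
    aspect: str        # hypothetical aspect label, e.g. "why" or "how"
    detail_count: int  # number of distinct details offered for this aspect


def similarity_score(relevant_aspects, coverage):
    """Return (coverage_ratio, mean_details) for a piece of information.

    coverage_ratio: share of relevant aspects covered by at least one detail.
    mean_details:   average number of details per covered relevant aspect.
    Aspects that are not relevant to the explanandum are ignored.
    """
    relevant = set(relevant_aspects)
    if not relevant:
        raise ValueError("at least one relevant aspect is required")
    covered = {
        c.aspect: c.detail_count
        for c in coverage
        if c.aspect in relevant and c.detail_count > 0
    }
    ratio = len(covered) / len(relevant)
    mean_details = sum(covered.values()) / len(covered) if covered else 0.0
    return ratio, mean_details


# Illustrative input: four relevant aspects, two of which are covered.
aspects = ["why", "how", "what-if", "who"]
info = [
    AspectCoverage("why", 3),
    AspectCoverage("how", 1),
    AspectCoverage("when", 5),  # not a relevant aspect, so it is ignored
]
ratio, detail = similarity_score(aspects, info)
print(ratio, detail)  # prints: 0.5 2.0
```

Keeping the two quantities separate, rather than merging them into one score, is a design choice of this sketch: how to weigh coverage against detail would depend on the definition of explainability being adopted.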
Importantly, the property of truthfulness (being different from exactness) is not explicitly mentioned in Carnap’s desiderata. That is to say that explainability and truthfulness are complementary, but different, as discussed also by [hilton1996mental]. In fact, an explanation is such regardless of its truth (wrong but high-quality explanations exist, especially in science). Vice versa, highly correct information can be very poorly explainable.
6 Explainability Obligations in the Proposed AI Act
Following the EU Commission’s Proposal for an Artificial Intelligence Act (AIA), it is now time to discuss how explainability is connected to the novel obligations introduced by the Act. In fact, considering the nature and the characteristics of the requirements posed by the AIA, it is worth questioning how explainability metrics could be designed to fulfil the necessities of all the entities whose behaviour will be regulated by the AIA.
The discussion on “explainability and law” has departed from the contested existence of a right to explanation in the General Data Protection Regulation (GDPR) [wachter2017right, selbst2018meaningful] to embrace contract, tort, banking law [hacker2021varieties], and judicial proceedings [ebers2020regulating]. Differently from other domains, the AIA is specific to AI systems and requires an ad hoc discussion rather than the framing of these systems in the discussion of other legal domains. This is because AI technologies are not placed within an existing legal framework (e.g. banking), but the whole legal framework (i.e. the AIA) is built around AI technologies. However, the previous discussion focusing on other legal regimes constitutes a valuable background for our research and thus it contributes to our discussion. The interpretations proposed by recent commentators [ebers2020regulating, hacker2021varieties] identify several nuances of algorithmic transparency. Our focus, however, shall be confined to the interaction between the nuance of explainability and the obligations emerging from the AIA already identified by these early commentators.
As regards the GDPR, scholars have extensively discussed whether or not the right to receive an explanation for “solely automated decision-making” processes exists in the GDPR (regardless of the answer, the data controller has an obligation to provide “meaningful information about the logic involved” in the automated decision; see art. 13(2)(f), art. 14(2)(g), art. 15(1)(h)). Then, the discussion identified a “technical” necessity of explainability, which is needed to improve the accuracy of the model. In legal terms, it is echoed by the “protective” transparency that is needed to minimise risks and comply with certain legal regimes (tort law and contractual obligations). As with data protection law, these varieties are instrumental to improving a product and protecting its users, or the persons affected by the system, from damages. If explainability is often instrumental to achieving some legislative goals, it is likely that it could be meant to foster certain regulatory purposes also under the AIA. From the joint reading of a series of provisions, it will be argued that explainability in the AIA is both user-empowering and compliance-oriented: on the one hand, it serves to enable users of the AI system to use it correctly; on the other hand, it helps to verify adequacy to the many obligations set by the AIA.
Recital 47 and art. 13(1) state that high-risk AI systems shall be designed and developed in such a way that their operation is comprehensible by the users. Users should be able a) to interpret the system’s output and b) to use it in an appropriate manner. This is a form of user-empowering explainability. Then, the second part of art. 13(1) specifies that “an appropriate type and degree of transparency shall be ensured, with a view to achieving compliance (emphasis added) with the relevant obligations of the user and of the provider […]”. In our reading, this provision specifies that this explainability obligation (i.e. transparent design and development of high-risk AI systems) is compliance-oriented. The twofold goal of art. 13(1) is then echoed by other provisions. As regards the user-empowering interpretation, art. 14(4)(c) relates explainability to “human oversight” design obligations. These measures should enable the individual supervising the AI system to correctly interpret its output. Moreover, this interpretation shall put him or her in the position to decide whether it might be the case to “disregard, override or reverse the output”, art. 14(4)(d).
The compliance-oriented explainability interpretation becomes evident in the technical documentation to be provided according to Article 11. Compliance is based on a presumption of safety if the system is designed according to technical standards (Art. 40) to which adherence is documented, whereas third-party assessment appears only post-market or in specific sectors (see Chapter IV). The contents of the dossier are those detailed by Annex IV. Inter alia, Annex IV(2)(b) includes “the design specifications of the system, namely the general logic of the AI system and of the algorithms” among the information to be provided to show compliance with the AIA before placing the AI system on the market. Hence, the system should be explainable in a manner that allows an evaluation of conformity by the provider in the first instance and, when necessary, by post-market monitoring authorities. Since the general approach taken by the proposed AIA is a risk-reduction mechanism (Recital 5), this form of explainability is ultimately meant to contribute to minimising the level of potential harmfulness of the system.
User-empowering and compliance-oriented explainability overlap in art. 29(4). When a risk is likely to arise, the user shall suspend the use of the system and inform the provider or the distributor. This provision entails the capability of understanding the working of the system (in real time) and making predictions about its output. Suspension in the case of likely risk is where the two nuances of explainability overlap: the user is empowered to stop the AI system to avoid contradicting the rationale behind the AIA, i.e. risk-minimisation.
Having clarified the existence of explainability obligations and their extent, let us discuss the requirements that metrics should meet to ease compliance with the AIA. Recall that, under the proposal, adopting a standard means certifying the degree of explainability of a given AI system. Therefore, metrics become useful in the course of the standardisation process: i) ex ante, when defining the explainability measures adopted by the standard; ii) ex post, when verifying in practice the adoption of a standard.
From these premises it follows that, in the light of the purposes of the AIA, any explainability metric should be at minimum: i) Risk-focused, ii) Model-agnostic, iii) Goal-aware, and iv) Intelligible & accessible.
Risk-focused means that the metric should be functional to measuring the extent to which the explanations provided by the system allow for an assessment of the risks to the fundamental rights and freedoms of the persons affected by the system’s output. This is necessary to ensure both user-empowering (e.g. art. 29) and compliance-oriented (Annex IV) explainability. Model-agnostic means that the metric should be appropriate to all the AI systems regulated by the AIA (Annex I provides a list of the AI techniques and approaches that fall within the remit of the Regulation).
Goal-aware means that the metric should be flexible towards the different needs of the potential explainees (i.e. AI system providers and users, standardisation entities, etc.) and applicable in all the high-risk AI applications listed in Annex III. Since it might be hard to determine ex ante the nature, the purpose, and the expertise of the explainee, the metrics should consider the highest possible number of potential explainees. Intelligible & accessible means that information on the metric should be accessible and its results reproducible: if information on the metrics is not accessible (e.g. due to intellectual property reasons) or the results of a metric are not reproducible (e.g. due to a subjective evaluation), explainees will face a situation of uncertainty, an ignotum per ignotius. This would contradict the risk minimisation principle.
7 Discussing Existing Quantitative Measures of Explainability
In this section we identify some pros and cons of existing metrics (and measures) to quantitatively estimate the degree of explainability of information, with the aim of understanding their range of applicability across different needs and interpretations of explainability. We do it by performing a qualitative classification of these measures based on Carnap’s desiderata, the theories of explanation presented in Section 4 and the main principles identified in Section 6.
More precisely, in Table 2 we classified the metrics on the following dimensions: the format of information supported by the metric (e.g. rule-based, example-based, natural language text); the supporting theory of the metric (e.g. cognitive science, constructive empiricism); subjectivity (whether the metric requires evaluations given by human subjects); the covered criteria of adequacy. Then, in Table 3 we aligned the supporting theories (hence also the metrics) to the properties identified with the analysis of the AI Act carried out in Section 6.
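The four classification dimensions can be read as the fields of a simple record. The sketch below is only an illustration of that structure, under our own assumptions: the class and function names are hypothetical, and the two entries are placeholders, not rows taken from Table 2.

```python
from dataclasses import dataclass
from enum import Enum


class Criterion(Enum):
    """Carnap's criteria of adequacy, as used in Section 5."""
    SIMILARITY = "similarity"
    EXACTNESS = "exactness"
    FRUITFULNESS = "fruitfulness"


@dataclass(frozen=True)
class MetricRecord:
    """One classified metric, with the four dimensions described above."""
    name: str                # hypothetical identifier of the metric
    information_format: str  # e.g. "rule-based", "natural language text"
    supporting_theory: str   # e.g. "Causal Realism", "Cognitive Science"
    subjective: bool         # True if human subjects must give evaluations
    criteria: frozenset      # set of Criterion members covered by the metric


def covers_all_desiderata(record):
    """True if the metric proposes a heuristic for every criterion."""
    return record.criteria == frozenset(Criterion)


def group_by_theory(records):
    """Map each supporting theory to the names of the metrics based on it."""
    groups = {}
    for record in records:
        groups.setdefault(record.supporting_theory, []).append(record.name)
    return groups


# Placeholder entries for illustration only (not data from Table 2).
records = [
    MetricRecord("metric_a", "rule-based", "Causal Realism",
                 False, frozenset(Criterion)),
    MetricRecord("metric_b", "natural language text", "Cognitive Science",
                 True, frozenset({Criterion.EXACTNESS})),
]
print([r.name for r in records if covers_all_desiderata(r)])
print(group_by_theory(records))
```

A record of this kind makes it straightforward to ask, for instance, which metrics cover all three criteria or which theories rely on subjective evaluations.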
In doing so, we considered only a part of the dimensions adopted by [zhou2021evaluating]. More precisely, we kept clarity, broadness and completeness, aligning the first two to Carnap’s exactness and the latter to similarity. In fact, we deemed soundness to be akin to truthfulness, a characteristic complementary to explainability and not a characteristic of explainability, as discussed in Section 5. Broadness and parsimony, instead, were considered as characteristics to achieve pragmatic explanations rather than properties of explainability.
Furthermore, differently from ISO/IEC TR 24028:2020(E), we did not focus on metrics specific to ex-post feature attribution explanations, so we selected methods possibly applicable also to ex-ante or more generic types of explanations.
As shown in Table 2, we were able to find at least one example of a metric for each supporting philosophical theory, with a majority of metrics focused on Causal Realism and Cognitive Science. What is common to all the metrics based on Cognitive Science is that they require human subjects for performing the measurement; therefore they tend to be more expensive than the others, at least in terms of human effort. Furthermore, only two metrics propose heuristics to measure all of Carnap’s desiderata, one for Causal Realism [lakkaraju2017interpretable] and the other for Ordinary Language Philosophy [sovrano2021metric]. Interestingly, [lakkaraju2017interpretable] evaluates the three desiderata separately, while [sovrano2021metric] propose a single metric combining all of them.
Finally, the results shown in Table 3 indicate that the metrics supported by both Causal Realism and Constructive Empiricism might struggle at being model-agnostic and goal-aware, which probably limits their applicability to very specific contexts.
8 Final Remarks
With this work we proposed an interdisciplinary analysis of explainability metrics in Artificial Intelligence. More specifically, through the lens of the obligations enshrined in the proposed Act, we identified that explainability metrics should be risk-focused, model-agnostic, goal-aware, and intelligible & accessible. We found that these characteristics pose some constraints on the scope of explainability metrics, suggesting that different metrics may be complementary, serving different roles, depending on the context. In fact, as shown in Table 3, while the majority of supporting theories have the potential to result in risk-focused metrics, some of them might have important issues with goal-awareness, intelligibility and accessibility.
Nonetheless, our analysis of these metrics was qualitative and not quantitative. In fact, all of the considered metrics were tested by their authors on very specific applications and technologies, raising the issue of whether they can be similarly effective under different implementation scenarios. Hence, we envisage that a more quantitative analysis should be carried out, perhaps by defining a proper benchmark on which metrics can be thoroughly evaluated from a legal perspective.
Therefore, we believe that more academic contributions and new benchmarks for quantitative legal analysis are needed, to better understand the pros and cons of existing technologies, for any standardisation process to be finalised and effectively deployed in the EU panorama. For example, considering the current level of discussion and that our findings might be subject to change due to the institutional debate about the Proposal, further research is needed at least to consolidate the interpretation of the Act in the light of its future changes.