Explanation by Automated Reasoning Using the Isabelle Infrastructure Framework

12/29/2021
by Florian Kammüller, et al.
Middlesex University London

In this paper, we propose the use of interactive theorem proving for explainable machine learning. After presenting our proposition, we illustrate it on the dedicated application of explaining security attacks using the Isabelle Infrastructure framework and its process of dependability engineering. This formal framework and process provide the logics for specification and modeling. Attacks on the security of a system are explained by specification and proofs in the Isabelle Infrastructure framework. Existing case studies of dependability engineering in Isabelle are used as feasibility studies to illustrate how different aspects of explanation are covered by the Isabelle Infrastructure framework.


I Proposing Interactive Theorem Proving for Explainable Machine Learning

Machine Learning (ML) is everywhere in Computer Science now. One may almost say that all of Computer Science has become a part of ML and is viewed as a technique within the greater realm of Data Science or Data Engineering. But while this major trend, like many others, prevails, we should not forget that Artificial Intelligence (AI) was the original goal at the starting point of machine learning, and that Automated Reasoning was created to provide artificially intelligent systems with a mechanical way of imitating human reasoning by implementing logics and automating proof.

When we think of how to explain why a specific solution for a problem is a solution, the purest way to do so is to explain it by way of mathematically precise arguments, which is equivalent to providing a logically sound proof in a mathematical model of the solution domain or context. An ML algorithm would do the same, for example, by providing a decision tree to explain a solution, but usually the explanations generated by the ML model itself are very close to the ML implementation. So, they often fail to give a satisfactory, i.e. human-understandable, explanation.

This paper presents our point of view on a tangible way forward to combining interactive theorem proving with machine learning (ML). Different from the mainstream use of ML to improve automated verification, we propose an integration at a higher level, using logical modeling and automated reasoning for explainability of machine learning solutions. The main idea of our proposal is based on one major fact about logic and proof:

Reasoning is not only a very natural way of explanation but it is also the most complete possible one since it provides a mathematical proof on a formal model.

In the spirit of this thought, we provide a proof of concept on a framework that has been established for security and privacy analysis, the Isabelle Infrastructure framework. In this paper, we thus first introduce this framework by summarizing its basic concepts and various applications (Section II). After contrasting it with some other conceptual approaches to ML and theorem proving including explanation (Section III), we briefly sketch our conceptual proposal (Section IV).

II Isabelle Infrastructure Framework

The Isabelle Infrastructure framework is implemented as an instance of Higher Order Logic in the interactive generic theorem prover Isabelle/HOL [1]. The framework enables formalizing and proving of systems with physical and logical components, actors and policies. It has been designed for the analysis of insider threats. However, the implemented theory of temporal logic combined with Kripke structures and its generic notion of state transitions are a perfect match to be combined with attack trees into a process for formal security engineering [2] including an accompanying framework [3].

II-1 Kripke structures, CTL and Attack Trees

A number of case studies have contributed to shaping the Isabelle framework into a general framework for the state-based security analysis of infrastructures with policies and actors. Temporal logic and Kripke structures are deeply embedded into Isabelle’s Higher Order Logic, thereby enabling meta-theoretical proofs about the foundations: for example, the equivalence between attack trees and CTL statements has been established [4], providing sound foundations for applications. This foundation provides a generic notion of state transition on which attack trees and temporal logic can be used to express properties for applications. The logical concepts and related notions thus provided for sound application modeling are:

  • Kripke structures and state transitions:
    A generic state transition relation is \(\to_{i}\); Kripke structures over a set of states t reachable by \(\to_{i}^{*}\) from an initial state set I can be constructed by the Kripke constructor as

    Kripke {t. \(\exists\) i \(\in\) I. i \(\to_{i}^{*}\) t} I
    

  • CTL statements:
    We can use the Computation Tree Logic (CTL) to specify dependability properties as

    K \(\vdash\)  EF s
    
    This formula states that in Kripke structure K there is a path (E) on which the property s (given as the set of states in which the property is true) will eventually (F) hold.

  • Attack trees:
    Attack trees are defined as a recursive datatype in Isabelle having three constructors: \(\oplus_{\vee}\) creates or-trees and \(\oplus_{\wedge}\) creates and-trees. And-attack trees and or-attack trees consist of a list of sub-attacks which are themselves recursively given as attack trees. The third constructor takes as input a pair of state sets, constructing a base attack step between two state sets. For example, for the sets I and s this is written as \({\cal{N}}_{\texttt{(I,s)}}\). As a further example, a two-step and-attack leading from state set I via si to s is expressed as

    \(\vdash\) [\({\cal{N}}_{\texttt{(I,si)}}\),\({\cal{N}}_{\texttt{(si,s)}}\)]\(\oplus_{\wedge}^{\texttt{(I,s)}}\)
    

  • Attack tree refinement, validity and adequacy:
    Attack trees can also be constructed by a refinement process, but this differs from the system refinement presented in the paper [5]. An abstract attack tree may be refined by spelling out the attack steps until a valid attack is reached:

    \(\vdash\) (A :: (\(\sigma\) :: state) attree).

    The validity is defined constructively so that code can be generated from it. Adequacy with respect to a formal semantics in CTL is proved and can be used to facilitate actual application verification (see the sketch following this list). This is used for the stepwise system refinements central to the methodology called the Refinement-Risk cycle developed for the Isabelle Infrastructure framework [5].
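
To make the interplay of these notions concrete, the following sketch paraphrases the correctness direction of the adequacy result relating attack trees to CTL reachability; the exact Isabelle theorem in [4] may be stated slightly differently:

    \(\vdash\) A \(\Longrightarrow\) attack A = (I,s) \(\Longrightarrow\) Kripke {t. \(\exists\) i \(\in\) I. i \(\to_{i}^{*}\) t} I \(\vdash\) EF s

In words: any valid attack tree A whose overall attack is the pair (I,s) guarantees that the undesirable state set s is reachable (EF s) in the Kripke structure generated from the initial state set I.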

A whole range of publications have documented the development of the Isabelle Insider framework. The publications [6, 7, 8] first define the fundamental notions of insiderness, policies, and behaviour, showing how these concepts are able to express the classical insider threat patterns identified in the seminal CERT guide on insider threats [9]. This Isabelle Insider framework has been applied to auction protocols [10, 11], illustrating that the Insider framework can embed the inductive approach to protocol verification [12]. An airplane case study [13, 14] revealed the need for dynamic state verification, leading to the extension of adding a mutable state. Meanwhile, the embedding of Kripke structures and CTL into Isabelle has enabled the emulation of model checking and provided a semantics for attack trees [15, 16, 17, 4, 3]. Attack trees have provided the leverage to integrate Isabelle formal reasoning for IoT systems, as has been illustrated in the CHIST-ERA project SUCCESS [2], where attack trees have been used in combination with the Behaviour Interaction Priority (BIP) component architecture model to develop security and privacy enhanced IoT solutions. This development has emphasized the technical rather than the psychological side of the framework development and thus branched off the development of the Isabelle Insider framework into the Isabelle Infrastructure framework. Since the strong expressiveness of Isabelle allows formalizing the IoT scenarios as well as actors and policies, the latter framework can also be applied to evaluate IoT scenarios with respect to policies like the European data privacy regulation GDPR [18]. The application to security protocols, first pioneered in the auction protocol application [10, 11], has further motivated the analysis of Quantum Cryptography, which in turn necessitated the extension by probabilities [19, 20, 21].

Requirements raised by these various security and privacy case studies have shown the need for a cyclic engineering process for developing specifications and refining them towards implementations. A first case study takes the IoT healthcare application and exemplifies a step-by-step refinement interspersed with attack analysis using attack trees to increase privacy, ultimately introducing a blockchain for access control [3]. First ideas to support a dedicated security refinement process are available in a preliminary arXiv paper [22], but only the follow-up publication [23] provides the first full formalization of the RR-cycle and illustrates its application completely on the Corona-virus Warn App (CWA). The earlier workshop publication [24] provided the formalisation of the CWA illustrating the first two steps, but it did not introduce the fully formalised RR-cycle, nor did it apply it to arrive at a solution satisfying the global privacy policy [5].

III Machine Learning, Explanation and Theorem Proving

If theorem proving could automatically be solved by machine learning, we would solve the P=NP problem [25]. Nevertheless, ML has been successfully employed within theorem provers to enhance the decision processes. Also in Isabelle, the Sledgehammer tool uses ML, mainly to select lemmas.

A very relevant work by Viganò and Magazzeni [26] focuses the idea of explainability on security, coining the notion of XSec, or Explainable Security. The authors propose a full new research programme for the notion of explainability in security in which they identify the “Six Ws” of XSec: Who? What? Where? When? Why? And hoW? They position their paper clearly in the context of some earlier works along the same lines, e.g. [27, 28], but go beyond them by extending the scope and presenting a very concise yet complete description of the challenges. As opposed to XAI in general, the paper shows that understanding explanations even for the focus area of security alone (as opposed to all application domains of IT) is already quite a task. They also point out that XAI is merely concerned with explaining the technical solution provided by ML, whereas XSec looks at various other levels, most prominently the human user, by addressing domains like usable security, security awareness, and security economics [26, p. 294].

Our point of view is quite similar to Viganò and Magazzeni’s, but we emphasize the technical side of explanation using interactive theorem proving and the Isabelle Infrastructure framework, while they focus on differentiating the notion of explanation along different aspects, for example, stakeholders, system view, and abstraction levels. However, the notion of refinement defined for the process of dependability engineering for the Isabelle Infrastructure framework [5] allows addressing most of the Six Ws, because our model includes actors and policies and allows differentiating between insider and outsider attacks and expressing awareness [23]. Thus, we could strictly follow the Ws when explaining our proposition, but we believe it is better to contemplate the Ws simply in the context of classical Software Engineering, which has similar Ws. Moreover, the Refinement-Risk cycle of dependability engineering can be seen as a specification refinement framework that employs the classical AI technique of automated reasoning. Surely, the human aspect versus the system aspect of the Six Ws of XSec brings in various different viewpoints, but these are inherent if the contexts needed for the interpretation are present in the model. Otherwise, they simply have to be added to it, for example, by using refinement to integrate these aspects of reality into the model. Then the Isabelle Infrastructure framework allows explanation for various purposes, audiences, technical levels (HW/SW), policies, localities and other physical aspects. Thus, we can answer all Six Ws and argue that this is what human-centric software, security, and dependability engineering are all about.

Moreover, in contrast to the approach by Viganò and Magazzeni, we follow the classical engineering approach of fault-tree analysis, more concretely using attack trees, and propose a dual process of attack versus security protection goal analysis which in itself offers a direct input to ML, for example to produce features that could be used for decision trees as well as metrics that could provide feedback for optimization techniques as used in reinforcement learning.

IV Explaining (not only) Security by the Isabelle Infrastructure framework

This section describes the core ideas of explanation provided by applying the Isabelle Infrastructure framework.

IV-A State transition systems and attack trees as a dual way of explanation

One important aspect of explanation that is not restricted to security at all is to provide a step-by-step trace of state transitions to explain how a specific state may appear. This can explain where a problem lies, for example, how an ML algorithm arrived at a decision for a medical diagnosis, by lining up the sequence of steps that lead to it.

In the Isabelle Infrastructure framework the notion of state transition systems is provided as a generic theory based on Kripke structures to represent state graphs over arbitrary types of states, using the branching time logic CTL to express temporal logical formulas over them. The correspondence between the CTL formulas of reachability and attack trees and the proof of adequacy are suitable to allow for a dual step-by-step analysis of a system, dovetailing the fault analysis with a specification refinement. This dovetailing leads to an elaborate process not only of explaining faults of system designs and how they can be reached practically by a series of actions but also of explaining additional features of a system that are motivated by the detected fault. For example, when it comes to human awareness and usable security, an explanation of a necessary security measure that is imposed on a user can be readily illustrated by an attack graph or its equivalent attack path, which can be readily produced via the adequacy theorem.
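
For illustration (a sketch reusing the two-step and-attack from Section II, written here at the level of the state sets I, si and s): the valid attack tree and its adequacy-derived CTL statement unfold into an explanatory path that an analyst or an end user can follow step by step,

    I \(\to_{i}\) si \(\to_{i}\) s

which explains how the undesirable states s are reached from the initial states I, while the attack tree \(\vdash\) [\({\cal{N}}_{\texttt{(I,si)}}\),\({\cal{N}}_{\texttt{(si,s)}}\)]\(\oplus_{\wedge}^{\texttt{(I,s)}}\) and the CTL statement K \(\vdash\) EF s record the same fact at the level of the formal model.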

IV-B Human and Locality Aspects

The framework was initially designed with a narrow focus on modeling and analyzing insider threats, as the Isabelle Insider framework, before it was extended into what is now known as the Isabelle Infrastructure framework. Due to this initial motivation the framework explicitly supports the notion of human actors within networks of physical and virtual locations. These aspects are important to model various different stakeholders, enabling explanations for different audiences having different viewpoints and needing different levels of detail and complexity in their explanations. For example, the explanation of a security threat will have a substantially different form if produced for a security analyst or for a system end user. Due to the explicit representation of human actors as well as their locations and other variable features, the Isabelle Infrastructure framework supports fine-grained control over the definition of applications, thus enabling very flexible support of explanation about human aspects, suited to human understanding.

The human aspect also necessitates consideration of the human condition, in particular psychological characterizations. The Isabelle Infrastructure framework, by augmenting the Isabelle Insider framework, provides for such characterization. For example, when considering insiderness, the state of the insider is characterized by a predicate that allows using this state within a logical analysis of security and privacy threats to a system. Although these characterizations are axiomatic, in the sense that the definition of the insider predicate is based on empirical results that have been externally input into the specification, it is in principle feasible to enrich the cognitive model of the human in the Isabelle Insider framework. A first step towards that has been taken by experimenting with an extension of a notion of human awareness to additionally support the analysis of unintentional insiders, i.e. human unawareness of privacy risks in social media [23].
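
As an illustration, the insider predicate of the Isabelle Insider framework has roughly the following shape (paraphrased from the Insider framework publications [6, 8]; the identifiers tipping_point, astate and UasI are used there, but the exact formalization may differ): an actor a whose psychological state has passed the tipping point can act as, i.e. ‘be used as’, any identity of a set C of compromised identities,

    Insider a C \(\equiv\) tipping_point (astate a) \(\longrightarrow\) (\(\forall\) b \(\in\) C. UasI a b)

which makes insiderness available as an ordinary predicate within security and privacy proofs.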

IV-C Dependability Engineering: Specifying Protection Goals and Quantifying Attackers

The process of dependability engineering, the Refinement-Risk (RR) cycle, conceived for the Isabelle Infrastructure framework [22], allows a human-centric system specification to be refined step by step, following an iteration of finding faults within a system specification and refining this specification by more sophisticated data types, additional rules, or changes to the semantics of system functions. The data type refinement allows integrating, for example, more restrictive measures to control data, such as blockchains to enhance data consistency or data labeling for access control. This refinement is triggered by previously found flaws in the system and thus provides concrete motivations for such design decisions, leading to constructive explanations. Similarly, additional constraints on rules that are introduced in a refinement step of the RR-cycle are motivated by previously found attacks; for example, the necessity to change the ephemeral id of every user instantaneously when they move to a new location in the Corona-Warn-App is motivated by an identification attack [24, 5].
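
As a hedged sketch of what such a refinement step amounts to (this is a common trace-refinement shape; the precise definition of the RR-cycle refinement relation is given in [5], and the data map E below is illustrative): a concrete Kripke structure K' refines an abstract one K under a map E from concrete to abstract states if every behaviour of K', translated through E, is a behaviour of K,

    K \(\sqsubseteq_{E}\) K' \(\equiv\) \(\forall\) t' \(\in\) behaviours K'. map E t' \(\in\) behaviours K

so that properties established, and attacks excluded, at the abstract level carry over to the refined specification.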

Since the RR-cycle is based on the idea of refinement, another requirement for flexible explanation comes for free: if we want to explain to different audiences or at different technical levels, we equally need to refine (or abstract) definitions of data types, rules for policies, or descriptions of algorithms. The Isabelle Infrastructure framework directly supports these expressions at different abstraction levels and from different viewpoints.

IV-D Quantification

An important aspect of explanation is quantification. Very often an explanation will not be possible in a purely possibilistic way. A quantification could be given by adding probabilities as well as other quantitative data, like costs, to explanations. For example, for a security attack, the cost that an attacker is estimated to invest maximally in a specific attack step is an inevitable ingredient of a realistic attacker model. Similarly, the likelihood of success of a certain attack step could be needed for an analysis. Attack trees support these types of quantification. Naturally, the Isabelle Infrastructure framework also supports them. The application to the security analysis of Quantum Cryptography, i.e., the modeling and analysis of the Quantum Key Distribution (QKD) protocol, led to the extension to probabilistic state transition systems [19, 20, 21].
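
As a simple illustration of such quantification (standard attack tree arithmetic, not necessarily the exact semantics implemented in the framework): for the two-step and-attack of Section II with estimated step costs c1 and c2 and independent success probabilities p1 and p2, the attacker's overall cost and success probability are

    cost = c1 + c2        probability = p1 \(\cdot\) p2

whereas for an or-attack a rational attacker picks the cheapest branch, so the estimated cost is the minimum over the alternatives.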

Quantification can also be a useful explanation for the process of learning, for example by quantifying a distance to an attack goal. In that sense, quantified explanation can be a useful feedback for machine learning itself.

V Conclusions

In this paper we have proposed the use of Automated Reasoning, in the particular instance of the Isabelle Infrastructure framework, for explanation. We summarized the work that led to the creation of the Isabelle Infrastructure framework, highlighting the existing applications and extensions. After studying some related work on explanation, we provided a range of conceptual points arguing why and how the Isabelle Infrastructure framework supports explanation.

References

  • [1] T. Nipkow, L. C. Paulson, and M. Wenzel, Isabelle/HOL – A Proof Assistant for Higher-Order Logic, ser. LNCS.   Springer-Verlag, 2002, vol. 2283.
  • [2] CHIST-ERA, “Success: Secure accessibility for the internet of things,” 2016, http://www.chistera.eu/projects/success.
  • [3] F. Kammüller, “Combining secure system design with risk assessment for iot healthcare systems,” in Workshop on Security, Privacy, and Trust in the IoT, SPTIoT’19, colocated with IEEE PerCom.   IEEE, 2019.
  • [4] ——, “Attack trees in isabelle,” in 20th International Conference on Information and Communications Security, ICICS2018, ser. LNCS, vol. 11149.   Springer, 2018.
  • [5] ——, “Dependability engineering in isabelle,” 2021, arXiv preprint, http://arxiv.org/abs/2112.04374.
  • [6] F. Kammüller and C. W. Probst, “Invalidating policies using structural information,” in IEEE Security and Privacy Workshops, Workshop on Research in Insider Threats, WRIT’13, 2013.
  • [7] ——, “Combining generated data models with formal invalidation for insider threat analysis,” in IEEE Security and Privacy Workshops, Workshop on Research in Insider Threats, WRIT’14, 2014.
  • [8] ——, “Modeling and verification of insider threats using logical analysis,” IEEE Systems Journal, Special issue on Insider Threats to Information Security, Digital Espionage, and Counter Intelligence, vol. 11, no. 2, pp. 534–545, 2017. [Online]. Available: http://dx.doi.org/10.1109/JSYST.2015.2453215
  • [9] D. M. Cappelli, A. P. Moore, and R. F. Trzeciak, The CERT Guide to Insider Threats: How to Prevent, Detect, and Respond to Information Technology Crimes (Theft, Sabotage, Fraud), 1st ed., ser. SEI Series in Software Engineering.   Addison-Wesley Professional, Feb. 2012. [Online]. Available: http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0321812573
  • [10] F. Kammüller, M. Kerber, and C. Probst, “Towards formal analysis of insider threats for auctions,” in 8th ACM CCS International Workshop on Managing Insider Security Threats, MIST’16.   ACM, 2016.
  • [11] ——, “Insider threats for auctions: Formal modeling, proof, and certified code,” Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA), vol. 8, no. 1, 2017. [Online]. Available: http://doi.org/10.22667/JOWUA.2017.03.31.044
  • [12] L. C. Paulson, “The inductive approach to verifying cryptographic protocols,” Journal of Computer Security, vol. 6, no. 1-2, pp. 85–128, 1998.
  • [13] F. Kammüller and M. Kerber, “Investigating airplane safety and security against insider threats using logical modeling,” in IEEE Security and Privacy Workshops, Workshop on Research in Insider Threats, WRIT’16.   IEEE, 2016.
  • [14] ——, “Applying the isabelle insider framework to airplane security,” Science of Computer Programming, vol. 206, 2021. [Online]. Available: https://doi.org/10.1016/j.scico.2021.102623
  • [15] F. Kammüller, “A proof calculus for attack trees,” in Data Privacy Management, DPM’17, 12th Int. Workshop, ser. LNCS, vol. 10436.   Springer, 2017, co-located with ESORICS’17.
  • [16] ——, “Human centric security and privacy for the iot using formal techniques,” in 3d International Conference on Human Factors in Cybersecurity, ser. Advances in Intelligent Systems and Computing, vol. 593.   Springer, 2017, pp. 106–116, affiliated with AHFE’2017.
  • [17] ——, “Formal models of human factors for security and privacy,” in 5th International Conference on Human Aspects of Security, Privacy and Trust, HCII-HAS 2017, ser. LNCS, vol. 10292.   Springer, 2017, pp. 339–352, affiliated with HCII 2017.
  • [18] ——, “Formal modeling and analysis of data protection for gdpr compliance of iot healthcare systems,” in IEEE Systems, Man and Cybernetics, SMC2018.   IEEE, 2018.
  • [19] ——, “Qkd in isabelle – bayesian calculation,” arXiv, vol. cs.CR, 2019. [Online]. Available: https://arxiv.org/abs/1905.00325
  • [20] ——, “Attack trees in isabelle extended with probabilities for quantum cryptography,” Computers & Security, vol. 87, 2019. [Online]. Available: https://doi.org/10.1016/j.cose.2019.101572
  • [21] ——, “Formalizing probabilistic quantum security protocols in the isabelle infrastructure framework,” informal Presentation at Computability in Europe, CiE 2019. [Online]. Available: https://www.aemea.org/CIE2019/CIE_2019_Abstracts.pdf#page=35
  • [22] F. Kammüller, “A formal development cycle for security engineering in isabelle,” 2020, arXiv preprint, http://arxiv.org/abs/2001.08983.
  • [23] F. Kammüller and C. M. Alvarado, “Exploring rationality of self awareness in social networking for logical modeling of unintentional insiders,” 2021, arXiv preprint, http://arxiv.org/abs/2111.15425.
  • [24] F. Kammüller and B. Lutz, “Modeling and analyzing the corona-virus warning app with the isabelle infrastructure framework,” in 20th International Workshop of Data Privacy Management, DPM’20, ser. LNCS, vol. 12484.   Springer, 2020, co-located with ESORICS’20.
  • [25] D. Windridge and F. Kammüller, “Edit distance kernelization of np theorem proving for polynomial-time machine learning of proof heuristics,” in Future of Information and Communications Conference, FICC 2019, ser. Advances in Information and Communication, Lecture Notes in Networks and Systems, vol. 70.   Springer, 2019.
  • [26] L. Viganò and D. Magazzeni, “Explainable security,” in IEEE European Symposium on Security and Privacy Workshops, EuroS&PW.   IEEE, 2020.
  • [27] G. Bender, L. Kot, and J. Gehrke, “Explainable security for relational databases,” in International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22-27, 2014, C. E. Dyreson, F. Li, and M. T. Özsu, Eds.   ACM, 2014, pp. 1411–1422. [Online]. Available: https://doi.org/10.1145/2588555.2593663
  • [28] W. Pieters, “Explanation and trust: What to tell the user in security and ai?” Ethics and Information Technology, vol. 13, no. 1, pp. 53–64, 2011.