1 Introduction
Matching logic [35, 11] (hereafter shorthanded as ) is a logical framework used for specifying programming language semantics [21, 23, 29] and for reasoning about programs [19, 18, 39, 25, 6]. In , the operational semantics of a programming language is used both for the execution and for the verification of programs.
The best-effort implementation of is the K framework. K aims to be an ideal framework in which language designers only have to formally specify the syntax and the semantics of their language. The framework should then automatically generate, from the formal language specification, a set of tools without additional effort: a parser, an interpreter, a compiler, a model checker, a deductive verifier, etc. In the last few years, K has been actively used on an industrial scale for the verification of programs that run on the blockchain (smart contracts), e.g., [30, 12]. Due to its complexity, a big challenge that K currently faces concerns the trustworthiness of the framework: K now comprises half a million lines of code, and establishing its correctness is hard.
Recent research [9] tackles this trustworthiness issue by proposing an approach based on proof object generation. The key idea is to generate proof objects for the tasks performed by K and its autogenerated tools, and then to use an external, trustworthy proof checker to check the generated objects. This eliminates the need to trust the huge K implementation. So far, the authors of [9] have focused on formalising program executions as mathematical proofs and generating their corresponding proof objects. This required the development of a proof generator, which uses the improved proof system of [16], and of a proof checker implemented in Metamath [27]. The K definition of a language corresponds to a theory which consists of a set of symbols (representing the formal syntax of ) and a set of axioms (specifying the formal semantics of ). In [9], program executions are specified using formulas of the form:
where is the formula that specifies the initial state of the execution, is the formula that specifies the final state, and “” states the rewriting/reachability relation between states. The correctness of an execution,
is witnessed by a formal proof, which uses the proof system.
For a given execution, the K tool computes the proof parameters needed by the proof generator to generate the corresponding proof object. The set of parameters consists of the complete execution trace and the rewriting/matching information given by the rules applied together with the corresponding matching substitutions.
Generating proof objects for concrete program executions is an important step towards a trustworthy K framework. A more challenging task is to generate proof objects for symbolic executions. Symbolic execution is a key component of program verification, and it has been used in K as well (e.g., [3, 24, 19]). The difficulty with symbolic execution is that the parameters of an execution step must carry more proof information than in the concrete case. First, instead of matching information, the proof parameters must include unification information. Second, path conditions need to be carried along the execution.
In , there is a natural way to deal with symbolic execution. patterns have a normal form , where is a term (pattern) and is a constraint (predicate). In particular, can be the program state configuration and the path condition. Patterns are evaluated to the set of values that match and satisfy . To compute the symbolic successors of a pattern, say , with respect to a rule, say , we need to unify the patterns and . Because unification can be expressed as a conjunction in [4, 35], we can say that only the states matched by transition to states matched by . Expressing unification as a conjunction is an elegant feature of . However, in practice, unification algorithms are still needed to compute the symbolic successors of a pattern, because they provide a unifying substitution. The symbolic successors are obtained by applying the unifying substitution to the right-hand side of a rule (e.g., ) and adding the substitution (as a formula) to the path condition. Unification algorithms are also used to normalise conjunctions of the form , so that they consist of a single term and a constraint, . Therefore, the unification algorithms are parameters of the symbolic execution steps, and they must be used to generate the corresponding proof objects. In [4] we presented a solution that normalises conjunctions of term patterns using the syntactic unification algorithm [26]. The unification algorithm is instrumented to also generate a proof object for the equivalence between and .
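To make the role of the unifying substitution concrete, here is a minimal Python sketch of syntactic (first-order) unification in the style of [26]. The term representation (variables as strings, non-variable terms as a symbol paired with a tuple of arguments) and all names are ours, for illustration only; this is not the K implementation.

```python
def is_var(t):
    """A variable is represented as a bare string."""
    return isinstance(t, str)

def apply_subst(subst, t):
    """Apply a substitution (dict: variable -> term) to a term."""
    if is_var(t):
        return apply_subst(subst, subst[t]) if t in subst else t
    f, args = t
    return (f, tuple(apply_subst(subst, a) for a in args))

def occurs(x, t):
    """Occurs check: does variable x occur in term t?"""
    return x == t if is_var(t) else any(occurs(x, a) for a in t[1])

def unify(t1, t2, subst=None):
    """Return a most general unifier of t1 and t2, or None if none exists."""
    subst = {} if subst is None else subst
    t1, t2 = apply_subst(subst, t1), apply_subst(subst, t2)
    if is_var(t1):
        if t1 == t2:
            return subst
        if occurs(t1, t2):
            return None                      # occurs check failed: no unifier
        return {**subst, t1: t2}
    if is_var(t2):
        return unify(t2, t1, subst)
    if t1[0] != t2[0] or len(t1[1]) != len(t2[1]):
        return None                          # symbol clash: no unifier
    for a, b in zip(t1[1], t2[1]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst
```

For instance, unifying `cons(x, l)` with `cons(zero, nil)` yields the substitution mapping `x` to `zero` and `l` to `nil`, which applied to the rule's right-hand side produces the symbolic successor.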
It is often the case that more than one rule can be applied during symbolic execution. For instance, if an additional rule can be applied to , then the set of target states must match or . This set of states is matched by the disjunction . When holds, the disjunction reduces to , which is not a normal form, but it can be normalised using anti-unification.
Contributions
This paper continues the work in [9] and [4] towards a trustworthy semantics-based framework for programming languages. We use Plotkin's anti-unification algorithm to (1) normalise disjunctions of term patterns and (2) generate proof objects for the equivalence between the disjunctions and their corresponding normal forms. Each step performed by the anti-unification algorithm on an input disjunction produces a formula that is equivalent to the disjunction, and each such equivalence has a corresponding proof object. The generated proof objects (one for each equivalence) are assembled into a final proof object. We further provide (3) a prototype implementation of a proof generator and (4) a proof checker that we use to check the generated objects.
Related work.
We use an applicative version of , described in [13, 11]. This version of has been shown to capture many logical frameworks, such as many-sorted first-order logic (MSFOL), constructor and term algebras with the general induction and coinduction principles, MSFOL with least fixpoints, order-sorted algebras, etc. A proof system for that is more amenable to automation was proposed in [16]. The authors improve their previous work [14] by adding a context-driven fixpoint rule to the proof system, which can deal with goals that require wrapping and unwrapping contexts. This was inspired by work on the automation of inductive proofs (e.g., [22]) for Separation Logic [34].
K is a complex tool, and ensuring the correctness of the generated tools is hard to achieve in practice. supplies an appropriate underlying logical foundation for K [15]. However, if K could output proofs that can be checked by an external proof checker (based on the existing proof system of applicative ), then there would be no need to verify K itself. Recent work shows that it is possible to generate proof objects for various tasks that the K-generated tools perform (e.g., program execution, simplifications, unification). For instance, in [9] the authors tackle program executions: executions are formalised as mathematical proofs, and their complete proof objects are generated. These proof objects can then be checked using a proof checker implemented in Metamath [27].
Unification and anti-unification algorithms are used by the K tools, e.g., to handle the symbolic execution of programs written in languages formally defined in K [3, 24, 19]. The unification problem consists of finding the most general instance of two terms (see, e.g., [7]), which supplies the set of all their common instances. This set of common instances can be represented in as the conjunction of the two terms [35]. In [4], we showed that the syntactic unification algorithm [26] can be used to find a "normal" form for the conjunction of two terms, and that it can be instrumented to generate a proof object for the equivalence. These transformations improve the efficiency of the prover implementation in K.
The anti-unification problem is dual to unification: it consists of finding the most specific template (pattern) of two terms, and it was independently considered by Plotkin [32] and Reynolds [33]. Anti-unification is used, e.g., in [31] to generalise proofs, in [20] to type-check the pointcuts of an aspect, and in [40] to compute the least general type of two types. In this paper we use the anti-unification algorithm to normalise patterns of the form and to generate proof objects for the corresponding equivalences. The proof objects are generated using an approach based on "macro"-level inference rules inspired by [28]. This is why we could not use the Metamath proof checker presented in [9, 16], and we developed our own proof checker. The classical untyped generalisation algorithm has been extended to an order-sorted typed setting with sorts, subsorts, and subtype polymorphism [1, 8, 2]. We claim that the approach proposed in this paper can be smoothly extended to the order-sorted case.
Organisation
Section 2 briefly introduces and its proof system. Section 3 presents a specification that captures the term algebra up to isomorphism. Only the case of non-mutually recursive sorts is considered. Section 4 includes the first main contribution, the representation of anti-unification in . A soundness theorem for Plotkin's anti-unification algorithm is proved in terms. Section 5 describes the second main contribution: the algorithm that generates proof objects for the equivalences given by the anti-unification algorithm, and its implementation in Maude, together with a Maude proof checker that certifies the generated proofs. We conclude in Section 6.
2 Matching Logic
Matching logic () [35] started as a logic over a particular class of constrained terms [36, 38, 18, 37, 39, 6, 25], but it has since been developed into a full logical framework. We recall from [11, 13] the definitions and notions used in this paper.
Definition 1
A signature is a triple , where is a set of element variables , is a set of set variables , and is a set of constant symbols (or constants). The set of patterns is generated by the grammar below, where , , and :
The pattern is called definedness. For convenience, we introduce it directly in the syntax of patterns, but it can instead be axiomatised (as in [35]). A pattern is positive in if all free occurrences of in occur under an even number of negations. The pattern is an application and, by convention, application is left-associative. We specify an signature simply by when and are clear from the context.
Remark 1
The patterns below are derived constructs:
The pattern is called totality and is defined using the definedness pattern. The totality pattern is used to define other mathematical instruments, such as equality and membership. The priorities of the pattern constructs are given by the following ordered list: , where has the highest priority and has the lowest. By convention, the scope of a binder extends as far to the right as possible, and parentheses can be used to restrict it.
Example 1
Let be an signature. Then , , , , , , , , are examples of patterns.
We write and to denote the pattern obtained by substituting for all free occurrences of and , respectively, in . To avoid variable capture, we assume that renaming happens implicitly.
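Implicit renaming can be made concrete with a small sketch of capture-avoiding substitution. The Python representation below (patterns as nested tuples with an `exists` binder, fresh names of the form `y_0`) is our own illustration, not part of the formal development:

```python
import itertools

_fresh = itertools.count()

def fresh(x):
    """Generate a name not used before (illustrative naming scheme)."""
    return f"{x}_{next(_fresh)}"

def free_vars(p):
    kind = p[0]
    if kind == "var":
        return {p[1]}
    if kind == "app":
        return free_vars(p[1]) | free_vars(p[2])
    if kind == "exists":
        return free_vars(p[2]) - {p[1]}
    return set()                              # constants have no free variables

def subst(p, x, q):
    """Substitute q for the free occurrences of x in p, renaming bound
    variables when they would capture a free variable of q."""
    kind = p[0]
    if kind == "var":
        return q if p[1] == x else p
    if kind == "app":
        return ("app", subst(p[1], x, q), subst(p[2], x, q))
    if kind == "exists":
        y, body = p[1], p[2]
        if y == x:
            return p                          # x is bound here: nothing to substitute
        if y in free_vars(q):
            z = fresh(y)                      # rename the binder to avoid capture
            body = subst(body, y, ("var", z))
            y = z
        return ("exists", y, subst(body, x, q))
    return p
```

For example, substituting the variable y for x in a pattern that binds y forces the binder to be renamed first, so the substituted occurrence stays free.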
Patterns evaluate to the sets of elements that match them.
Definition 2
Given a signature , a model is a triple containing:


a nonempty carrier set ;

a binary function called application;

an interpretation for every constant as a subset of .
The application is extended to sets, , as follows: , for all .
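As an illustration (ours, with a toy carrier of natural numbers), the pointwise extension of application to sets can be sketched as:

```python
def app_ext(app, A, B):
    """Pointwise extension of an application a.b (itself returning a subset
    of the carrier) to sets A and B: the union of a.b over all pairs."""
    out = set()
    for a in A:
        for b in B:
            out |= app(a, b)
    return out

# Toy application over naturals: a.b is the singleton {a + b}.
plus = lambda a, b: {a + b}
```

With this toy application, `app_ext(plus, {1, 2}, {10})` is `{11, 12}`; if either argument set is empty, the result is empty, matching the union-based definition.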
Definition 3
Given a model , an valuation is a function with for all , and for all . The extension of is defined as:
, for all ;
, for all ;
, for all ;
;
;
;
;
;
, with , , and is the unique least fixpoint given by the Knaster–Tarski fixpoint theorem [41].
Note that the definedness pattern has a two-valued semantics: it can be either or . The patterns that evaluate to or are called predicates. Totality, equality, and membership fall into this category.
Remark 2
The evaluation is extended to the derived constructs as expected. For instance, we have and .
Example 2
Let be the signature in Example 1. A possible model is , where , if and otherwise, , if and , and otherwise. is the set of natural numbers, and is the set of lists of naturals, written as or with .
Since for any M-valuation , , we obtain that the pattern matches the set . Following the same idea, it is easy to see that matches the singleton set , and matches . Moreover, the pattern matches the set .
Definition 4
We say that is valid in , and write , iff , for all . If is a set of patterns, then: iff , for all , and iff implies . An specification is a pair with a signature and a set of patterns.
2.0.1 The Matching Logic Proof System
The proof system we use is shown in Figure LABEL:fig:proofsystem. The first part is the Hilbert-style proof system given in [11]. It contains four categories of rules: propositional tautologies, frame reasoning over application contexts, standard fixpoint reasoning, and two technical rules needed for the completeness result. In , an application context is a pattern with a distinguished placeholder variable such that the path from the root of to contains only applications. is a shorthand for , and denotes the set of free variables of .
Hilbert-style proof system
Propositional: , if is a propositional tautology over patterns
Modus Ponens: from φ_1 and φ_1 → φ_2, derive φ_2
Quantifier:
Generalisation: , if
Propagation:
Propagation:
Propagation: , if
Framing:
Set Variable Substitution:
Pre-Fixpoint:
Knaster–Tarski:
Existence:
Singleton:
where are fresh variables. If we want to compute the lgg of and , we build the initial antiunification problem with and we apply Plotkin’s rule repeatedly.
When this rule can no longer be applied, we say that the obtained anti-unification problem is in solved form. The obtained is the lgg of and , while defines the two substitutions and such that and . Note that the pairs are not commutative.
It was proved in [32] that the above anti-unification algorithm terminates and computes the lgg. In fact, for any input, the algorithm computes generalisations of and , and it stops when it finds the lgg.
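The algorithm can be sketched as follows. This recursive Python rendering is our own (terms as symbol/argument tuples, generated variables `z0`, `z1`, ...); it returns the lgg together with the two substitutions witnessing the generalisation:

```python
import itertools

def anti_unify(t1, t2, counter=None, store=None):
    """Plotkin-style anti-unification: return (lgg, sigma, rho) such that
    applying sigma (resp. rho) to the lgg yields t1 (resp. t2)."""
    counter = itertools.count() if counter is None else counter
    store = {} if store is None else store    # subterm pair -> shared variable
    if t1 == t2:
        return t1, {}, {}                     # identical subterms need no generalisation
    if (not isinstance(t1, str) and not isinstance(t2, str)
            and t1[0] == t2[0] and len(t1[1]) == len(t2[1])):
        # same head symbol: apply Plotkin's rule to the arguments
        parts = [anti_unify(a, b, counter, store) for a, b in zip(t1[1], t2[1])]
        lgg = (t1[0], tuple(g for g, _, _ in parts))
        sigma = {x: t for _, s, _ in parts for x, t in s.items()}
        rho = {x: t for _, _, r in parts for x, t in r.items()}
        return lgg, sigma, rho
    # clash: generalise with a variable, reused for identical pairs so the
    # computed generalisation stays as specific as possible
    if (t1, t2) not in store:
        store[(t1, t2)] = f"z{next(counter)}"
    z = store[(t1, t2)]
    return z, {z: t1}, {z: t2}
```

For instance, anti-unifying `f(a, g(b))` and `f(c, g(d))` yields the lgg `f(z0, g(z1))` with sigma mapping `z0` to `a` and `z1` to `b`, and rho mapping `z0` to `c` and `z1` to `d`; each generated variable occurs at most once in the lgg, as stated in Remark 3.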
Lemma 1 ()
Let and be two term patterns and such that . If and is in solved form, then is a generalisation of and , for all .

Remark 3
If is the input anti-unification problem and is fresh w.r.t. the variables of and (i.e., ), then Plotkin's algorithm generates only variables that are fresh w.r.t. the previously generated variables and . Each occurs at most once in the computed generalisation and at most once in every computed by the algorithm.

Example 3
Let us consider the term patterns and . Using Plotkin's algorithm on the input (note that is fresh w.r.t. ) we obtain: The lgg of and is , while the substitutions and satisfy and . The generated variables occur at most once in the computed lgg, and .

4.1 Anti-unification representation in
In , the lgg of and is given by their disjunction , that is, the pattern that matches the elements matched by or (cf. Remark 2). Disjunctions over term patterns are difficult to handle in practice, and thus an equivalent normal form that has only one maximal structural component is convenient. We show that this form can be obtained using Plotkin's anti-unification algorithm. Instead of , we use their lgg, say , to capture the largest common structure of and . The computed substitutions and , with and , are used to build a constraint such that is equivalent to .

Definition 6
Let be a substitution. We denote by the predicate .

Lemma 2 ()
For all term patterns and , and for all substitutions such that , , and for all , we have .

Example 4

Remark 4
By Lemma 2, the disjunction is equivalent to , where is the lgg of and . The pattern has a single structural component , but it appears twice! However, using a macro rule (i.e., Collapse from Figure 5) we obtain the equivalence . Moreover, since , we obtain (by transitivity) the equivalence .

Example 5
Recall the term patterns and from Example 3. Let be a generalisation of and with substitutions and . Using Lemma 2 and Collapse, . Let be another generalisation of and with and . Then .
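The construction behind Lemma 2 and Remark 4 can be illustrated with a small Python sketch (the string rendering and all names are ours): given an lgg and its two substitutions, it assembles the normal form consisting of the single structural component constrained by the disjunction of the two substitution predicates.

```python
def show(t):
    """Render a term: variables as-is, constants without parentheses."""
    if isinstance(t, str):
        return t
    f, args = t
    return f if not args else f"{f}({', '.join(show(a) for a in args)})"

def subst_predicate(sigma):
    """The predicate associated to a substitution (cf. Definition 6):
    the conjunction of the equalities x = sigma(x)."""
    eqs = [f"{x} = {show(t)}" for x, t in sorted(sigma.items())]
    return " /\\ ".join(eqs) if eqs else "true"

def normal_form(lgg, sigma, rho):
    """Render the normal form of a disjunction of two term patterns:
    lgg constrained by pred(sigma) \\/ pred(rho) (cf. Lemma 2 and Collapse)."""
    return (f"{show(lgg)} /\\ "
            f"(({subst_predicate(sigma)}) \\/ ({subst_predicate(rho)}))")
```

For example, with lgg `f(z0)` and substitutions mapping `z0` to `a` and to `b` respectively, the rendered normal form is `f(z0) /\ ((z0 = a) \/ (z0 = b))`: one structural component plus a constraint.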
At each step, Plotkin's algorithm computes generalisations of and , until it reaches the lgg. Anti-unification problems are encoded as patterns as follows:

Definition 7
For each anti-unification problem we define a corresponding pattern , where , , and .

Example 6
We show here the corresponding encodings for the anti-unification problems that occur in the execution of Plotkin's algorithm in Example 3:
A key observation here is that these encodings are all equivalent. Theorem 4.1 shows that our approach is sound:

Theorem 4.1 () (Soundness)
Let and be two term patterns and a variable such that . If , then .

5 Generating Proof Objects for Anti-Unification
In this section we describe how the anti-unification algorithm can be instrumented to generate proof objects. The main idea, which can be used for a larger class of term-algebra-based algorithms, is as follows:
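As an illustration of this instrumentation idea, the following Python sketch (ours; a simplified worklist rendering of Plotkin's rule that leaves pairs of equal variables unsolved) logs every intermediate anti-unification problem, so that each logged transition can later be paired with a proof object for the corresponding equivalence between encodings:

```python
import itertools

def replace_var(t, x, s):
    """Replace variable x by term s inside term t."""
    if isinstance(t, str):
        return s if t == x else t
    return (t[0], tuple(replace_var(a, x, s) for a in t[1]))

def plotkin_with_trace(t1, t2):
    """Run Plotkin's rule starting from the problem <z0, {z0 -> (t1, t2)}>,
    logging each intermediate problem state.  Each consecutive pair in the
    trace is an equivalence for which a proof object can be generated."""
    counter = itertools.count()
    lgg = f"z{next(counter)}"
    pairs = {lgg: (t1, t2)}                  # variable -> pair still to solve
    trace = [(lgg, dict(pairs))]
    while True:
        # pick a pair whose two terms share the head symbol, if any
        pick = next((z for z, (a, b) in pairs.items()
                     if not isinstance(a, str) and not isinstance(b, str)
                     and a[0] == b[0] and len(a[1]) == len(b[1])), None)
        if pick is None:
            return lgg, pairs, trace         # solved form reached
        a, b = pairs.pop(pick)
        fresh = [f"z{next(counter)}" for _ in a[1]]
        for z, (x, y) in zip(fresh, zip(a[1], b[1])):
            pairs[z] = (x, y)
        lgg = replace_var(lgg, pick, (a[0], tuple(fresh)))
        trace.append((lgg, dict(pairs)))
```

Running it on `f(a, b)` and `f(a, c)` produces a three-step trace ending in the solved form `f(a, z2)` with the single unsolved pair `(b, c)`; in our scheme, each of the two transitions would receive its own equivalence proof object, and the objects are then assembled by transitivity.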
5.1 The Proof System Used for Proof Generation
The structure of the proof certificates generated by our approach supplies proof evidence of the algorithm's execution. This is accomplished with the help of the additional macro rules shown in Figure 5. Proving these rules using the proof system in Figure LABEL:fig:proofsystem is out of the scope of this paper. The soundness of these rules is proved in Appendix 7 and Appendix 8.
