We are interested in algebras of languages, equipped with the constants empty language and unit language (the language containing only the empty word), the binary operations of union, intersection, and concatenation, and the unary operations of Kleene star and mirror image. It is convenient in this paper to see the Kleene star as an operator derived from the non-zero iteration. We call these algebras reversible Kleene lattices. Given a finite set of variables , and two terms built from variables and the above operations, we say that the equation is valid if the corresponding equality holds universally.
In a previous paper we presented an algorithm to test the validity of such equations, and showed this problem to be ExpSpace-complete. However, we left open the question of the axiomatisation of these algebras. We address it here by providing a set of axioms from which every valid equation can be derived.
Several fragments of this algebra have been studied:
- Kleene algebra (KA):
- Kleene algebra with converse:
if we add to KA the mirror operation, then the previous theorem can be extended by switching to a duplicated alphabet, with a letter denoting the mirror of the letter . A small number of identities may be added to KA to get a complete axiomatisation .
- Identity-free Kleene lattices:
this algebra stems from the operators , , , and . In a recent paper  Doumane and Pous provided a complete axiomatisation of this algebra.
The present work thus extends identity-free Kleene lattices with unit and mirror image. We provide in Table 1 a set of axioms which we prove to be complete for the equational theory of language algebra, by reducing to the completeness theorem of . This proof has been formalised in Coq.
The paper is organised as follows. In Section 2, we introduce some notations and define the various types of expressions used in the paper. We present our axioms and state our main theorem. In Section 3 we deal with a technical lemma about the treatment of the empty word. We proceed in Section 4 to extend the theorem of  with the mirror image operator. Section 5 studies in detail the terms of the algebra that are below the constant , as those play a crucial role in the main proof. We present the proof of our main result in Section 6. We conclude in Section 7 with a discussion of an operator that is missing from our signature, namely a constant denoting the full language.
2.1 Sets, words, and languages
Given a set , we write for the set of subsets of and for the set of finite subsets of . We will denote the two-element Boolean set . For two sets , we write for their Cartesian product, for their union, and for their intersection. The empty set is denoted by . We will use the notation for a set and a function to represent the set .
Let be an arbitrary alphabet (set). The words over are finite sequences of elements from . The set of all words is written , and the empty word is written . The concatenation of two words is simply denoted by . The mirror image of a word , obtained by reading it backwards, is written . For instance is the word .
A language is a set of words, that is an element of . We will also use the symbol to denote the unit language . The concatenation of two languages and , denoted by , is obtained by lifting pairwise the concatenation of words: it contains exactly those words that can be obtained as a concatenation where . Similarly the mirror image of a language , denoted by , is the set of mirror images of words from . We write when and for the iterated concatenation, defined by induction on by and . The language is the union of all non-zero iterations of , i.e. .
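The language operations defined above can be tried out on finite languages. The following Python sketch is only an illustration: words are represented as strings, languages as finite sets of strings, and the names `concat`, `mirror`, `power` and `plus_up_to` are hypothetical helpers, not notation from the paper. It assumes the usual convention that the zeroth power is the unit language, and it truncates the non-zero iteration at a given word length, since that language is infinite in general.

```python
from itertools import product


def concat(left, right):
    """Concatenation of two languages: all words u + v with u in left, v in right."""
    return {u + v for u, v in product(left, right)}


def mirror(language):
    """Mirror image of a language: every word read backwards."""
    return {word[::-1] for word in language}


def power(language, n):
    """Iterated concatenation, assuming L^0 = {""} and L^(n+1) = L . L^n."""
    result = {""}
    for _ in range(n):
        result = concat(language, result)
    return result


def plus_up_to(language, max_len):
    """Words of the non-zero iteration L^+ whose length is at most max_len."""
    found = set()
    frontier = {w for w in language if len(w) <= max_len}
    while frontier:
        found |= frontier
        # Extend the newly found words by one more factor, dropping long words.
        frontier = {
            u + v for u in frontier for v in language if len(u + v) <= max_len
        } - found
    return found


L = {"a", "bc"}
M = {"", "d"}
# Mirror image reverses the order of a concatenation.
print(mirror(concat(L, M)) == concat(mirror(M), mirror(L)))
```

Truncating by length is sound here because every prefix of a word of length at most `max_len` also has length at most `max_len`, so no short word of the iteration is lost.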
2.2 Terms: syntax and semantics
Throughout this paper, we will consider expressions over various signatures which we list here. We fix a set of variables , and let range over .
- One-free expressions:
- Simple expressions:
We will use various sets of axioms, depending on the signature. All of the axioms under consideration are listed in Table 1. Axioms in Tables 0(b) and 0(a) are borrowed from , those in Table 0(c) from  and Table 0(d) is inspired from . We use these axioms to generate equivalence relations over terms. For a type of expressions , the axiomatic equivalence relation, written , is the smallest congruence on containing those axioms in Table 1 that only use symbols from the signature of . This means that for we use the axioms from Tables 0(b) and 0(a), for we add those from Table 0(c) and for we keep all of the axioms of Table 1. We will use the shorthand to mean . This ensures that is a partial order with respect to . We list in Table 2 some statements that are provable from the axioms. Interestingly, the idempotency of (equation (1a.1)), which is usually an axiom, is here derivable from (0a.5) and (0a.8).
Given an expression , a set , and a map , we may interpret as a language over using the following inductive definition:
The semantic equivalence and semantic containment relations on , respectively written and , are defined as follows:
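As an illustration of the interpretation of expressions as languages, here is a minimal Python sketch of an evaluator. Everything in it is an assumption made for the example: expressions are encoded as nested tuples with hypothetical constructor names (`"zero"`, `"one"`, `"var"`, `"union"`, `"inter"`, `"cat"`, `"plus"`, `"mirror"`), the interpretation `sigma` is a dictionary from variable names to finite sets of strings, and languages are truncated to words of length at most `bound`. Note that the function `contained` only checks containment for one interpretation and one bound, whereas semantic containment quantifies over all interpretations.

```python
BINARY = {"union", "inter", "cat"}


def evaluate(expr, sigma, bound):
    """Words of length <= bound in the language denoted by expr under sigma."""
    tag = expr[0]
    if tag == "zero":
        return set()
    if tag == "one":
        return {""}
    if tag == "var":
        return {w for w in sigma[expr[1]] if len(w) <= bound}
    if tag == "mirror":
        return {w[::-1] for w in evaluate(expr[1], sigma, bound)}
    if tag == "plus":
        # Non-zero iteration, computed as a bounded fixpoint.
        base = evaluate(expr[1], sigma, bound)
        found, frontier = set(), set(base)
        while frontier:
            found |= frontier
            frontier = {
                u + v for u in frontier for v in base if len(u + v) <= bound
            } - found
        return found
    if tag in BINARY:
        left = evaluate(expr[1], sigma, bound)
        right = evaluate(expr[2], sigma, bound)
        if tag == "union":
            return left | right
        if tag == "inter":
            return left & right
        return {u + v for u in left for v in right if len(u + v) <= bound}
    raise ValueError(f"unknown constructor: {tag}")


def contained(e, f, sigma, bound):
    """Check containment of e in f for a single interpretation and bound."""
    return evaluate(e, sigma, bound) <= evaluate(f, sigma, bound)


x, y = ("var", "x"), ("var", "y")
sigma = {"x": {"a", ""}, "y": {"b"}}
print(sorted(evaluate(("mirror", ("cat", x, y)), sigma, 3)))
```

The truncation is compatible with every operator of the signature, which is why the evaluator can be written compositionally.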
The main result of this paper is a completeness theorem for reversible Kleene lattices. Theorem (Main result): . Since all of the axioms in Table 1 are sound for languages, we know that the implication from left to right holds. This paper will thus focus on the converse implication, and will proceed in several steps. Our starting point will be the recently published completeness theorem for identity-free Kleene lattices : . In , this theorem is established for interpretations of terms as binary relations instead of languages. However, both semantic equivalences coincide for this signature .
3 A remark about the empty word
In several places in the proof, it makes some difference whether or not the empty word belongs to the language of some one-free expression. We show here one way to manipulate this property, which will be of use later on. Proposition. Given an alphabet , a map and a set of variables , there is an alphabet , and two maps and such that:
We fix , , and as in the statement. Let be some fresh letter; we set to be . For a word , we define to be the word obtained by removing every instance of from . Finally, is defined as follows:
It is straightforward to check that for any variable . Therefore we only need to check that this property is preserved by the operators of one-free expressions. For any languages , the following distributivity laws hold:
However, it is not the case in general that . To make the induction go through, we will need to show that this identity holds for all the languages generated from the languages by the operations . This is achieved by identifying some sufficient condition for , and showing that this condition is satisfied by every language of the shape . Let us define the ordering on words over :
is a partial order and satisfies the following properties:
(3.1) and (3.4) tell us that each equivalence class of the relation forms a join-semilattice. The proofs of these properties being somewhat technical, we omit them here. The interested reader may refer to the Coq formalisation for details.
Consider now those languages over that are upwards-closed with respect to , that is to say languages such that whenever and , then . Clearly is closed for any variable . Since the property “being closed” is preserved by each operation in the signature of , we deduce that for any expression the language is closed.
Thankfully, for closed languages the missing identity holds. Thus we may conclude by induction on the expressions that . For the last step, notice that and . Since is closed, if then , thus . ∎
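The erasing map used in the proof above can be illustrated on finite languages. The Python sketch below is a hedged example: the fresh letter is arbitrarily chosen to be `#`, and `erase`, `erase_language` and `concat` are hypothetical helper names. It assumes that the identity which fails in general is the one for intersection, which seems the natural reading since erasing a letter is compatible with concatenation and union; the two languages `L` and `M` give a counterexample under that assumption.

```python
FRESH = "#"  # assumption: a letter that does not occur in the original alphabet


def erase(word):
    """Remove every occurrence of the fresh letter from a word."""
    return word.replace(FRESH, "")


def erase_language(language):
    """Lift erase to languages, word by word."""
    return {erase(word) for word in language}


def concat(left, right):
    """Concatenation of two languages."""
    return {u + v for u in left for v in right}


L = {"a#"}
M = {"#a"}

# Erasure commutes with union and concatenation on this example...
union_ok = erase_language(L | M) == erase_language(L) | erase_language(M)
concat_ok = erase_language(concat(L, M)) == concat(
    erase_language(L), erase_language(M)
)

# ...but not with intersection: L and M are disjoint, yet their erasures agree.
inter_before = erase_language(L & M)
inter_after = erase_language(L) & erase_language(M)
print(union_ok, concat_ok, inter_before, inter_after)
```

The counterexample works because the two words differ only in the position of the fresh letter, which is exactly the kind of difference that the closure condition of the proof is designed to rule out.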
By setting the set in the previous proposition to the full set , we get the following straightforward corollary, which will prove useful in the next section. Let be a one-free expression. Then for any expression we have
4 Mirror image
In this section, we show a completeness theorem for one-free expressions. In order to get this result we will use translations between and . An expression is clean, written , if the mirror operator is only applied to variables. First, notice that we may restrict ourselves to clean expressions thanks to the following inductive function:
We can show by induction on terms the following properties of :
We now define translations between clean expressions and simple expressions:
replaces mirrored variables with and variables with ;
replaces with and with .
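To make the cleaning function and the two translations more concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the paper's definitions: one-free expressions are encoded as nested tuples with hypothetical constructor names, `clean` pushes mirror images down to the variables using the usual laws (mirror image reverses concatenations and commutes with the other operators), and the duplicated alphabet is encoded by pairing each variable name with a Boolean flag that records whether it was mirrored.

```python
def clean(expr):
    """Return an expression where mirror is only applied to variables."""
    tag = expr[0]
    if tag in ("zero", "var"):
        return expr
    if tag == "mirror":
        return clean_mirror(expr[1])
    if tag == "plus":
        return ("plus", clean(expr[1]))
    return (tag, clean(expr[1]), clean(expr[2]))


def clean_mirror(expr):
    """Return a clean expression standing for the mirror image of expr."""
    tag = expr[0]
    if tag == "zero":
        return expr
    if tag == "var":
        return ("mirror", expr)
    if tag == "mirror":
        return clean(expr[1])  # two mirror images cancel out
    if tag == "plus":
        return ("plus", clean_mirror(expr[1]))
    if tag == "cat":
        # Mirror image reverses the order of the factors.
        return ("cat", clean_mirror(expr[2]), clean_mirror(expr[1]))
    return (tag, clean_mirror(expr[1]), clean_mirror(expr[2]))


def to_simple(expr):
    """Translate a clean expression to a mirror-free one over flagged variables."""
    tag = expr[0]
    if tag == "zero":
        return expr
    if tag == "var":
        return ("var", (expr[1], False))
    if tag == "mirror":
        return ("var", (expr[1][1], True))  # expr[1] is a variable: expr is clean
    if tag == "plus":
        return ("plus", to_simple(expr[1]))
    return (tag, to_simple(expr[1]), to_simple(expr[2]))


def from_simple(expr):
    """Translate back: flagged variables become mirrored variables."""
    tag = expr[0]
    if tag == "zero":
        return expr
    if tag == "var":
        name, mirrored = expr[1]
        return ("mirror", ("var", name)) if mirrored else ("var", name)
    if tag == "plus":
        return ("plus", from_simple(expr[1]))
    return (tag, from_simple(expr[1]), from_simple(expr[2]))


x, y = ("var", "x"), ("var", "y")
e = ("mirror", ("cat", x, ("mirror", y)))
print(clean(e))
print(to_simple(clean(e)))
```

On clean expressions the two translations are inverse to each other, which is the kind of round-trip property one proves by induction.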
We can easily show by induction the following properties:
The last step to obtain the completeness theorem for is the following claim: .
If Section 4 holds, then .
Hence, we only need to show Section 4 to conclude. To that end, we show that for any clean expression , any interpretation of can be obtained by applying some transformation to some interpretation of . Thanks to Section 3, we may restrict our attention to interpretations that avoid the empty word. This seemingly mundane restriction turns out to be of significant importance: if the empty word is allowed, the proof of Section 4 becomes much more involved. More precisely, we prove the following lemma:
Let be some set and some interpretation such that . There exists an alphabet , an interpretation and a function such that: .
We fix and as in the statement. As in the proof of Section 3, we set , with a fresh letter, and write for the word obtained from by erasing every occurrence of . Additionally, we define the function as follows:
Clearly, and . We may now define and :
This is where the restriction comes in. Indeed a word cannot be written both as and as unless . Since does not contain the empty word, we may show that and .
distributes over the union and intersection operators. However, it does not hold in general that . As in the proof of Section 3, we will therefore identify a predicate on languages that is sufficient for this identity to hold, is satisfied by , and is stable under . In this case we find that an adequate candidate is “ contains only valid words”, where the set of valid words is defined as follows:
Alternatively, the elements of are words over that can be written as a product with and each . One may see from the definitions that . can also be seen to be trivially closed under concatenation and mirror image. Since the remaining operators are either idempotent (union and intersection) or derived (iteration), we get that . This enables us to conclude thanks to the following property:
This property enables us to show that and , for languages of valid words . Hence we obtain by induction on expressions that for any term , it holds that . ∎
Thanks to Section 4, we only need to check Section 4. Let be two clean expressions such that , we want to prove . According to Section 3, we need to compare and for some such that . By Section 4, we may express these languages as respectively and . Since , we get that , thus proving the desired identity and concluding the proof. ∎
5 Interlude: tests
Before we start with the main proof, we define tests and establish a few results about them. Given a list of variables , we define the term by induction on as and . Thanks to the following remark, we will hereafter consider for : Let be two lists of variables containing the same letters (meaning a variable appears in if and only if it appears in ). Then .
The following property explains our choice of terminology: the function can be seen as a Boolean predicate testing whether the empty word is in each of the for . Let be some alphabet and . Then either , in which case , or . Tests satisfy the following universal identities, with and :
We now want to compare tests with other tests or with expressions. Let us define the following interpretation for any finite set .
This enables us to establish the following lemma: For any , the following are equivalent:
Assume (i) holds, i.e. . By Section 5 this means that for every we have which by definition of ensures that . Thus we have shown that (ii) holds. We show that (ii) implies (iii) by induction on the size of :
if , by Equation 5.1 .
if with , since we have and . By induction hypothesis we know that . By Section 5 we get that . Hence we get:
We now define a function , whose purpose is to represent as a sum of tests the intersection of an arbitrary expression with :
We only need to show the implication from right to left. Assume . This implies , and since we know that which by soundness implies . Combining this with Section 5, we get that . By Section 5, we know that , which means that . Therefore there must be some such that which by Section 5 tells us that . We may now conclude:
The word “test” is reminiscent of Kleene algebra with tests (KAT). Indeed according to Equation 5.1 our tests are sub-units, like in KAT. However unlike in KAT, not every sub-unit is a test. Instead here sub-units are in general sums of tests, as can be inferred from Section 5 (because for every sub-unit , we have ).
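The reading of tests as Boolean predicates on the empty word, described earlier in this section, can be sketched in Python. This is only a semantic illustration under stated assumptions: the syntactic definition of tests is not reproduced, `interpret_test` and `concat` are hypothetical names, and an interpretation is a dictionary from variable names to finite sets of strings. The sketch also checks, on one example, that the product of two tests behaves like the test of the union of their variables, a property one would expect from this reading.

```python
def interpret_test(variables, sigma):
    """Language of a test: the unit language if every listed variable
    contains the empty word under sigma, and the empty language otherwise."""
    if all("" in sigma[x] for x in variables):
        return {""}
    return set()


def concat(left, right):
    """Concatenation of two languages."""
    return {u + v for u in left for v in right}


sigma = {"x": {"", "a"}, "y": {"b"}, "z": {""}}

# x and z both contain the empty word, y does not.
passing = interpret_test(["x", "z"], sigma)
failing = interpret_test(["x", "y"], sigma)

# Order and repetitions in the list of variables do not matter.
same = interpret_test(["z", "x", "x"], sigma) == passing

# Product of two tests versus the test of the combined variables.
product_ok = concat(
    interpret_test(["x"], sigma), interpret_test(["z"], sigma)
) == interpret_test(["x", "z"], sigma)
print(passing, failing, same, product_ok)
```

In particular, the language of a test is always contained in the unit language, which matches the observation that tests are sub-units.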
6 Completeness of reversible Kleene lattices
To tackle this completeness proof, we will proceed in three steps. Since we already know soundness, and since an equality can be equivalently expressed as a pair of containments, we start from the following statement:
First, we will show that any expression in can be equivalently written as a sum of terms that are either tests or products of a test and a one-free expression. The case of tests having been dispatched already (Section 5), this reduces the problem to:
Second, we will show that for any pair , there exists an expression such that and whenever we have . This further reduces the problem to:
For the third and last step, we show that for any expression , there is an expression such that and whenever for we have . This is enough to conclude thanks to Section 4.
In the next three subsections, we introduce constructions and prove lemmas necessary for each step. Then, in Section 6.4 we put them all together to show the main result.
6.1 First step: normal forms
A normal form is either an expression of the shape or of the shape with . We denote by the set of normal forms. The main result of this section is the following: For any there exists a finite set such that .
We show by induction on how to build . The correctness of the construction is fairly straightforward, and is left as an exercise: we will only state the relevant proof obligations when appropriate.
For constants, variables, and unions, the choice is rather obvious:
The case of mirror image is also rather straightforward:
For concatenations, we define the product of two normal forms as:
We then define . For correctness of the construction, we would have to prove that .
For intersections, we define :
We then define .
Finally, for iterations we use the following definition:
In , a similar lemma was proved (Lemma 3.4). However, the proof in that paper is slightly wrong, as it fails to consider the cases (easy) and (more involved).
6.2 Second step: removing tests on the left
Here we want to transform an inequation , into one of the shape , while maintaining that . The construction of is fairly straightforward, the intuition being that forces us to only consider interpretations such that . Therefore, for any we replace in every occurrence of with . .
For the other property, we rely on the following lemma: Let be some alphabet, and be an interpretation such that . Then , where
The result follows from a straightforward induction, the only interesting case being that of variables . This case is a simple consequence of our definitions:
Let such that , then .
Since by Section 6.2 we have , by soundness and transitivity of we have . We want to show that , so by Section 3 we only need to check that for any interpretation such that we have . If we take as in Section 6.2, we get that 1) since for every variable , and 2) since for every we have , we get . Together these tell us that . Since we know that , and by Section 6.2 we know . We may therefore conclude that . ∎
6.3 Third step: removing tests on the right
This last step relies on Section 3 and Section 6.1. For any expression , there exists a one-free expression such that and for any one-free expression such that we have . In other words, is the maximum of the set .
We define . We can easily check that :
For the other property, we rely on Section 3. Assume , we want to show that . By Section 3, it is enough to check that for interpretations such that . Let be such an interpretation, and some word such that . Notice that the condition on ensures that , hence implies that by Section 5. Also, because never contains the empty word and does not feature the constant , must be different from . Since , we already know that . By Section 6.1 and soundness, we know that there is a normal form such that . Since , cannot be a test: that would imply by (5.1) that , hence . Therefore we know that there is a term such that . This means that and . As we have noticed before, this means that . Thus we get and , which ensures that . ∎
6.4 Main theorem
We may now prove the main result of this paper:
Since and , we focus instead on proving that . By soundness we know that , so we only need to show the converse implication.
Let such that . By Section 6.1 we can show that . Let . Thanks to the properties of we have that . There are two cases for :
either for some , in which case we have by Section 5;
In both cases we have established that , so by monotonicity we show that