1 Introduction
To handle real problems, formal methods should be capable of describing the different facets of a system: data structures, sequential algorithms, concurrency, real time, probabilistic and stochastic aspects, hybrid systems, etc. In the present article, we address the first two points. In most case studies, the data structures and their algorithms are relatively simple, the most complex ones being trees, which are explored using breadth-first or depth-first traversals, etc. Contrary to such commonplace examples, cryptographic functions exhibit more diverse behaviour, as they seek to perform irregular computations rather than linear ones.
To explore this dimension, we consider the Message Authenticator Algorithm (MAA, for short), a pioneering cryptographic function designed in the mid-80s at the National Physical Laboratory (NPL, United Kingdom). The MAA was adopted in two international standards (ISO 8730 and ISO 8731-2) and served, between 1987 and 2001, to secure the authenticity and integrity of banking transactions. The MAA also played a role in the history of formal methods, as the NPL developed, in the early 90s, three formal specifications of the MAA in VDM, Z, and LOTOS abstract data types.
The present article revives these early efforts by examining, twenty-five years later, how the new generation of formal methods can cope with the MAA case study. The article is organized as follows. Section 2 presents the MAA from both a historical and a technical perspective. Section 3 introduces the eight formal specifications of the MAA we are aware of. Section 4 discusses some key modelling issues that arise when specifying the MAA. Section 5 details how the formal specifications have been validated and which issues have been uncovered. Section 6 gives concluding remarks. Annexes A and B report errors found in the MAA test vectors prescribed by ISO standards 8730 and 8731-2. Finally, Annexes C and D provide two formal specifications of the MAA in LOTOS and LNT, which are novel contributions.
2 The Message Authenticator Algorithm (MAA)
In data security, a Message Authentication Code (MAC) is a short sequence of bits that is computed from a given message; the MAC ensures both the authenticity and integrity of the message, i.e., that the message sender is the stated one and that the message contents have not been altered. A MAC is more than a mere checksum, as it must be secure enough to defeat attacks; its design usually involves cryptographic keys shared between the message sender and receiver. One of the first MAC algorithms to gain widespread acceptance was the MAA, which we now present in more detail.
2.1 History of the MAA
The MAA was designed in 1983 by Donald Watt Davies and David Clayden at NPL, in response to a request from the UK Bankers Automated Clearing Services [4] [3]. Its authors had formerly been involved in the detailed design and development of Pilot ACE (Automatic Computing Engine), an early computer based on original designs of Alan Turing. Donald Watt Davies (1924–2000) is a founding father of computer science, also well known for his pioneering work on computer networks and packet switching in the mid-60s (biographic information about D. W. Davies can be found at http://en.wikipedia.org/wiki/DonaldDavies and http://thelinuxmaniac.users.sourceforge.net/docs/be/chc61). Shortly after its design, the MAA became standardized at the international level in two complementary ISO banking standards:

The ISO international standard 8730 (published in 1986 [15] and revised in 1990 [17]) specifies methods and procedures for protecting messages exchanged between financial institutions. Such protection is based on secret keys symmetrically shared between these institutions and on the computation of a MAC for each message exchanged.
The 1986 version of this standard [15] was independent from any particular algorithm for MAC computation. Such independence was slightly undermined by the 1990 revision of this standard [17], which added two annexes D and E providing test vectors (i.e., MAC values for a few sample messages and given keys) computed using two specific algorithms (DEA and MAA) presented hereafter. A technical corrigendum was later issued in 1999 [19] to address the Year-2000 problem, without any impact on the MAC computation itself.

The ISO international standard 8731 has two distinct parts, each devoted to an approved algorithm for MAC computation that can be used in the security framework specified by ISO 8730. Both algorithms are mutually exclusive, in the sense that using only one of them is deemed to be sufficient for authenticating messages:

Part 1 (i.e., ISO 8731-1) describes the DEA (Data Encryption Algorithm), which is a CBC-MAC (Cipher Block Chaining Message Authentication Code) based on the DES standard cryptographic algorithm. The DEA is not addressed in the present article.

Part 2 (i.e., ISO 8731-2) describes the MAA itself.

Later, cryptanalysis of the MAA revealed several weaknesses, including feasible brute-force attacks, existence of collision clusters, and key-recovery techniques [30] [31] [34] [28] [33] [32]. After such discoveries, the MAA ceased to be considered secure enough and was withdrawn from ISO standards in 2002 [29].
2.2 Overview of the MAA
Nowadays, Message Authentication Codes are computed using different families of algorithms based on either cryptographic hash functions (HMAC), universal hash functions (UMAC), or block ciphers (CMAC, OMAC, PMAC, etc.). Contrary to these modern approaches, the MAA was designed as a standalone algorithm that does not rely on any preexisting hash function or cipher.
In this section, we briefly explain the principles of the MAA. More detailed explanations can be found in [3], [5] and [24, Algorithm 9.68].
The MAA was intended to be implemented in software and to run on 32-bit computers. Hence, its design relies intensively on 32-bit words (called blocks) and 32-bit machine operations.
The MAA takes as inputs a key and a message. The key has 64 bits and is split into two blocks $J$ and $K$. The message is seen as a sequence of blocks. If the number of bytes of the message is not a multiple of four, extra null bytes are added at the end of the message to complete the last block. The size of the message should be less than 1,000,000 blocks; otherwise, the MAA result is said to be undefined. We believe that this restriction, which is not inherent to the algorithm itself, was added in the ISO 8731-2 standard to provide MAA implementations with an upper bound (four megabytes) on the size of memory buffers used to store messages.
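The padding and block-splitting rule above can be sketched as follows. This is a minimal illustration, not code taken from the standards: the function name `message_to_blocks` is hypothetical, and the big-endian byte order inside each block is an assumption that the text above does not settle.

```python
from typing import List

MAX_BLOCKS = 1_000_000  # messages must contain fewer blocks than this


def message_to_blocks(data: bytes) -> List[int]:
    """Pad a byte string with null bytes and split it into 32-bit blocks."""
    # Number of null bytes needed to reach a multiple of four bytes.
    padding = (-len(data)) % 4
    padded = data + b"\x00" * padding
    # Assumption: the first byte of each group of four is the most significant.
    blocks = [int.from_bytes(padded[i:i + 4], "big")
              for i in range(0, len(padded), 4)]
    if len(blocks) >= MAX_BLOCKS:
        # The MAA result is undefined for such messages.
        raise ValueError("message too long: MAA result is undefined")
    return blocks
```

For instance, a five-byte message yields two blocks, the second one carrying three padding bytes.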
The MAA produces as output a block, which is the MAC value computed from the key and the message. The fact that this result has only 32 bits proved to be a major weakness enabling cryptographic attacks; MAC values computed by modern algorithms now have a much larger number of bits. Apart from the aforementioned restriction on the size of messages, the MAA behaves as a totally defined function; its result is deterministic in the sense that, given a key and a message, there is only a single MAC result, which depends neither on implementation choices nor on hidden inputs, such as nonces or randomly generated numbers.
The MAA calculations rely upon conventional 32-bit logical and arithmetic operations, among which: AND (conjunction), OR (disjunction), XOR (exclusive disjunction), CYC (circular rotation by one bit to the left), ADD (addition), CAR (carry bit generated by 32-bit addition), MUL (multiplication, sometimes decomposed into HIGH_MUL and LOW_MUL, which denote the most- and least-significant blocks in the 64-bit product of a 32-bit multiplication). On this basis, more involved operations are defined, among which MUL1 (result of a 32-bit multiplication modulo $2^{32}-1$), MUL2 (result of a 32-bit multiplication modulo $2^{32}-2$), MUL2A (faster version of MUL2), FIX1 and FIX2 (two unary functions, whose names are borrowed from [25, pages 36 and 77], respectively defined as $x \mapsto \mathrm{AND}(\mathrm{OR}(x, A), C)$ and $x \mapsto \mathrm{AND}(\mathrm{OR}(x, B), D)$, where A, B, C, and D are four hexadecimal block constants A = 02040801, B = 00804021, C = BFEF7FDF, and D = 7DFEFBFF). The MAA operates in three successive phases:

The prelude takes the two blocks $J$ and $K$ of the key and converts them into six blocks $X_0$, $Y_0$, $V_0$, $W$, $S$, and $T$. This phase is executed once. After the prelude, $J$ and $K$ are no longer used.

The main loop successively iterates on each block of the message. This phase maintains three variables $X$, $Y$, and $V$ (initialized to $X_0$, $Y_0$, and $V_0$, respectively), which are modified at each iteration. The main loop also uses the value of $W$, but neither $S$ nor $T$.

The coda adds the blocks $S$ and $T$ at the end of the message and performs two more iterations on these blocks. After the last iteration, the MAA result, noted $Z$, is $\mathrm{XOR}(X, Y)$.
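The basic operations and the three-phase structure described above can be sketched in a few lines. This is only an outline under stated assumptions: MUL1 follows the usual high/low/carry decomposition; FIX1 and FIX2 are taken to be AND(OR(x, A), C) and AND(OR(x, B), D); the final result is taken to be XOR(X, Y); and the `prelude` and `step` arguments of the hypothetical `maa_phases` function are placeholders for the actual key-derivation and main-loop computations, which are not reproduced here.

```python
from typing import Callable, Iterable, Tuple

MASK = 0xFFFFFFFF  # blocks are 32-bit words in the range 0..MASK
A, B, C, D = 0x02040801, 0x00804021, 0xBFEF7FDF, 0x7DFEFBFF

def cyc(x: int) -> int:
    """CYC: circular rotation by one bit to the left."""
    return ((x << 1) | (x >> 31)) & MASK

def add(x: int, y: int) -> int:
    """ADD: 32-bit addition, discarding the carry."""
    return (x + y) & MASK

def car(x: int, y: int) -> int:
    """CAR: carry bit (0 or 1) generated by the 32-bit addition."""
    return (x + y) >> 32

def high_mul(x: int, y: int) -> int:
    """HIGH_MUL: most-significant block of the 64-bit product."""
    return (x * y) >> 32

def low_mul(x: int, y: int) -> int:
    """LOW_MUL: least-significant block of the 64-bit product."""
    return (x * y) & MASK

def mul1(x: int, y: int) -> int:
    """MUL1: product modulo 2**32 - 1 (the value 2**32 - 1 may stand for 0)."""
    u, l = high_mul(x, y), low_mul(x, y)
    return add(add(u, l), car(u, l))

def fix1(x: int) -> int:
    return (x | A) & C  # assumed definition, see lead-in

def fix2(x: int) -> int:
    return (x | B) & D  # assumed definition, see lead-in

State = Tuple[int, int, int]  # the variables (X, Y, V)

def maa_phases(j: int, k: int, message: Iterable[int],
               prelude: Callable[[int, int], Tuple[int, int, int, int, int, int]],
               step: Callable[[State, int, int], State]) -> int:
    """Skeleton of the three phases; prelude and step are supplied by the caller."""
    x0, y0, v0, w, s, t = prelude(j, k)   # phase 1: executed once
    state = (x0, y0, v0)
    for block in message:                 # phase 2: one iteration per block
        state = step(state, block, w)
    for block in (s, t):                  # phase 3: two more iterations
        state = step(state, block, w)
    x, y, _ = state
    return x ^ y                          # assumed final result XOR(X, Y)
```

Passing the phase bodies as parameters keeps the sketch honest about what it does not specify: only the control structure and the elementary operations are shown.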
In 1987, the ISO 8731-2 standard [16, Sect. 5] introduced an additional feature (called mode of operation), which concerns messages longer than 256 blocks (i.e., 1024 bytes) and which, seemingly, was not present in the early MAA versions designed at NPL. Each message longer than 256 blocks must be split into segments of 256 blocks each, with the last segment possibly containing fewer than 256 blocks. The above MAA algorithm (prelude, main loop, and coda) is applied to the first segment, resulting in a value noted $Z_1$. This block $Z_1$ is then inserted before the first block of the second segment, leading to a 257-block message to which the MAA algorithm is applied, resulting in a value noted $Z_2$. This is done repeatedly for all the segments, the MAA result $Z_i$ computed for the $i$-th segment being inserted before the first block of the $(i+1)$-th segment. Finally, the MAC for the entire message is the MAA result computed for the last segment.
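The chaining of segments can be sketched as follows. The function name `segmented_mac` and its parameter `mac_one_segment` (standing for the prelude, main loop, and coda applied to one block sequence) are hypothetical; applying the algorithm to the empty sequence when the message is empty is also an assumption, since the text does not cover that case.

```python
from typing import Callable, List, Optional

SEGMENT_SIZE = 256  # blocks per segment


def segmented_mac(blocks: List[int],
                  mac_one_segment: Callable[[List[int]], int]) -> int:
    """Apply a single-segment MAC to successive segments, chaining the results."""
    if not blocks:
        return mac_one_segment([])  # assumption: empty message, single call
    previous: Optional[int] = None
    for start in range(0, len(blocks), SEGMENT_SIZE):
        segment = blocks[start:start + SEGMENT_SIZE]
        if previous is not None:
            # The result of the previous segment is inserted before the
            # first block of the current segment.
            segment = [previous] + segment
        previous = mac_one_segment(segment)
    return previous
```

With a 600-block message, the single-segment function is thus called on sequences of 256, 257, and 89 blocks.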
2.3 Informal Specifications of the MAA
We consider the 1988 NPL technical report [5] to be the reference document for the MAA definition in natural language. Indeed, this technical report is freely available from the NPL library or can be downloaded from the web, whereas the (withdrawn) ISO standards 8730 and 8731-2 need to be purchased. The algorithm described in [5] is identical to the MAA definition given in ISO 8731-2.
Moreover, [5] provides the source code of two different implementations of the MAA, in BASIC (227 lines) and C (182 lines). In the present paper, when counting lines of code, we exclude blank lines, comments, and predefined libraries. For the BASIC implementation, we also exclude all PRINT statements, which are mostly intended for debugging. For the C implementation, we exclude all printf statements, as well as five nonessential functions (menu1, inmain, mainloop1, fracsec, and guesstime), retaining only case 5 in the main function and unfolding multiple instructions present on the same line. Neither of these implementations supports the aforementioned “mode of operation”; we therefore added 31 lines of C code implementing this missing functionality. Although the C code was written in 1987 for the Turbo C and Zorland compilers, it still compiles and executes properly today after a few simple corrections, provided that long integers are set to 32 bits (for instance, using GCC with options -m32 -std=c90).
2.4 Test Vectors for the MAA
There are two official sources of test vectors for the MAA:

[5, Sections 15 to 20] provides a series of test vectors contained in six tables, which can also be found in [18, Annex A]. These test vectors specify, for a few given keys and messages, the expected values of intermediate calculations (e.g., MUL1, MUL2, MUL2A, prelude, main loop, etc.) and the expected MAA results for the entire algorithm. The “mode of operation” is not tested, as the messages considered contain either 2 or 20 blocks, i.e., fewer than 256 blocks.

Another series of test vectors that take into account the “mode of operation” can be found in [17, Annex E]. More precisely, Annex E.3.3 gives expected values for an execution of the prelude, Annex E.3.4 gives results for an 84-block message, and Annex E.4 gives results for a 588-block message.
3 Formal Specifications of the MAA
As far as we know, no fewer than eight formal specifications have been produced for the MAA. We present each of them in turn, drawing a clear distinction between non-executable and executable specifications. To unambiguously designate these specifications, we adopt the following naming convention: LANGXX refers to the formal specification written in language LANG during year XX.
3.1 Non-Executable Formal Specifications
For cryptographic protocols, an informal specification is often not precise enough, and the MAA is no exception. For instance, G. I. Parkin and G. O’Neill devoted four pages in [26, Sect. 3] and [27, Sect. 3] to discussing all possible interpretations of the MAA definition in natural language. The need for unambiguous specifications was certainly felt by stakeholders, as three formal specifications of the MAA were developed at NPL in the early 90s, as part of a comparative study in which common examples were modelled using various formal methods. All these specifications were non-executable, in the sense that MAA implementations had to be developed manually and could not be automatically derived from the formal specifications, at least using the software tools available at that time. Let us briefly review these specifications:

VDM90 : In 1990, G. I. Parkin and G. O’Neill designed a formal specification of the MAA in VDM [26] [27]. To our knowledge, their work was the first attempt at applying formal methods to the MAA. This attempt was clearly successful, as the VDM specification became a (non-authoritative) annex in the 1992 revision of the ISO standard defining the MAA [18, Annex B]. This annex is concise (9 pages, 275 lines) and its style is close to functional programming. Due to the lack of VDM tools, its correctness could only be checked by human proofreading. Three implementations in C [26, Annex C], Miranda [26, Annex B], and Modula-2 [22] were written by hand along the lines of this VDM specification.

Z91 : In 1991, M. K. F. Lai formally specified the MAA using the set-theoretic Z notation. Based upon Knuth’s “literate programming” approach, this work resulted in a 57-page technical report [21], in which formal fragments (totalling 608 lines of Z code) are inserted in the natural-language description of the MAA. This formal specification was designed to be as abstract as possible, so as not to constrain implementations unduly, and it was lightly verified using a type-checking tool.

LOTOS91 : In 1991, Harold B. Munster produced another formal specification of the MAA in LOTOS, presented in a 94-page technical report [25] (this report and its LOTOS specification are available online from ftp://ftp.inrialpes.fr/pub/vasy/publications/others/Munster91a.pdf and ftp://ftp.inrialpes.fr/pub/vasy/demos/demo12/LOTOS/maaoriginal.lotos, respectively). This specification (16 pages, 438 lines) uses only the data part of LOTOS (namely, abstract data types inspired from the ACT ONE language [8] [23]), importing the predefined LOTOS libraries for octets, strings, and natural numbers in unary, binary, and decimal notation; the behavioural part of LOTOS, which serves to describe concurrent processes, is not used at all. This specification is mostly declarative, and not directly executable, for at least two reasons:

Many computations are specified using the predefined type Nat that defines natural numbers in unary notation, i.e., numbers built using the two Peano constructor operations $0$ and $\mathit{succ}$. On this basis, the usual arithmetic operations (addition, multiplication, etc.) are defined equationally. In practice, such a simple encoding for Nat cannot feasibly implement the large 32-bit numbers manipulated in MAA calculations.

The full expressiveness of LOTOS equations is used in an unconstrained manner, making it necessary in many places to invert non-trivial user-defined functions. For instance, given a conditional equation of the form $x = g(y) \Rightarrow f(x) = y$, evaluating $f(x)$ requires computing $g^{-1}(x)$. Such situations arise, in a more involved way, for several pairs of operations in the specification.
Interestingly, such executability issues are not discussed in [25]. Instead, the report stresses the intrinsic difficulty of describing partial or incompletely specified functions in LOTOS, the equational semantics of which requires functions to be totally defined. Such difficulty is stated to be a major limitation of LOTOS compared to VDM and Z, although the report claims that LOTOS is clearly superior to these methods as far as the description of communication protocols is concerned.
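To make the first obstacle concrete, the following sketch encodes unary natural numbers with two constructors and defines addition and multiplication by the usual recursive equations (written as loops to avoid deep recursion). It is an illustration only, not the LOTOS library; the names `Zero`, `Succ`, `plus`, and `times` are hypothetical. The cost of an operation grows with the magnitude of its operands, so a 32-bit value would require up to about four billion nested constructors.

```python
class Zero:
    """Constructor 0."""
    __slots__ = ()

class Succ:
    """Constructor succ: the successor of a natural number."""
    __slots__ = ("pred",)
    def __init__(self, pred):
        self.pred = pred

def from_int(n: int):
    """Build the unary term succ(succ(...(0)...)) with n successors."""
    term = Zero()
    for _ in range(n):
        term = Succ(term)
    return term

def to_int(term) -> int:
    """Count the successors of a unary term."""
    count = 0
    while isinstance(term, Succ):
        count, term = count + 1, term.pred
    return count

def plus(m, n):
    """plus(0, n) = n ; plus(succ(m), n) = plus(m, succ(n))."""
    while isinstance(m, Succ):
        m, n = m.pred, Succ(n)
    return n

def times(m, n):
    """times(0, n) = 0 ; times(succ(m), n) = plus(n, times(m, n))."""
    result = Zero()
    while isinstance(m, Succ):
        result, m = plus(n, result), m.pred
    return result
```

Even this small example performs a number of steps proportional to the product of its operands when multiplying, which is why a binary representation of blocks is preferable for MAA calculations.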

3.2 Executable Formal Specifications
As a continuation of the work undertaken at NPL, five formal specifications of the MAA have been developed at INRIA Grenoble. These specifications are executable, in the sense that all expressions that contain neither free variables nor infinite recursion can be given to some interpretation engine and evaluated to produce relevant results. But executable also means that these specifications can be compiled automatically (e.g., using the translators of the CADP toolbox [11]) into some executable program that will be run to generate the expected results. Let us review these five specifications:

LOTOS92 : In 1992, Hubert Garavel and Philippe Turlier, taking LOTOS91 as a starting point, gradually transformed it to obtain an executable specification from which the CÆSAR.ADT compiler [9] [14] could generate C code automatically. Their goal was to remain as close as possible to the original LOTOS specification of Harold B. Munster, so as to demonstrate that limited changes were sufficient to turn a non-executable LOTOS specification into an executable one. The aforementioned causes of non-executability in LOTOS91 were addressed by fulfilling the additional semantic constraints set on LOTOS by the CÆSAR.ADT compiler to make sure that LOTOS specifications are executable:

The algebraic equations, which are not oriented in standard LOTOS, were turned into term rewrite rules, which are oriented from left to right and, thus, more amenable to efficient translation.

A distinction was made between constructor and non-constructor operations, and the discipline of “free” constructors required by CÆSAR.ADT [9] was enforced: namely, each rule defining a non-constructor $f$ must have the form either “$f(p_1, \ldots, p_n) \rightarrow e$” or “$c_1, \ldots, c_m \Rightarrow f(p_1, \ldots, p_n) \rightarrow e$”, where each $p_i$ is a term containing only constructors and free variables, and where $c_1$, …, $c_m$, and $e$ are terms whose variables must also be present in some $p_i$.

To avoid issues with the unary notation of natural numbers, the Nat sort was implemented manually as a C type (32-bit unsigned integer). Similarly, a few operations on sort Nat (integer constants, addition, multiplication, etc.) were also implemented by manually written C functions; the ability to import externally defined C types and functions, and to combine them with automatically generated C code, is a distinctive feature of the CÆSAR.ADT compiler. Additionally, all occurrences of the sort BitString used for the binary notation of natural numbers, octets, and blocks were eliminated from the MAA specification.
This resulted in a 641-line LOTOS specification, together with two C files (63 lines in total) implementing the LOTOS sorts and operations defined externally. The CÆSAR.ADT compiler translated this LOTOS specification into C code that, combined with a small handwritten main program (161 lines of C code), could compute the MAC value corresponding to a message and a key.


LNT16 : In February 2016, Wendelin Serwe manually translated LOTOS92 into LNT [2], which is the most recent specification language supported by the CADP toolbox and the state-of-the-art replacement for LOTOS [12]. This translation was done in a systematic way, the goal being to emphasize common structure and similarities between the LOTOS and LNT specifications. The resulting 543-line LNT specification thus has the style of algebraic specifications and functional programs, relying massively on pattern matching and recursive functions. The handwritten C code imported by the LOTOS specification was reused, almost as is, for the LNT specification.

REC17 : Between September 2016 and February 2017, Hubert Garavel and Lina Marsso undertook the translation of LOTOS92 into a term rewrite system (actually, a conditional term rewrite system with only six conditional rewrite rules that, if needed, can easily be turned into non-conditional rewrite rules as explained in [13]). This system was encoded in the simple language REC proposed in [7, Sect. 3] and [6, Sect. 3.1], which was lightly enhanced to distinguish between free constructors and non-constructors.
Contrary to higher-level languages such as LOTOS or LNT, REC is a purely theoretical language that does not allow importing external fragments of code written in a programming language. Thus, all types (starting with the most basic ones, such as Bit and Bool) and their associated operations were exhaustively defined “from scratch” in the REC language. To address the aforementioned problem with natural numbers, two different types were defined: a Nat type used for “small” counters, the values of which do not exceed a few thousand, and a Block type that represents the 32-bit machine words used for MAA calculations. The Nat sort was defined in the Peano-style unary notation, while the Block sort was defined in binary notation (as a tuple sort containing four octets, each composed of eight bits). To provide executable definitions for the modular arithmetic operations on type Block, the REC specification was equipped with 8-bit, 16-bit, and 32-bit adders and multipliers, somewhat inspired by the theory of digital circuits. To check whether the MAA calculations are correct or not, the REC specification was enriched with 203 test vectors [13, Annexes B.18 to B.21] originating from diverse sources.
The resulting REC specification has 1575 lines and contains 13 sorts, 18 constructors, 644 non-constructors, and 684 rewrite rules. It is minimal, in the sense that each sort, constructor, and non-constructor is actually used (i.e., the specification does not contain “dead” code). As far as we are aware, it is one of the largest handwritten term rewrite systems publicly available. Parts of this specification (e.g., the binary adders and multipliers) are certainly reusable for other purposes. However, it is fair to mention that term rewrite systems are a low-level theoretical model that does not scale well to large problems, and that it took considerable effort to come up with a REC specification that is readable and properly structured.
Using a collection of translators (http://gforge.inria.fr/scm/viewvc.php/rec/2015CONVECS) developed at INRIA Grenoble, the REC specification was automatically translated into various languages: AProVE (TRS), Clean, Haskell, LNT, LOTOS, Maude, mCRL2, OCaml, Opal, Rascal, Scala, SML, Stratego/XT, and Tom. Using the interpreters, compilers, and checkers available for these languages, it was shown [13, Sect. 5] that the REC specification terminates, that it is confluent, and that all 203 tests pass successfully. Also, the most involved components (namely, the binary adders and multipliers) were validated separately using more than 30,000 test vectors.
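The idea of building block arithmetic from bit-level definitions, and of validating it against test vectors, can be illustrated as follows. This sketch is not the REC code: it is a plain ripple-carry adder in which a block is a tuple of four octets and an octet is a tuple of eight bits (most significant first), and all names are hypothetical.

```python
from typing import Tuple

Bits = Tuple[int, ...]

def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """One-bit adder: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out

def add_octet(x: Bits, y: Bits, carry: int = 0) -> Tuple[Bits, int]:
    """8-bit adder: chains full adders from the least significant bit."""
    out = []
    for a, b in zip(reversed(x), reversed(y)):
        bit, carry = full_adder(a, b, carry)
        out.append(bit)
    return tuple(reversed(out)), carry

def add_block(x: Tuple[Bits, ...], y: Tuple[Bits, ...]) -> Tuple[Tuple[Bits, ...], int]:
    """32-bit adder: chains four 8-bit adders from the least significant octet."""
    carry, out = 0, []
    for ox, oy in zip(reversed(x), reversed(y)):
        octet, carry = add_octet(ox, oy, carry)
        out.append(octet)
    return tuple(reversed(out)), carry

def int_to_block(n: int) -> Tuple[Bits, ...]:
    """Convert an integer in 0..2**32-1 to four octets of eight bits."""
    return tuple(tuple((n >> (31 - 8 * i - j)) & 1 for j in range(8))
                 for i in range(4))

def block_to_int(block: Tuple[Bits, ...]) -> int:
    """Convert four octets of eight bits back to an integer."""
    n = 0
    for octet in block:
        for bit in octet:
            n = (n << 1) | bit
    return n
```

Comparing `add_block` with machine addition on sample operands plays the same role, on a small scale, as the separate validation of the adders and multipliers mentioned above.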
The two remaining formal specifications of the MAA are novel contributions of the present paper:

LOTOS17 : Between January and February 2017, Hubert Garavel and Lina Marsso performed a major revision of LOTOS92 based upon the detailed knowledge of the MAA acquired during the development of REC17 . Their goal was to produce an executable LOTOS specification as simple as possible, even if it departed from the original specification LOTOS91 written by Harold B. Munster. Many changes were made: the two sorts AcceptableMessage and SegmentedMessage were removed, and the Nat sort was replaced almost everywhere by the Block sort; about seventy operations were removed, while a dozen new operations were added; the Block constructor evolved by taking four octets rather than thirty-two bits; the constructors of sort Message were replaced by standard list constructors; the equations defining various operations (FIX1, FIX2, BYT, PAT, etc.) were shortened; each message is now processed in a single pass without first duplicating it to build a list of segments; the Prelude operation is executed only once per message, rather than once per segment; the detection of messages larger than 1,000,000 blocks is now written directly in C. These changes led to a 266-line LOTOS specification (see Annex C) with two companion C files (157 lines in total) implementing the basic operations on blocks (the most recent version of these files is available from ftp://ftp.inrialpes.fr/pub/vasy/demos/demo12/LOTOS). Interestingly, all these files taken together are smaller than the original specification LOTOS91 , demonstrating that executability and conciseness are not necessarily antagonistic notions.

LNT17 : Between December 2016 and February 2017, Hubert Garavel and Lina Marsso entirely rewrote LNT16 in order to obtain a simpler specification. First, the same changes as for LOTOS17 were applied to the LNT specification. Also, the sorts Pair, TwoPairs, and ThreePairs, which had been introduced by Harold B. Munster to describe functions returning two, four, and six blocks, have been eliminated; this was done by having LNT functions that return their computed results using “out” or “in out” parameters (i.e., call by result or call by value-result) rather than tuples of values; the principal functions (e.g., MUL1, MUL2, MUL2A, Prelude, Coda, MAC, etc.) have been simplified by taking advantage of the imperative style of LNT, i.e., mutable variables and assignments; many auxiliary functions have been gathered and replaced by a few larger functions (e.g., PreludeJ, PreludeK, PreludeHJ, and PreludeHK) also written in the imperative style. These changes resulted in a 268-line LNT specification with a 136-line companion C file, which have nearly the same size as LOTOS17 , although the LNT version is more readable and closer to the original MAA specification [5], also expressed in the imperative style. Taken alone, the LNT code has approximately the same size as VDM90 , the non-executable specification that was included as a formal annex in the MAA standard [18].
As for REC17 , the LNT specification was then enriched with a collection of “assert” statements implementing: (i) the test vectors listed in Tables 1 to 6 of [18, Annex A] and [5]; (ii) the test vectors of [17, Annex E.3.3]; (iii) supplementary test vectors intended to check specifically for certain aspects (byte permutations and message segmentation) that were not sufficiently covered by the above tests; this was done by introducing a makeMessage function acting as a pseudo-random message generator.
Consequently, the size of the LNT files grew to 1334 lines in total (see Annex D; the most recent version of these files is available from ftp://ftp.inrialpes.fr/pub/vasy/demos/demo12). Finally, the remaining test vectors of [17, Annexes E.3.4 and E.4], which were too lengthy to be included in REC17 , have been stored in text files and can be checked by running the C code generated from the LNT specification. This makes LNT17 the most complete formal specification of the MAA as far as validation is concerned.
4 Modelling issues
In this section, we investigate some salient issues faced when modelling the MAA using diverse formal methods. We believe that such issues are not specific to the MAA, but are likely to arise whenever nontrivial data structures and algorithms are to be described formally.
4.1 Local variables in function definitions
Local variables are essential to store computed results that need to be used several times, thus avoiding the repetition of identical calculations. LNT allows local variables to be freely defined and assigned in an imperative-programming style; the existence of a formal semantics is guaranteed by static semantic constraints [10] ensuring that each variable is duly assigned before being used. For instance, the MUL1 function (the same discussion is also valid for MUL2, MUL2A, and many other MAA functions) is expressed in LNT as follows:
function MUL1 (X, Y : Block) : Block is
   var U, L, S, C : Block in
      U := HIGH_MUL (X, Y);
      L := LOW_MUL (X, Y);
      S := ADD (U, L);
      C := CAR (U, L);
      assert (C == x00000000) or (C == x00000001);
      return ADD (S, C)
   end var
end function
In VDM, which enjoys a “let” operator, the definition of MUL1 is very similar to the LNT one [26, page 11] [27, Sect. 2.2.5]. The situation is quite different for term rewrite systems and abstract data types, which lack a “let” operator in their rewrite rules or equations. Interestingly, LOTOS91 tries to emulate such a “let” operator by (ab)using the premises of conditional equations [25, pages 37 and 78]:
opns MUL1 : Block, Block -> Block
forall X, Y, U, L, S, P: Block, C: Bit
   NatNum (X) * NatNum (Y) = NatNum (U ++ L),
   NatNum (U) + NatNum (L) = NatNum (S) + NatNum (C),
   NatNum (C + S) = NatNum (P) =>
   MUL1 (X, Y) = P;
These premises define and compute the variables (U, L), (S, C), and P, respectively; they silently require the computation of inverse functions for NatNum, +, and ++ (bit string concatenation). Unfortunately, most languages and tools for term rewriting forbid such free variables in premises, requiring that only the parameters of the function under definition (here, X and Y for the MUL1 function) can occur in premises.
Instead, LOTOS17 and REC17 adopt a more conventional style in which auxiliary operations are introduced, the parameters of which are used to store computed results that need to be used more than once:
opns MUL1    : Block, Block -> Block
     MUL1_UL : Block, Block -> Block
     MUL1_SC : Block, Block -> Block
forall X, Y, U, L, S, C : Block
   MUL1 (X, Y) = MUL1_UL (HIGH_MUL (X, Y), LOW_MUL (X, Y));
   MUL1_UL (U, L) = MUL1_SC (ADD (U, L), CAR (U, L));
   MUL1_SC (S, C) = ADD (S, C);
In comparison, the imperative-programming style of LNT is clearly more concise, more readable, and closer to the original description of MUL1. Moreover, LNT permits successive assignments to the same variable, which proved to be useful in, e.g., the MainLoop and MAC functions.
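The two styles compared in this section can be placed side by side in a general-purpose language. The sketch below transcribes MUL1 once with local variables and once with auxiliary functions whose parameters carry the intermediate results; the Python function names are hypothetical, and blocks are modelled as integers in the range 0 to 2^32 - 1.

```python
MASK = 0xFFFFFFFF  # 32-bit blocks

def high_mul(x: int, y: int) -> int:
    return (x * y) >> 32

def low_mul(x: int, y: int) -> int:
    return (x * y) & MASK

def add(x: int, y: int) -> int:
    return (x + y) & MASK

def car(x: int, y: int) -> int:
    return (x + y) >> 32

# Style 1: local variables, as in the imperative definition.
def mul1_imperative(x: int, y: int) -> int:
    u = high_mul(x, y)
    l = low_mul(x, y)
    s = add(u, l)
    c = car(u, l)
    assert c in (0, 1)
    return add(s, c)

# Style 2: auxiliary functions, as in a rewrite system without "let".
def mul1_sc(s: int, c: int) -> int:
    return add(s, c)

def mul1_ul(u: int, l: int) -> int:
    return mul1_sc(add(u, l), car(u, l))

def mul1_rewrite(x: int, y: int) -> int:
    return mul1_ul(high_mul(x, y), low_mul(x, y))
```

Both definitions compute the same value; the second one simply names each intermediate result through a function parameter, which is why one auxiliary operation is needed for every group of values that must be reused.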
4.2 Functions returning multiple results
Another point in which the various MAA specifications differ is the handling of functions that compute more than one result. There are several such functions in the MAA; let us consider the Prelude function, which takes two block parameters J and K and returns six block parameters X, Y, V, W, S, and T.
The simplest description of this function is achieved in LNT17, which exploits the fact that LNT functions, like in imperative programming languages, may return a result and/or have “out” parameters. In LNT, the Prelude function can be defined this way:
function Prelude (in J, K : Block, out X, Y, V, W, S, T : Block) is ... end function
and invoked as follows:
Prelude (J, K, ?X0, ?Y0, ?V0, ?W, ?S, ?T)
Although this approach is the simplest one, most formal methods do not support procedures or functions with “out” parameters (besides LNT, the only other language we know to offer “out” parameters is the synchronous dataflow language Lustre). In such languages where functions return only a single result, there are two different options for describing functions with multiple results such as Prelude.
The first option is to return a unique result of some compound type (record, tuple, array, etc.). For instance, both VDM90 and Z91 describe Prelude as a function taking a pair of blocks and returning a result of a new type (called KeyConstant [26, Sections 2.2.2 and 2.2.7] or DerivedSpace [21, pages 45–46]) defined as a sextuple of blocks. LOTOS91 and LOTOS17 adopt a similar approach by defining Prelude to return a result of a new sort ThreePairs, which is a triple of Pair values, where sort Pair is itself defined as a pair of blocks. Other examples can be found in the binary adders and multipliers of REC17 ; for instance, the 8-bit adder returns a result of sort OctetSum that is a pair gathering a sum (of sort Octet) and a carry (of sort Sum).
The drawbacks of this first option are numerous: (i) new types have to be introduced, potentially one type per defined function in the worst case; (ii) each of these types introduces in turn a constructor and, often, equality and projection functions as well; (iii) the specification gets obscured by tupling/detupling operations, with the aggravating circumstance that detupling can be performed in different ways (pattern matching, destructuring “let”, or projection functions), which makes it difficult to follow the flow of a particular variable embedded in a tuple of values; (iv) tupling complicates the efforts of compilers and garbage collectors to allocate memory efficiently.
The second option is to split a function returning several results into several separate functions. For instance, REC17 has split Prelude into three operations: preludeXY, which computes the pair (X, Y), preludeVW, which computes the pair (V, W), and preludeST, which computes the pair (S, T). This transformation, applied to Prelude and to the main-loop functions, enabled the sorts TwoPairs and ThreePairs introduced in LOTOS91 to be entirely removed from REC17.
The drawbacks of this second option are twofold: (i) splitting a function with multiple results might be difficult if the calculations for these results are tightly intertwined; this was not the case with the six Prelude results, each of which does not depend on the five other ones (this was pointed out as a cryptographic weakness of the MAA in [34, Sect. 6]); (ii) splitting may require duplicating identical calculations, and thus create inefficiencies that, in turn, may require the introduction of auxiliary functions to be avoided.
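The two options above can be contrasted on a toy example. The following Python sketch is only an illustration under stated assumptions: the names `ThreeResults`, `shared_step`, `derive_all`, `derive_x`, `derive_v`, and `derive_s` are hypothetical, and the arithmetic is a stand-in that does not reproduce the actual Prelude computation. Only the shape of the interfaces matters here.

```python
from typing import NamedTuple

MASK32 = 0xFFFFFFFF  # blocks are assumed to be unsigned 32-bit words


class ThreeResults(NamedTuple):
    """Compound result type required by the first option (hypothetical)."""
    x: int
    v: int
    s: int


def shared_step(j: int, k: int) -> int:
    """Hypothetical intermediate calculation needed by every result."""
    return (j * k) & MASK32


# First option: a single function returning one compound value.
def derive_all(j: int, k: int) -> ThreeResults:
    t = shared_step(j, k)  # computed once, then tupled
    return ThreeResults(x=t ^ j, v=t ^ k, s=(t + 1) & MASK32)


# Second option: one function per result. No compound type is needed,
# but each function repeats the shared calculation.
def derive_x(j: int, k: int) -> int:
    return shared_step(j, k) ^ j


def derive_v(j: int, k: int) -> int:
    return shared_step(j, k) ^ k


def derive_s(j: int, k: int) -> int:
    return (shared_step(j, k) + 1) & MASK32


if __name__ == "__main__":
    j, k = 0xE6A12F07, 0x9D15C437
    x, v, s = derive_all(j, k)  # detupling by destructuring
    assert (x, v, s) == (derive_x(j, k), derive_v(j, k), derive_s(j, k))
```

In this sketch, the first option pays for the extra type and the detupling at the call site, whereas the second option evaluates `shared_step` three times, which is the kind of duplication mentioned in drawback (ii).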
5 Validation of MAA Specifications
The two most recent specifications of the MAA have been validated as follows:

LNT17: The specification was validated by the LNT2LOTOS translator, which implements the syntactic checks and (part of) the semantic checks stated in the definition of LNT [2] and generates LOTOS code, which is then validated by the CÆSAR.ADT compiler, thereby performing the remaining semantic checks of LNT. The C code generated by the CÆSAR.ADT compiler passed the test vectors specified in [18, Annex A], in [17, Annex E.3], in [17, Annexes E.3.4 and E.4], and the supplementary test vectors based on the MakeMessage function.
Thanks to these checks, various mistakes were discovered in prior (informal and formal) specifications of the MAA: (i) Annex A corrects the test vectors given in [17, Annex E]; (ii) Annex B corrects the test vectors given for function PAT in [18, Annex A] and [5]; (iii) an error was found in the main C program, which computed an incorrect MAC value, as the list of blocks storing the message was built in reverse order; (iv) another error was found in the external implementation in C of the function HIGHMUL, which computes the highest 32 bits of the 64-bit product of two blocks and is imported by the LOTOS and LNT specifications; this illustrates the risks arising when formal and non-formal codes are mixed.
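The intended behaviour of HIGHMUL can be stated in a few lines. The following Python sketch is not the C code imported by the LOTOS and LNT specifications; the function names `highmul` and `lowmul` are chosen here for illustration, and blocks are assumed to be unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF  # largest value of a 32-bit block


def highmul(x: int, y: int) -> int:
    """Highest 32 bits of the 64-bit product of two 32-bit blocks."""
    assert 0 <= x <= MASK32 and 0 <= y <= MASK32
    return (x * y) >> 32


def lowmul(x: int, y: int) -> int:
    """Lowest 32 bits of the same product (hypothetical companion)."""
    assert 0 <= x <= MASK32 and 0 <= y <= MASK32
    return (x * y) & MASK32


if __name__ == "__main__":
    x, y = 0xE6A12F07, 0x9D15C437
    # The two halves together must rebuild the full 64-bit product.
    assert (highmul(x, y) << 32) | lowmul(x, y) == x * y
```

Python integers are unbounded, so the product is exact; in a language with fixed-width integers, both operands would have to be widened to 64 bits before the multiplication.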
6 Conclusion
Twenty-five years later, we revisited the Message Authenticator Algorithm (MAA), which used to be a pioneering case study for cryptography in the 80s and for formal methods in the early 90s. The three MAA specifications VDM90, Z91, and LOTOS91 developed at NPL in 1990–1991 were clearly leading-edge, as can be seen from the adoption of the VDM specification as part of the ISO international standard 8731-2 in 1992. However, they also faced limitations: these were mostly pen-and-pencil formal methods that lacked automated validation tools and that required implementations to be developed manually, thus raising the difficult question of the compatibility between the formal specification and the handwritten implementation code.
A different path has been followed at INRIA Grenoble since the early 90s, with an emphasis on executable formal methods, from which implementations can be generated automatically. Five specifications have been successively developed: LOTOS92, LNT16, REC17, LOTOS17, and LNT17. Retrospectively, heading towards executable formal methods proved to be a successful bet:

It turns out that executable specifications are not necessarily longer than non-executable ones: LNT17 and LOTOS17 (345 and 423 lines, respectively, including the external C code fragments) are halfway between the non-executable specifications VDM90 (275 lines) and Z91 (608 lines). Also, LNT17 is only 60% larger than the direct implementation in C given in [5].

One might argue that the LOTOS and LNT specifications are not entirely formal, as they import a few C types and functions to implement blocks and arithmetic operations on blocks. We see this as a strength, rather than a weakness, of our approach. Moreover, nothing prevents such external types and functions from being defined in LOTOS or in LNT instead, as was the case with the REC17 specification, which was then automatically translated to self-contained, fully-formal LOTOS and LNT specifications that were successfully compiled and executed.

The insight gained by comparing the eight formal specifications of the MAA confirms that LNT is a formal method of choice for modelling complex algorithms and data structures. Compared to other formalisms, LNT offers an imperative specification style (based on mutable variables and assignments) that proved to be simpler to write, easier to read, more concise, and closer to the MAA description in natural language [5], from which specifications based on term rewrite systems and abstract data types significantly depart due to picky technical restrictions in these latter formalisms. LNT also favors a more disciplined specification style that, we believe, is of higher quality because of the numerous static-analysis checks (e.g., unused variables, useless assignments, etc.) performed by the LNT2LOTOS translator; such strict controls are, to the best of our knowledge, absent from most other specification languages.

The application of executable formal methods to the MAA case study was fruitful in several respects: (i) it detected errors in the reference test vectors given in ISO standards 8730 and 8731-2; (ii) the LOTOS specification of the MAA, due to its size and complexity, was helpful in improving early versions of the CÆSAR.ADT compiler; (iii) similarly, the LNT specification of the MAA revealed in the LNT2LOTOS translator a few defects and performance issues, which have been dealt with in 2016 and 2017.

Moreover, executable formal methods benefit from significant progress in their compiling techniques. In 1990, a handwritten implementation of the MAA in Miranda took 60 seconds to process an 84-block message and 480 seconds to process a 588-block message [26, page 37]. Today, the implementations automatically generated from the LNT and LOTOS specifications of the MAA take 0.65 and 0.37 seconds, respectively, to process a one-million-block message (the C code generated from LNT and LOTOS by the CADP translators was compiled using “gcc -O3” and ran on a Dell Latitude E6530 laptop). As it appears, “formal” and “executable” are no longer mutually exclusive qualities.
Acknowledgements
We are grateful to Philippe Turlier who, in 1992, helped turn the non-executable LOTOS specification of Harold B. Munster into an executable one, to Wendelin Serwe, who, in 2016, produced the first LNT specification of the MAA, and to Frédéric Lang, who, in 2016–2017, improved the LNT2LOTOS translator to address the issues pointed out. Acknowledgements are also due to Keith Lockstone for his advice and his web site (http://www.cix.co.uk/~klockstone), which gives useful information about the MAA, and to Sharon Wilson, librarian of the National Physical Laboratory, who provided us with valuable early NPL reports that cannot be fetched from the web.
References
 [1]
 [2] David Champelovier, Xavier Clerc, Hubert Garavel, Yves Guerte, Christine McKinty, Vincent Powazny, Frédéric Lang, Wendelin Serwe & Gideon Smeding (2017): Reference Manual of the LNT to LOTOS Translator (Version 6.7). Available at http://cadp.inria.fr/publications/ChampelovierClercGaraveletal10.html. INRIA/VASY and INRIA/CONVECS, 130 pages.
 [3] Donald W. Davies (1985): A Message Authenticator Algorithm Suitable for a Mainframe Computer. In G. R. Blakley & David Chaum, editors: Advances in Cryptology – Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques (CRYPTO’84), Santa Barbara, CA, USA, Lecture Notes in Computer Science 196, Springer, pp. 393–400, doi:http://dx.doi.org/10.1007/3-540-39568-7_30.
 [4] Donald W. Davies & David O. Clayden (1983): A Message Authenticator Algorithm Suitable for a Mainframe Computer. NPL Report DITC 17/83, National Physical Laboratory, Teddington, Middlesex, UK.
 [5] Donald W. Davies & David O. Clayden (1988): The Message Authenticator Algorithm (MAA) and its Implementation. NPL Report DITC 109/88, National Physical Laboratory, Teddington, Middlesex, UK. Available at http://www.cix.co.uk/~klockstone/maa.pdf.
 [6] Francisco Durán, Manuel Roldán, Jean-Christophe Bach, Emilie Balland, Mark van den Brand, James R. Cordy, Steven Eker, Luc Engelen, Maartje de Jonge, Karl Trygve Kalleberg, Lennart C. L. Kats, Pierre-Etienne Moreau & Eelco Visser (2010): The Third Rewrite Engines Competition. In Peter Csaba Ölveczky, editor: Proceedings of the 8th International Workshop on Rewriting Logic and Its Applications (WRLA’10), Paphos, Cyprus, Lecture Notes in Computer Science 6381, Springer, pp. 243–261, doi:http://dx.doi.org/10.1007/978-3-642-16310-4_16.
 [7] Francisco Durán, Manuel Roldán, Emilie Balland, Mark van den Brand, Steven Eker, Karl Trygve Kalleberg, Lennart C. L. Kats, Pierre-Etienne Moreau, Ruslan Schevchenko & Eelco Visser (2009): The Second Rewrite Engines Competition. Electronic Notes in Theoretical Computer Science 238(3), pp. 281–291, doi:http://dx.doi.org/10.1016/j.entcs.2009.05.025.
 [8] Hartmut Ehrig & Bernd Mahr (1985): Fundamentals of Algebraic Specification 1 – Equations and Initial Semantics. EATCS Monographs on Theoretical Computer Science 6, Springer, doi:http://dx.doi.org/10.1007/978-3-642-69962-7.
 [9] Hubert Garavel (1989): Compilation of LOTOS Abstract Data Types. In Son T. Vuong, editor: Proceedings of the 2nd International Conference on Formal Description Techniques FORTE’89 (Vancouver B.C., Canada), North-Holland, pp. 147–162. Available at http://cadp.inria.fr/publications/Garavel89c.html.
 [10] Hubert Garavel (2015): Revisiting Sequential Composition in Process Calculi. Journal of Logical and Algebraic Methods in Programming 84(6), pp. 742–762, doi:http://dx.doi.org/10.1016/j.jlamp.2015.08.001.
 [11] Hubert Garavel, Frédéric Lang, Radu Mateescu & Wendelin Serwe (2013): CADP 2011: A Toolbox for the Construction and Analysis of Distributed Processes. Springer International Journal on Software Tools for Technology Transfer (STTT) 15(2), pp. 89–107, doi:http://dx.doi.org/10.1007/s10009-012-0244-z. Available at http://cadp.inria.fr/publications/GaravelLangMateescuSerwe13.html.
 [12] Hubert Garavel, Frédéric Lang & Wendelin Serwe (2017): From LOTOS to LNT. In Joost-Pieter Katoen, Rom Langerak & Arend Rensink, editors: ModelEd, TestEd, TrustEd – Essays Dedicated to Ed Brinksma on the Occasion of His 60th Birthday, Lecture Notes in Computer Science 10500, Springer, pp. 3–26, doi:http://dx.doi.org/10.1007/978-3-319-68270-9_1.
 [13] Hubert Garavel & Lina Marsso (2017): A Large Term Rewrite System Modelling a Pioneering Cryptographic Algorithm. In Holger Hermanns & Peter Höfner, editors: Proceedings of the 2nd Workshop on Models for Formal Analysis of Real Systems (MARS’17), Uppsala, Sweden, Electronic Proceedings in Theoretical Computer Science 244, pp. 129–183, doi:http://dx.doi.org/10.4204/EPTCS.244.6.
 [14] Hubert Garavel & Philippe Turlier (1993): CÆSAR.ADT : un compilateur pour les types abstraits algébriques du langage LOTOS. In Rachida Dssouli & Gregor v. Bochmann, editors: Actes du Colloque Francophone pour l’Ingénierie des Protocoles (CFIP’93), Montréal, Canada, Hermès, Paris, pp. 325–339. Available at http://cadp.inria.fr/publications/GaravelTurlier93.html.
 [15] ISO (1986): Requirements for Message Authentication (Wholesale). International Standard 8730, International Organization for Standardization – Banking, Geneva.
 [16] ISO (1987): Approved Algorithms for Message Authentication – Part 2: Message Authenticator Algorithm (MAA). International Standard 8731-2, International Organization for Standardization – Banking, Geneva.
 [17] ISO (1990): Requirements for Message Authentication (Wholesale). International Standard 8730, International Organization for Standardization – Banking, Geneva.
 [18] ISO (1992): Approved Algorithms for Message Authentication – Part 2: Message Authenticator Algorithm. International Standard 8731-2, International Organization for Standardization – Banking, Geneva.
 [19] ISO (1999): Requirements for Message Authentication (Wholesale). Technical Corrigendum 1 8730, International Organization for Standardization – Banking, Geneva.
 [20] ISO/IEC (1989): LOTOS – A Formal Description Technique Based on the Temporal Ordering of Observational Behaviour. International Standard 8807, International Organization for Standardization – Information Processing Systems – Open Systems Interconnection, Geneva.
 [21] M. K. F. Lai (1991): A Formal Interpretation of the MAA Standard in Z. NPL Report DITC 184/91, National Physical Laboratory, Teddington, Middlesex, UK.
 [22] R. P. Lampard (1991): An Implementation of MAA from a VDM Specification. NPL Technical Memorandum DITC 50/91, National Physical Laboratory, Teddington, Middlesex, UK.
 [23] Jan de Meer, Rudolf Roth & Son Vuong (1992): Introduction to Algebraic Specifications Based on the Language ACT ONE. Computer Networks and ISDN Systems 23(5), pp. 363–392, doi:http://dx.doi.org/10.1016/0169-7552(92)90013-G.
 [24] Alfred Menezes, Paul C. van Oorschot & Scott A. Vanstone (1996): Handbook of Applied Cryptography. CRC Press, doi:http://dx.doi.org/10.1201/9781439821916. Available at http://cacr.uwaterloo.ca/hac.
 [25] Harold B. Munster (1991): LOTOS Specification of the MAA Standard, with an Evaluation of LOTOS. NPL Report DITC 191/91, National Physical Laboratory, Teddington, Middlesex, UK. Available at ftp://ftp.inrialpes.fr/pub/vasy/publications/others/Munster91a.pdf.
 [26] Graeme I. Parkin & G. O’Neill (1990): Specification of the MAA Standard in VDM. NPL Report DITC 160/90, National Physical Laboratory, Teddington, Middlesex, UK.
 [27] Graeme I. Parkin & G. O’Neill (1991): Specification of the MAA Standard in VDM. In Søren Prehn & W. J. Toetenel, editors: Formal Software Development – Proceedings (Volume 1) of the 4th International Symposium of VDM Europe (VDM’91), Noordwijkerhout, The Netherlands, Lecture Notes in Computer Science 551, Springer, pp. 526–544, doi:http://dx.doi.org/10.1007/3-540-54834-3_31.
 [28] Bart Preneel (1997): Cryptanalysis of Message Authentication Codes. In Eiji Okamoto, George I. Davida & Masahiro Mambo, editors: Proceedings of the 1st International Workshop on Information Security (ISW’97), Tatsunokuchi, Japan, Lecture Notes in Computer Science 1396, Springer, pp. 55–65, doi:http://dx.doi.org/10.1007/BFb0030408. Available at http://www.cosic.esat.kuleuven.be/publications/article61.pdf.
 [29] Bart Preneel (2011): MAA. In Henk C. A. van Tilborg & Sushil Jajodia, editors: Encyclopedia of Cryptography and Security (2nd Edition), Springer, pp. 741–742, doi:http://dx.doi.org/10.1007/978-1-4419-5906-5_591.
 [30] Bart Preneel & Paul C. van Oorschot (1995): MDx-MAC and Building Fast MACs from Hash Functions. In Don Coppersmith, editor: Advances in Cryptology – Proceedings of 15th Annual International Cryptology Conference (CRYPTO’95), Santa Barbara, CA, USA, Lecture Notes in Computer Science 963, Springer, pp. 1–14, doi:http://dx.doi.org/10.1007/3-540-44750-4_1. Available at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.490.8595&rep=rep1&type=pdf.
 [31] Bart Preneel & Paul C. van Oorschot (1996): On the Security of Two MAC Algorithms. In Ueli M. Maurer, editor: Advances in Cryptology – Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT’96), Saragossa, Spain, Lecture Notes in Computer Science 1070, Springer, pp. 19–32, doi:http://dx.doi.org/10.1007/3-540-68339-9_3.
 [32] Bart Preneel & Paul C. van Oorschot (1999): On the Security of Iterated Message Authentication Codes. IEEE Transactions on Information Theory 45(1), pp. 188–199, doi:http://dx.doi.org/10.1109/18.746787.
 [33] Bart Preneel, Vincent Rijmen & Paul C. van Oorschot (1997): Security Analysis of the Message Authenticator Algorithm (MAA). European Transactions on Telecommunications 8(5), pp. 455–470, doi:http://dx.doi.org/10.1002/ett.4460080504.
 [34] Vincent Rijmen, Bart Preneel & Erik De Win (1996): Key Recovery and Collision Clusters for MAA. In: Proceedings of the 1st International Conference on Security in Communication Networks (SCN’96). Available at https://www.cosic.esat.kuleuven.be/publications/article437.pdf.
Appendix A Errata Concerning Annex E of the ISO 8730:1990 Standard
After reading and checking carefully the test vectors given in [17, Annex E], we discovered a number of errors (we used the French version of that standard, which we acquired from AFNOR, but have no reason to believe that the same errors are absent from other translations of this standard). Here is the list of errors found and their corrections:

In Annex E.2, some characters of the text message differ from the corresponding ASCII code given (in hexadecimal) below in Annex E.3.2. Precisely, the string "BE CAREFUL" should read "BE\n\n\ \ \ Careful", where "\n" and "\ " respectively denote linefeed and white space. The corresponding hexadecimal values are indeed 42 45 0A 0A 20 20 20 43 61 72 65 66 75 6C.
Annex E.3.2 and Annex E.3.4 state that this text message has 86 blocks. Actually, it has 84 blocks only. This is confirmed by the table of hexadecimal values in Annex E.3.2 (42 lines × 2 blocks per line give 84 blocks) and by the iterations listed in Annex E.3.4, in which the number of message blocks (i.e., variable N) ranges between 1 and 84.

Annex E.4 states that the long message is obtained by repeating six times the message of 86 blocks, leading to a message length of 516 blocks. Actually, it is obtained by repeating seven times the message of 84 blocks, leading to a message length of 588 blocks. This can be seen from the iterations listed in Annex E.4 where variable N ranges between 1 and 588, and by the fact that 588 = 7 × 84. Moreover, computing the MAA result on the 588-block long message with the same key J = E6 A1 2F 07 and K = 9D 15 C4 37 as in Annex E.3.3 indeed gives the expected MAC value C6 E3 D0 00.
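The arithmetic and the ASCII decoding behind these corrections can be replayed mechanically. The following Python sketch only re-checks the figures quoted in this annex; it does not compute any MAC value, and all variable names are illustrative.

```python
# Hexadecimal values quoted above for the corrected fragment of the message.
HEX_FRAGMENT = "42 45 0A 0A 20 20 20 43 61 72 65 66 75 6C"

# bytes.fromhex ignores the blanks between byte values.
fragment = bytes.fromhex(HEX_FRAGMENT).decode("ascii")

# Block counts: 42 table lines, 2 blocks per line, message repeated 7 times.
LINES_IN_TABLE = 42
BLOCKS_PER_LINE = 2
short_message_blocks = LINES_IN_TABLE * BLOCKS_PER_LINE
long_message_blocks = 7 * short_message_blocks

# Figures stated in the standard, for comparison.
stated_long_message_blocks = 6 * 86

if __name__ == "__main__":
    print(repr(fragment))
    print(short_message_blocks, long_message_blocks, stated_long_message_blocks)
```

The decoded fragment contains two linefeeds and three spaces between "BE" and "Careful", and the block counts are 84 and 588, against the 516 blocks obtained from the figures stated in Annex E.4.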
Appendix B Errata Concerning Annex A of the ISO 8731-2:1992 Standard
After checking carefully all the test vectors contained in the original NPL report defining the MAA [5] and in the 1992 version of the MAA standard [18], we believe that there are mistakes in the test vectors given for function PAT (again, we used the French version of this standard, but we believe that this plays no role, as the same mistakes were already present in the 1988 NPL report).
More precisely, the three last lines of Table 3 [5, page 15] (identically reproduced in Table A.3 of [18, Sect. A.4]) are written as follows:
{X0,Y0}   0103 0703 1D3B 7760   PAT{X0,Y0}   EE
{V0,W}    0103 050B 1706 5DBB   PAT{V0,W}    BB
{S,T}     0103 0705 8039 7302   PAT{S,T}     E6
Actually, the inputs of function PAT should not be {X0,Y0}, {V0,W}, {S,T} but rather {H4,H5}, {H6,H7}, {H8,H9}, the values of H4, …, H9 being those listed above in Table 3. Notice that the confusion was probably caused by the following algebraic identities:
{X0,Y0} = BYT (H4, H5)
{V0,W} = BYT (H6, H7)
{S,T} = BYT (H8, H9)
If one gives {X0,Y0}, {V0,W}, {S,T} as inputs to PAT, then the three results of PAT are equal to 00 and thus cannot be equal to EE, BB, E6, respectively.
But if one gives {H4,H5}, {H6,H7}, {H8,H9} as inputs to PAT, then the results of PAT are the expected values EE, BB, E6.
Thus, we believe that the three last lines of Table 3 should be modified as follows:
{H4,H5}   0000 0003 0000 0060   PAT{H4,H5}   EE
{H6,H7}   0003 0000 0006 0000   PAT{H6,H7}   BB
{H8,H9}   0000 0005 8000 0002   PAT{H8,H9}   E6
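The values discussed in this annex can be re-derived with a short program. The following Python sketch encodes BYT and PAT according to our reading of their definition in [5], which is an assumption of the sketch: the eight bytes of the pair are scanned from left to right, a counter P (initially zero) is doubled at each byte, and each byte equal to 00 or FF increments P and is replaced by P or by FF minus P, respectively; BYT returns the rewritten bytes and PAT the final value of P. The function name `byt_pat` is illustrative, and the code is not taken from the LOTOS or LNT specifications.

```python
def byt_pat(left: int, right: int):
    """Return (BYT result as 8 bytes, PAT value) for a pair of 32-bit blocks.

    Assumed definition: see the paragraph above this code block.
    """
    data = left.to_bytes(4, "big") + right.to_bytes(4, "big")
    rewritten = bytearray()
    p = 0
    for byte in data:
        p = 2 * p                       # P is doubled at every byte
        if byte == 0x00:
            p += 1
            rewritten.append(p)         # 00 is replaced by P
        elif byte == 0xFF:
            p += 1
            rewritten.append(0xFF - p)  # FF is replaced by FF - P
        else:
            rewritten.append(byte)      # other bytes are kept unchanged
    return bytes(rewritten), p


if __name__ == "__main__":
    pairs = [(0x00000003, 0x00000060),   # {H4,H5}
             (0x00030000, 0x00060000),   # {H6,H7}
             (0x00000005, 0x80000002)]   # {H8,H9}
    for left, right in pairs:
        byt, pat = byt_pat(left, right)
        print(byt.hex().upper(), format(pat, "02X"))
```

Under this reading, applying the function to {H4,H5}, {H6,H7}, {H8,H9} yields the byte strings listed for {X0,Y0}, {V0,W}, {S,T} together with the PAT values EE, BB, E6, whereas applying it to {X0,Y0}, {V0,W}, {S,T} yields a PAT value of 00, since these pairs contain no byte equal to 00 or FF.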
Appendix C Formal Specification of the MAA in LOTOS
This annex presents the specification LOTOS17 of the MAA in LOTOS. This specification uses several predefined libraries of LOTOS, namely: the libraries for Booleans and natural numbers, which we do not reproduce here, and the libraries for bits, octets, and octet values, of which we only display excerpts needed for understanding the MAA specification.
C.1 The BIT library
This predefined LOTOS library defines the Bit type with its related operations. Only a simplified version of this library is presented here.
C.2 The OCTET library
This predefined LOTOS library defines the Octet type (i.e., an 8-bit word) with its related operations. Only an excerpt of this library is presented here.
C.3 The OCTETVALUES library
This predefined LOTOS library defines 256 constant functions x00, …, xFF that provide shorthand notations for octet values. Only an excerpt of this library is presented here.
C.4 The MAA specification
Appendix D Formal Specification of the MAA in LNT
This annex presents the specification LNT17 of the MAA in LNT. This specification uses several predefined libraries of LNT, namely: the libraries for Booleans and natural numbers, which we do not reproduce here, and the libraries for bits, octets, and octet values, of which we only display excerpts needed for understanding the MAA specification. It also defines two new libraries for blocks and block values, which we display hereafter.
D.1 The BIT library
This predefined LNT library defines the Bit type with its related operations. Only an excerpt of this library is presented here.
D.2 The OCTET library
This predefined LNT library defines the Octet type (i.e., an 8-bit word) with its related operations. Only an excerpt of this library is presented here.
D.3 The OCTETVALUES library
This predefined LNT library defines 256 constant functions x00, …, xFF that provide shorthand notations for octet values. Only an excerpt of this library is presented here.
D.4 The BLOCK library
This library defines the Block type (i.e., a 32-bit word) with its logical and arithmetical operations, the latter being implemented externally as a set of functions written in the C language.
D.5 The BLOCKVALUES library
This library defines constant functions x00000000, …, xFFFFFFFF that provide shorthand notations for block values. Only the useful constants (207 among 2^32) are defined. An excerpt of this library is presented here.