Log In Sign Up

Neural Networks in Imandra: Matrix Representation as a Verification Choice

The demand for formal verification tools for neural networks has increased as neural networks have been deployed in a growing number of safety-critical applications. Matrices are a data structure essential to formalising neural networks. Functional programming languages encourage diverse approaches to matrix definitions. This feature has already been successfully exploited in different applications. The question we ask is whether, and how, these ideas can be applied in neural network verification. A functional programming language Imandra combines the syntax of a functional programming language and the power of an automated theorem prover. Using these two key features of Imandra, we explore how different implementations of matrices can influence automation of neural network verification.


page 1

page 2

page 3

page 4


CheckINN: Wide Range Neural Network Verification in Imandra (Extended)

Neural networks are increasingly relied upon as components of complex sa...

Neural Network Verification for the Masses (of AI graduates)

Rapid development of AI applications has stimulated demand for, and has ...

PubSub implementation in Haskell with formal verification in Coq

In the cloud, the technology is used on-demand without the need to insta...

Benchmarking the Status of Default Pseudorandom Number Generators in Common Programming Languages

The ever-increasing need for random numbers is clear in many areas of co...

1 Motivation

Neural network (NN) verification was pioneered by the SMT-solving [12, 11] and an abstract interpretation [20, 8, 2] communities. However, recently claims have been made that functional programming, too, can be valuable in this domain. There is a library [16] formalising small rational-valued neural networks in Coq. A more sizeable formalisation called MLCert [3]

imports neural networks from Python, treats floating point numbers as bit vectors, and proves properties describing the generalisation bounds for the neural networks. An

formalisation [14] uses reals and refinement types for proving the robustness of networks trained in Python.

There are several options for defining neural networks in functional programming, ranging from defining neurons as record types 

[16] to treating them as functions with refinement types [14]. But we claim that two general considerations should be key to any NN formalisation choice. Firstly, we must define neural networks as executable functions, because we want to take advantage of executing them in the functional language of choice. Secondly, a generic approach to layer definitions is needed, particularly when we implement complex neural network architectures, such as convolutional layers.

These two essential requirements dictate that neural networks are represented as matrices, and that a programmer makes choices about matrix formalisation. This article will explain these choices, and the consequences they imply, from the verification point of view. We use Imandra [17] to make these points, because Imandra is a functional programming language with tight integration of automated proving.

Imandra has been successful as a user-friendly and scalable tool in the FinTech domain [18]. The secret of its success lies in a combination of many of the best features of functional languages and interactive and automated theorem provers. Imandra’s logic is based on a pure, higher-order subset of OCaml, and functions written in Imandra are at the same time valid OCaml code that can be executed, or “simulated”. Imandra’s mode of interactive proof development is based on a typed, higher-order lifting of the Boyer-Moore waterfall [4] for automated induction, tightly integrated with novel techniques for SMT modulo recursive functions.

This paper builds upon the recent development of a CheckINN, a NN verification library in Imandra [6], but discusses specifically the matrix representation choices and their consequences.

2 Matrices in Neural Network Formalisation

We will illustrate the functional approach to neural network formalisation and will introduce the syntax of the Imandra programming language [17] by means of an example. When we say we want to formalise neural networks as functions, essentially, we aim to be able to define a NN using just a line of code:

  let cnn input =
      layer_0 input >>= layer_1 >>= layer_2 >>= layer_3

where each layer_i is defined in a modular fashion.

To see that a functional approach to neural networks does not necessarily imply generic nature of the code, let us consider an example. A perceptron, also known as a linear classifier

, classifies a given input vector

into one of two classes or by computing a linear combination of the input vector with a vector of synaptic weights , in which is often called an intercept or bias: . If the result is positive, it classifies the input as and if negative as

. It effectively divides the input space along a hyperplane defined by


In most classification problems, classes are not linearly separated. To handle such problems, we can apply a non-linear function called an activation function to the linear combination of weights and inputs. The resulting definition of a perceptron is:


Let us start with a naive prototype of perceptron in Imandra. The Iris data set is a “Hello World” example in data mining; it represents 3 kinds of Iris flowers using 4 selected features. In Imandra, inputs can be represented as a data type:

type iris_input = {
  sepal_len: real;
  sepal_width: real;
  petal_len: real;
  petal_width: real;}

And we define a perceptron as a function:

let layer_0 (w0, w1, w2, w3, w4) (x1, x2, x3, x4) =
  relu (w0 +. w1 *. x1 +. w2 *. x2 +. w3 *. x3 +. w4 *. x4)

where *. and +. are times and plus defined on reals. Note the use of the relu activation function, which returns for all negative inputs and acts as the identity function otherwise.

Already in this simple example, one perceptron is not sufficient, as we must map its output to three classes. We use the usual machine learning literature trick and define a further layer of

neurons, each representing one class. Each of these neurons is itself a perceptron, with one incoming weight and one bias. This gives us:

let layer_1 (w1, b1, w2, b2, w3, b3) f1 =
  let o1 = w1 *. f1 +. b1 in
  let o2 = w2 *. f1 +. b2 in
  let o3 = w3 *. f1 +. b3 in
  (o1, o2, o3)
let process_iris_output (c0, c1, c2) =
  if (c0 >=. c1) && (c0 >=. c2) then "setosa"
  else if (c1 >=. c0) && (c1 >=. c2) then "versicolor"
  else "virginica"

The second function maps the output of the three neurons to the three specified classes. This post-processing stage often takes the form of an argmax or softmax function, which we omit.

And thus the resulting function that defines our neural network model is:

let model input = process_iris_input input
        |> layer_0 weights_0 |> layer_1 weights_1 |>

Although our naive formalisation has some features that we desired from the start, i.e. it defines a neural network as a composition of functions, it is too inflexible to work with arbitrary compositions of layers. In neural networks with hundreds of weights in every layer this manual approach will quickly become infeasible (as well as error-prone). So, let us generalise this attempt from the level of individual neurons to the level of matrix operations.

The composition of many perceptrons is often called a multi-layer perceptron (MLP). An MLP consists of an input vector (also called input layer in the literature), multiple hidden layers and an output layer, each layer made of perceptrons with weighted connections to the previous layers’ outputs. The weight and biases of all the neurons in a layer can be represented by two matrices denoted by and . By adapting equation 1 to this matrix notation, a layer’s output can be defined as:


where the operator denotes the dot product between and each row of , is the layer’s input and is the activation function shared by all nodes in a layer. As the dot product multiplies pointwise all inputs by all weights, such layers are often called fully connected.

By denoting — the activation function, weights and biases of the th layer respectively, an MLP with layers is traditionally defined as:


At this stage, we are firmly committed to using matrices and matrix operations. And we have two key choices:

  1. to represent matrices as lists of lists (and take advantage of the inductive data type List),

  2. define matrices as functions from indices to matrix elements,

  3. or take advantage of record types, and define matrices as records with maps.

The first choice was taken in [10] (in the context of dependent types in Coq), in  [14] (in the context of refinement types of F) and in [9] (for sparse matrix encodings in Haskell). The difference between the first and second approaches was discussed in [22] (in Agda, but with no neural network application in mind). The third method was taken in  [16] using Coq (records were used there to encode individual neurons).

In the next three sections, we will systematise these three approaches using the same formalism and the same language, and trace the influence of these choices on neural network verification.

3 Matrices as Lists of Lists

We start with re-using Imandra’s List library. Lists are defined as inductive data structures:

type a list =
| []
| (::) of a * a list

Imandra holds a comprehensive library of list operations covering a large part of OCaml’s standard List libary, which we re-use in the definitions below. We start with defining vectors as lists, and matrices as lists of vectors.

type a vector = a list
type a matrix = a vector list

It is possible to extend this formalisation by using dependent [10] or refinement [14] types to check the matrix size, e.g. when performing matrix multiplication. But in Imandra this facility is not directly available, and we will need to use error-tracking (implemented via the monadic Result type) to facilitate checking of the matrix sizes.

As there is no built-in type available for matrices equivalent to List for vectors, the Matrix module implements a number of functions for basic operations needed throughout the implementation. For instance, map2 takes as inputs a function and two matrices and of the same dimensions and outputs a new matrix where each element is the result of :

  let rec map2 (f:’a -> b -> c) (x:’a matrix) (y:’b matrix) =
    match x with
          | [] -> begin match y with
                      | [] -> Ok []
                        | y::ys  -> Error "map2:invalidlength." end
          | x::xs -> begin match y with
                         | [] -> Error "map2:invalidlength."
                           | y::ys -> let hd = map2 f x y in
                                      let tl = map2 f xs ys in
                                      lift2 cons hd tl end

This implementation allows us to define other useful functions concisely. For instance, the dot-product of two matrices is defined as:

let dot_product (a:real matrix) (b:real matrix): (’a, real matrix) result = sum (map2 ( *. ) a b)

Note that since the output of the function map2 is wrapped in the monadic result type, we must use to apply sum. Similarly, we use standard monadic operations for the result monad such as bind or lift.

A fully connected layer is then defined as a function fc that takes as parameters an activation function, a 2-dimensional matrix of layer’s weights and an input vector:

let activation f w i = (* activation func., weights, input *)
 let linear_combination m1 m2 = if (length m1) <> (length m2)
     then Error "invaliddimensions"
     else map sum (Vec.map2 ( *. ) m1 m2) in
 let i = 1.::i in (* prepend 1. for bias *)
 let z = linear_combination w i in
 map f z
let rec fc f (weights:real matrix) (input:real vector) =
 match weights with
 | [] -> Ok []
 | w::ws -> lift2 cons (activation f w input) (fc f ws input)
Listing 1: Fully connected layer implementation

Note that each row of the weights matrix represents the weights for one of the layer’s nodes. The bias for each node is the first value of the weights vector, and is prepended to the input vector when computing the dot-product of weights and input to account for that.

It is now easy to see that our desired modular approach to composing layers works as stated. We may define the layers using the syntax: let layer_i = fc a weights, where i stands for 0,1,2,3, and a stands for any chosen activation function.

Although natural, this formalisation of layers and networks suffers from two problems. Firstly, it lacks the matrix dimension checks that were readily provided via refinement types in [14]. This is because Imandra is based on a computational fragment of HOL, and has no refinement or dependent types. To mitigate this, our library performs explicit dimension checking via a result monad, which clutters the code and adds additional computational checks. Secondly, the matrix definition via the list datatypes makes verification of neural networks very inefficient. This general effect has been already reported in [14], but it may be instructive to look into the problem from the Imandra perspective.

Robustness of neural networks [5] is best amenable to proofs by arithmetic manipulation. This explains the interest of the SMT-solving community in the topic, which started with using Z3 directly [11], and has resulted in highly efficient SMT solvers specialised on robustness proofs for neural networks [12, 13]. Imandra’s waterfall method [17] defines a default flow for the proof search, which starts with unrolling inductive definitions, simplification and rewriting. As a result, proofs of neural network robustness or proofs as in the ACAS Xu challenge [12, 13], which do not actually need induction, are not efficiently tackled using Imandra’s inductive waterfall: the proofs simply do not terminate.

There is another verification approach available in Imandra which is better suited for this type of problem: blast, a tactic for SAT-based symbolic execution modulo higher-order recursive functions. Blast is an internal custom SAT/SMT solver that can be called explicitly to discharge an Imandra verification goal. However, blast currently does not support real arithmetic. This requires us to quantize the neural networks we use (i.e. convert them to integer weights) and results in a quantised NN implementation [6]. However, even with quantisation and the use of blast

, while we succeed on many smaller benchmarks, Imandra fails to scale ‘out of the box’ to the ACAS Xu challenge, let alone larger neural networks used in computer vision.

This also does not come as a surprise: as [12] points out, general-purpose SMT solvers do not scale to NN verification challenges. This is why the algorithm reluplex was introduced in [12]

as an additional heuristic to SMT solver algorithms;

reluplex has since given rise to a domain specific solver Marabou [13]. Connecting Imandra to Marabou may be a promising future direction.

However, this method of matrix formalisation can still bring benefits. When we formulate verification properties that genuinely require induction, formalisation of matrices as lists does result in more natural, and easily automatable proofs. For example, De Maria et al. [16] formalise in Coq “neuronal archetypes” for biological neurons. Each archetype is a specialised kind of perceptron, in which additional functions are added to amplify or inhibit the perceptron’s outputs. It is out of the scope of this paper to formalise the neuronal archetypes in Imandra, but we take methodological insight from [16]. In particular, De Maria et al. show that there are natural higher-order properties that one may want to verify.

To make a direct comparison, modern neural network verifiers [12, 20] deal with verification tasks of the form “given a trained neural network , and a property on its inputs, verify that a property holds for ’s outputs”. However, the formalisation in [16] considers properties of the form “any neural network that satisfies a property , also satisfies a property .” Unsurprisingly, the former kind of properties can be resolved by simplification and arithmetic, whereas the latter kind requires induction on the structure of (as well as possibly nested induction on parameters of ).

Another distinguishing consequence of this approach is that it is orthogonal to the community competition for scaling proofs to large networks: usually the property does not restrict the size of neural networks, but rather points to their structural properties. Thus, implicitly we quantify over neural networks of any size.

To emulate a property à la de Maria et al., in [6] we defined a general network monotonicity property: any fully connected network with positive weights is monotone, in the sense that, given increasing positive inputs, its outputs will also increase. There has been some interest in monotone networks in the literature [19, 21]. Our experiments show that Imandra can prove such properties by induction on the networks’ structure almost automatically (with the help of a handful of auxiliary lemmas). And the proofs easily go through for both quantised and real-valued neural networks.111Note that in these experiments, the implementation of weight matrices as lists of lists is implicit – we redefine matrix manipulation functions that are less general but more convenient for proofs by induction.

4 Matrices as Functions

We now return to the verification challenge of ACAS Xu, which we failed to conquer with the inductive matrix representation of the last section. This time we ask whether representing matrices as functions and leveraging Imandra’s default proof automation can help.

With this in mind, we redefine matrices as functions from indices to values, which gives constant-time (recursion-free) access to matrix elements:

type arg =
  | Rows
  | Cols
  | Value of int * int
type a t = arg -> a
Listing 2: Implementation of matrices as functions from indices to values

Note the use of the arg type, which treats a matrix as a function evaluating “queries” (e.g., “how many rows does this matrix have?” or “what is the value at index ?”). This formalisation technique is used as Imandra’s logic does not allow function values inside of algebraic data types. We thus recover some functionality given by refinement types in [14].

Furthermore, we can map functions over a matrix or a pair of matrices (using map2), transpose a matrix, construct a diagonal matrix etc. without any recursion, since we work point-wise on the elements. At the same time, we remove the need for error tracking to ensure matrices are of the correct size: because our matrices are total functions, they are defined everywhere (even outside of their stated dimensions), and we can make the convention that all matrices we build are valid and sparse by construction (with default 0 outside of their dimension bounds).

The resulting function definitions are much more succinct than with lists of lists; take for instance map2:

let map2 (f: a -> b -> c) (m: a t) (m’: b t) : c t =
                | Rows -> rows m
                | Cols -> cols m
                | Value (i,j) -> f (m (Value (i,j))) (m (Value (i,j)))

This allows us to define fully connected layers:

let fc (f: a -> b) (weights: a Matrix.t) (input: a Matrix.t) =
        let open Matrix in
                | Rows         -> 1
                | Cols         -> rows weights
                | Value (0, j) ->
                        let input = add_weight_coeff input in
                        let weights_row = nth_row weights j in
                        f (dot_product weights_row input’)
                | Value _      -> 

As the biases are included in the weights matrix, add_weight_coeff prepends a column with coefficients to the input so that they are taken into account.

For full definitions of matrix operations and layers, the reader is referred to [6], but we will give some definitions here, mainly to convey the general style (and simplicity!) of the code. Working with the ACAS Xu networks [12], a script transforms the original networks into sparse functional matrix representation. For example, layer 5 of one of the networks we used is defined as follows:

let layer5 = fc relu (
  | Rows -> 50
  | Cols -> 51
  | Value (i,j) -> Map.get (i,j) layer5_map)
let layer5_map =
  Map.add (0,0) (1) @@
  Map.add (0,10) (-1) @@
  Map.add (0,29) (-1) @@

The sparsity effect is achieved by pruning the network, i.e. removing weights that have the smallest impact on the network’s performance. The weight’s magnitude is used to select those to be pruned. This method, though rudimentary, is considered a reasonable pruning technique [15]. We do this mainly in order to reduce the amount of computation Imandra needs to perform, and to make the verification problem amenable to Imandra.

With this representation, we are able to verify the properties described in  [12] on some of the pruned networks (see Table 1). This is a considerable improvement compared to the previous section, where the implementation did not allow to verify even pruned networks. It is especially impressive that it comes “for free” by simply changing the underlying matrix representations.

Several factors played a role in automating the proof. Firstly, by using maps for the large matrices, we eliminate all recursion (and large case-splits) except for matrix folds (which now come in only via the dot product), which allowed Imandra to expand the recursive matrix computations on demand. Secondly, Imandra’s native simplifier contributed to the success. It works on a DAG representation of terms and speculatively expands instances of recursive functions, only as they are (heuristically seen to be) needed. Incremental congruence closure and simplex data structures are shared across DAG nodes, and symbolic execution results are memoised. The underlying Map.t components of the functions are reasoned about using a decision procedure for the theory of arrays. Informally speaking, Imandra works lazily expanding out the linear algebra as it is needed, and eagerly sharing information over the DAG. Contrast this approach with that of Reluplex which, informally, starts with the linear algebra fully expanded, and then works to derive laziness and sharing.

Although Imandra’s simplifier-based automation above could give us results which blast could not deliver for the same network, it still did not scale to the original non-quantised (dense) ACAS Xu network. Contrast this with domain-specific verifiers such as Marabou which are able to scale (modulo potential floating point imprecision) to the full ACAS Xu. We are encouraged that the results of this section were achieved without tuning Imandra’s generic proof automation strategies, and hopeful that the development of neural-network specific tactics will help Imandra scale to such networks in the future.

CheckINN: Pruned Networks Reluplex: Full ACAS Xu Networks
Property Result Quantity Time (s) Quantity Time (s)
SAT 20 13387 0
UNSAT 0 41 394517
SAT 7 2098 35 82419
UNSAT 2 896 1 463
SAT 39 10453 35 82419
UNSAT 0 1 463
SAT 36 21533 0
UNSAT 0 32 12475
SAT 1 98 0
UNSAT 0 1 19355
SAT 1 98 0
UNSAT 0 1 180288
SAT 0 1 40102
SAT 1 109 0
UNSAT 0 1 99634
SAT 0 0
UNSAT 1 637 1 19944
Table 1: Results of experiments ran on the properties and networks from the ACAS Xu benchmark [12]. The CheckINN verifications were run with 90% of the weights pruned, on virtual machines with four 2.6 GHz Intel Ice Lake virtual processors and 16GB RAM. Timeout was set to 5 hours

5 Real-valued Matrices; Records and Arrays

It is time we turn to the question of incorporating real values into matrices. Section 3 defined matrices as lists of lists; and that definition in principle worked for both integer and real-valued matrices. However, we could not use [@@blast] to automate proofs when real values were involved; this meant we were restricted to verifying integer-valued networks. On the other hand, the matrix as function implementation extends to proofs with real valued matrices, however it is not a trivial extension. In the functional implementation, the matrix’s value must be of the same type as its dimensions (Listing 2). Thus, if the matrix elements are real-valued, then in this representation the matrix dimensions will be real-valued as well. This, it turns out, is not trivial to deal with for functions which do recursion along matrix dimensions.

To simplify the code and the proofs, three potential solutions were considered:

  • Using an algebraic data type for results of matrix queries: this introduces pattern matching in the implementation of matrix operations, which Section 

    3 taught us to avoid.

  • Define a matrix type with real-valued dimensions and values: this poses the problem of proving the function termination when using matrix dimensions in recursion termination conditions.

  • Use records to provide polymorphism and allow matrices to use integer dimensions and real values.

This section focuses on these three alternatives.

5.1 Algebraic Data Types for Real-Valued Matrices

The first alternative is to introduce an algebraic data type that allows the matrix functions to return either reals or integers.

type arg =
        | Rows
        | Cols
        | Value of int * int
        | Default
type a res =
        | Int of int
        | Val of a
type a t = arg -> a res

This allows a form of polymorphism, but it also introduces pattern matching each time we query a value from the matrix. For instance, in order to use dimensions as indices to access a matrix element we have to implement the following nth_res function:

let nth_res (m: a t) (i: b res) (j: c res): a res = match (i, j) with
        | (Int i’, Int j’) -> m (Value (i’, j’))
        | _                -> m Default

The simplicity and efficiency of the functional implementation is lost.

5.2 Real-Valued Matrix Indices

We then turn to using real numbers to encode matrix dimensions. The implementation is symmetric to the one using integers (Listing 2):

type arg =
        | Rows
        | Cols
        | Value of real * real
type a t = arg -> a

A problem arises in recursive functions where matrix dimensions are used as decrementors in stopping conditions, for instance in the fold_rec function used in the implementation of the folding operation.

let rec fold_rec f cols i j (m: a t) =
        let dec i j =
                if j <=. 0. then (i-.1.,cols) else (i,j-.1.)
        if (i <=. 0. && j <=. 0.) || (i <. 0. || j <. 0.) then (
                m (Value (i,j))
        ) else (
                let i’,j = dec i j in
                f (m (Value (i,j))) (fold_rec f cols i j m)
let fold (f : a -> b -> b) (m: a t) : b =
        let rows = m Rows -. 1. in
        let cols = m Cols -. 1. in
        fold_rec f cols rows cols m

Imandra only accepts definitions of functions for which it can prove termination. The dimensions being real numbers prevents Imandra from being able to prove termination without providing a custom measure. In order to define this measure, we need to connect the continuous world of reals with the discrete world of integers (and ultimately ordinals) for which we have induction principles. We chose to develop a floor function that allows Imandra to prove termination with reals.

To prove termination of our fold_rec function recursing along reals, we define an int_of_real : real -> int function in Imandra, using a subsidiary floor : real -> int -> int which computes an integer floor of a real by “counting up” using its integer argument. In fact, as matrices have non-negative dimensions, it suffices to only consider this conversion for non-negative reals, and we formalise only this. We then have to prove some subsidiary lemmas about the arithmetic of real-to-integer conversion, such as:

lemma floor_mono x y b =
  Real.(x <= y && x >= 0. && y >= 0.)
  ==> floor x b <= floor y b
lemma inc_by_one_bigger_conv x =
  Real.(x >= 0. ==> int_of_real (x + 1.0) > int_of_real x)

Armed with these results, we can then prove termination of fold_rec and admit it into Imandra’s logic via the ordinal pair measure below:

[@@measure Ordinal.pair
            (Ordinal.of_int (int_of_real i))
            (Ordinal.of_int (int_of_real j))]

Extending the functional matrix implementation to reals was not trivial, but it did have a real payoff. Using this representation, we were able to verify real-valued versions of the pruned ACAS Xu networks! In both cases of integer and real-valued matrices, we pruned the networks to 10% of their original size. So, we still do not scale to the full ACAS Xu challenge. However, the positive news is that the real-valued version of the proofs uses the same waterfall proof tactic of Imandra, and requires no extra effort from the programmer to complete the proof. This result is significant bearing in mind that many functional and higher-order theorem provers are known to have significant drawbacks when switching to real numbers.

From the functional programming point of view, one may claim that this approach is not elegant enough because it does not provide true polymorphism as it encodes matrix dimensions as reals. This motivates us to try the third alternative, using records with maps to achieve polymorphism.

5.3 Records

Standard OCaml records are available in Imandra, though they do not support functions as fields. This is because all records are data values which must support a computable equality relation, and in general one cannot compute equality on functions. Internally in the logic, records correspond to algebraic data types with a single constructor and the record fields to named constructor arguments. Like product types, records allow us to group together values of different types, but with convenient accessors and update syntax based on field names, rather than position. This offers the possibility of polymorphism for our matrix type.

The approach here is similar to the one in Section 4: matrices are stored as mappings between indices and values, which allows for constant-time access to the elements. However, instead of having the mapping be implemented as a function, here we implement it as a Map, i.e. an unordered collection of (key;value) pairs where each key is unique, so that this “payload” can be included as the field of a record.

type a t = {
        rows: int;
        cols: int;
        vals: ((int*int), a) Map.t;

We can then use a convenient syntax to create a record of this type. For instance, a weights matrix from one of the ACAS Xu networks can be implemented as:

let layer6_map =
        Map.add (0,10) (0.05374) @@
        Map.add (0,20) (0.05675) @@
        Map.const 0.
let layer6_matrix = {
        rows = 5;
        cols = 51;
        vals = layer6_map;

Note that the matrix dimensions (and the underlying map’s keys) are indeed encoded as integers, whereas the weights’ values are reals.

Similarly to the previous implementations, we define a number of useful matrix operations which will be used to define general neural network layer functions. For instance, the map2 function is defined thus:

let rec map2_rec (m: a t) (m’: b t) (f: a -> b -> c) (cols: int) (i: int) (j: int) (res: ((int*int), c) Map.t): ((int*int), c) Map.t =
                let dec i j =
                        if j <= 0 then (i-1, cols) else (i,j-1)
                if i <= 0 && j <= 0 then (
                ) else (
                        let (i’,j’) = dec i j in
                        let new_value = f (nth m (i’,j’)) (nth m (i’, j’)) in
                        let res = Map.add res (i’,j’) new_value in
                        map2_rec m m f cols i j res
[@@adm i,j]
let map2 (f: a -> b -> c) (m: a t) (m’: b t) : c t =
        let rows = max (m.rows) (m’.rows) in
        let cols = max (m.cols) (m’.cols) in
        let vals = map2_rec m m f cols rows cols (Map.const 0.) in
                rows = rows;
                cols = cols;
                vals = vals;

Compared to the list implementation, this implementation has the benefit of providing constant-time access to matrix elements. However, compared to the implementation of matrices as functions, it uses recursion to iterate over matrix values which results in a high number of case-splits. This in turn results in lower scalability. Compared to the previous section’s results, none of the verification tests on pruned ACAS Xu benchmarks that terminated with the functional matrix implementation terminated with the records implementation.

Moreover, we can see in the above function definition that we lose considerable conciseness and readability.

In the end, the main interest of this implementation is its offering polymorphism. In all other regards, the functional implementation seems preferable.

6 Conclusions

Functional programming languages that are tightly coupled with automated reasoning capabilities, like Imandra, offer us the possibility to verify and perform inference with neural networks, which the library CheckINN aims to do. To that aim, implementing matrices and matrix operations is important. We have shown different implementations of matrices and how each implementation influences verification in Imandra.

This study has three positive conclusions:

  • Imandra’s language is sufficiently flexible to give rise to implementations of several choices of matrix in the CheckINN library. Its proof heuristics adapt smoothly to these different implementations, with very little hints needed to figure out the appropriate proof strategy (induction, waterfall or SAT/SMT proving).

  • this flexibility bears benefits when it comes to diversifying the range of NN properties we verify: thus, matrices as lists made possible proofs of higher-order properties by induction, whereas matrices as functions were more amenable to automated proofs in SAT/SMT solving style.

  • the transition from integer-valued to real-valued NNs is possible in Imandra. This transition itself opens several choices for matrix representations. However, if the matrix representation is optimal for the task at hand, Imandra takes care of completing the proofs with reals and adapts its inductive waterfall method to the new data type automatically. This is a positive lesson to learn, as this is not always given in functional theorem provers.

The main drawback is our failure to scale to the full ACAS Xu problem regardless of the matrix implementation choice in CheckINN [6]. However, it may not come as a great surprise, as general-purpose SMT solvers do not scale to the problem, either. It took domain-specific algorithms such as ReluPlex and special-purpose solvers such as Marabou to overcome the scaling problem [12, 13]. This suggests future solutions that are somewhat orthogonal to the choice of the matrix representation:

  • interface with Marabou or other specialised NN solvers in order to scale;

  • work on a set of Imandra’s native proof heuristics and tactics, tailored specifically to Imandra’s NN formalisations.

In addition, evaluating CheckINN against other benchmarks would allow to assess more accurately its scalability on different problems, e.g. robustness verification of image classification networks on the MNIST dataset

[1] or range analysis of randomly generated networks [7]. We leave these as future work.

These conclusions provide a strong foundation to further develop the CheckINN library, as its aim is to offer verification of a wide array of neural network properties and we have shown that the choice of matrix implementation eventually influences the range of verifiable properties.

Finally, we believe that the methods we described could be useful in other theorem provers (both first- and higher-order) that combine functional programming and automated proof methods, such as ACL2, PVS, Isabelle/HOL and Coq. For example, in all these systems functions defining matrix operations (e.g., convolution) over lists are often more complex compared to their counterparts over matrices represented as functions, which can benefit from non-recursive definitions. Overall, as these various prominent theorem proving systems work ultimately with functional programs over algebraic datatypes like Imandra, our core observations carry over to them in a natural way.


  • [1] VNN 2022,
  • [2] Ayers, E.W., Eiras, F., Hawasly, M., Whiteside, I.: PaRoT: A practical framework for robust deep neural network training. In: NASA Formal Methods - 12th International Symposium, NFM 2020, Moffett Field, CA, USA, May 11-15, 2020, Proceedings. LNCS, vol. 12229, pp. 63–84. Springer (2020)
  • [3] Bagnall, A., Stewart, G.: Certifying true error: Machine learning in Coq with verified generalisation guarantees. AAAI (2019)
  • [4] Boyer, R.S., Moore, J.S.: A Computational Logic. ACM Monograph Series. Academic Press, New York (1979)
  • [5] Casadio, M., Komendantskaya, E., Daggitt, M.L., Kokke, W., Katz, G., Amir, G., Refaeli, I.: Neural network robustness as a verification property: A principled case study. In: Computer Aided Verification (CAV 2022). Lecture Notes in Computer Science, Springer (2022)
  • [6] Desmartin, R., Passmore, G., Komendantskaya, E., Daggitt, M.L.: CNN library in Imandra. (2022)
  • [7] Dutta, S., Jha, S., Sankaranarayanan, S., Tiwari, A.: Output Range Analysis for Deep Feedforward Neural Networks. In: Dutle, A., Muñoz, C., Narkawicz, A. (eds.) NASA Formal Methods. pp. 121–138. Lecture Notes in Computer Science, Springer International Publishing, Cham (2018).
  • [8] Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.T.: AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In: S&P (2018)
  • [9] Grant, P.W., Sharp, J.A., Webster, M.F., Zhang, X.: Sparse matrix representations in a functional language. Journal of Functional Programming 6(1), 143–170 (Jan 1996).,, publisher: Cambridge University Press
  • [10] Heras, J., Poza, M., Dénès, M., Rideau, L.: Incidence Simplicial Matrices Formalized in Coq/SSReflect. In: Davenport, J.H., Farmer, W.M., Urban, J., Rabe, F. (eds.) Intelligent Computer Mathematics. pp. 30–44. Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2011).
  • [11] Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10426, pp. 3–29 (2017)
  • [12] Katz, G., Barrett, C., Dill, D., Ju-lian, K., Kochenderfer, M.: Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. In: CAV (2017)
  • [13] Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljic, A., Dill, D.L., Kochenderfer, M.J., Barrett, C.W.: The Marabou framework for verification and analysis of deep neural networks. In: CAV 2019, Part I. LNCS, vol. 11561, pp. 443–452. Springer (2019)
  • [14] Kokke, W., Komendantskaya, E., Kienitz, D., Atkey, R., Aspinall, D.: Neural networks, secure by construction - an exploration of refinement types. In: Programming Languages and Systems - 18th Asian Symposium, APLAS 2020, Fukuoka, Japan, November 30 - December 2, 2020, Proceedings. Lecture Notes in Computer Science, vol. 12470, pp. 67–85. Springer (2020)
  • [15] LeCun, Y., Denker, J., Solla, S.: Optimal Brain Damage. In: Advances in Neural Information Processing Systems. vol. 2. Morgan-Kaufmann (1989),
  • [16] Maria, E.D., Bahrami, A., L’Yvonnet, T., Felty, A.P., Gaffé, D., Ressouche, A., Grammont, F.: On the use of formal methods to model and verify neuronal archetypes. Frontiers Comput. Sci. 16(3), 163404 (2022)
  • [17] Passmore, G.O., Cruanes, S., Ignatovich, D., Aitken, D., Bray, M., Kagan, E., Kanishev, K., Maclean, E., Mometto, N.: The imandra automated reasoning system (system description). In: Automated Reasoning - 10th International Joint Conference, IJCAR 2020, Paris, France, July 1-4, 2020, Proceedings, Part II. vol. 12167, pp. 464–471. Springer (2020)
  • [18] Passmore, G.O.: Some lessons learned in the industrialization of formal methods for financial algorithms. In: Formal Methods - 24th International Symposium, FM 2021, Virtual Event, November 20-26, 2021, Proceedings. Lecture Notes in Computer Science, vol. 13047, pp. 717–721. Springer (2021)
  • [19] Sill, J.: Monotonic Networks. California Institute of Technology (1998)
  • [20] Singh, G., Gehr, T., Püschel, M., Vechev, M.T.: An abstract domain for certifying neural networks. PACMPL 3(POPL), 41:1–41:30 (2019).
  • [21] Wehenkel, A., Louppe, G.: Unconstrained monotonic neural networks. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada. pp. 1543–1553 (2019)
  • [22] Wood, J.: Vectors and Matrices in Agda (Aug 2019),