1 Motivation
Neural network (NN) verification was pioneered by the SMT-solving [12, 11] and abstract interpretation [20, 8, 2] communities. Recently, however, it has been claimed that functional programming, too, can be valuable in this domain. There is a library [16] formalising small rational-valued neural networks in Coq. A more sizeable formalisation called MLCert [3] imports neural networks from Python, treats floating-point numbers as bit vectors, and proves properties describing the generalisation bounds of the neural networks. Another formalisation [14] uses reals and refinement types to prove the robustness of networks trained in Python.
There are several options for defining neural networks in functional programming, ranging from defining neurons as record types [16] to treating them as functions with refinement types [14]. But we claim that two general considerations should be key to any NN formalisation choice. Firstly, we must define neural networks as executable functions, because we want to take advantage of executing them in the functional language of choice. Secondly, a generic approach to layer definitions is needed, particularly when we implement complex neural network architectures, such as convolutional layers. These two essential requirements dictate that neural networks are represented as matrices, and that a programmer makes choices about matrix formalisation. This article explains these choices, and the consequences they imply, from the verification point of view. We use Imandra [17] to make these points, because Imandra is a functional programming language with tight integration of automated proving.
Imandra has been successful as a user-friendly and scalable tool in the FinTech domain [18]. The secret of its success lies in a combination of many of the best features of functional languages and interactive and automated theorem provers. Imandra’s logic is based on a pure, higher-order subset of OCaml, and functions written in Imandra are at the same time valid OCaml code that can be executed, or “simulated”. Imandra’s mode of interactive proof development is based on a typed, higher-order lifting of the Boyer–Moore waterfall [4] for automated induction, tightly integrated with novel techniques for SMT modulo recursive functions.
This paper builds upon the recent development of CheckINN, an NN verification library in Imandra [6], but focuses specifically on matrix representation choices and their consequences.
2 Matrices in Neural Network Formalisation
We will illustrate the functional approach to neural network formalisation and will introduce the syntax of the Imandra programming language [17] by means of an example. When we say we want to formalise neural networks as functions, essentially, we aim to be able to define a NN using just a line of code:
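For instance, the intended style can be sketched as follows (a hypothetical toy network with two placeholder layers; plain OCaml floats stand in for Imandra’s reals):

```ocaml
(* Placeholder layers: any vector-to-vector functions compose the same way *)
let layer_0 (x : float list) : float list = List.map (fun v -> v +. 1.0) x
let layer_1 (x : float list) : float list = List.map (fun v -> v *. 2.0) x

(* The network itself is a one-line pipeline of its layers *)
let nn x = x |> layer_0 |> layer_1
```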
where each layer_i is defined in a modular fashion.
To see that a functional approach to neural networks does not necessarily imply generic code, let us consider an example. A perceptron, also known as a linear classifier, classifies a given input vector $x = (x_1, \dots, x_n)$ into one of two classes $C_1$ or $C_2$ by computing a linear combination of the input vector with a vector of synaptic weights $(w_0, w_1, \dots, w_n)$, in which $w_0$ is often called an intercept or bias: $y = w_0 + \sum_{i=1}^{n} w_i x_i$. If the result is positive, it classifies the input as $C_1$, and if negative, as $C_2$. It effectively divides the input space along a hyperplane defined by $w_0 + \sum_{i=1}^{n} w_i x_i = 0$.
In most classification problems, classes are not linearly separated. To handle such problems, we can apply a non-linear function $a$, called an activation function, to the linear combination of weights and inputs. The resulting definition of a perceptron is:
$$y = a\Big(w_0 + \sum_{i=1}^{n} w_i x_i\Big) \quad (1)$$
Let us start with a naive prototype of a perceptron in Imandra. The Iris data set is a “Hello World” example in data mining; it represents 3 kinds of Iris flowers using 4 selected features. In Imandra, inputs can be represented as a data type:
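Such a datatype might look as follows (a sketch: field and constructor names are ours, and plain OCaml `float` stands in for Imandra’s `real`):

```ocaml
(* The four features of an Iris sample *)
type iris_input = {
  sepal_length : float;
  sepal_width  : float;
  petal_length : float;
  petal_width  : float;
}

(* The three kinds of Iris flowers *)
type iris_class = Setosa | Versicolor | Virginica
```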
And we define a perceptron as a function:
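A sketch of how this might look (the weight values are placeholders, the four features are passed as a tuple for brevity, and `float` stands in for Imandra’s `real`):

```ocaml
(* relu clips negative values to 0 and is the identity elsewhere *)
let relu x = if x < 0.0 then 0.0 else x

(* A perceptron over four features: bias plus weighted sum, then activation *)
let perceptron (w1, w2, w3, w4) bias (x1, x2, x3, x4) =
  relu (bias +. w1 *. x1 +. w2 *. x2 +. w3 *. x3 +. w4 *. x4)
```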
where *. and +. are multiplication and addition on reals. Note the use of the relu activation function, which returns 0 for all negative inputs and acts as the identity function otherwise.
Already in this simple example, one perceptron is not sufficient, as we must map its output to three classes. We use the usual machine learning trick and define a further layer of three neurons, each representing one class. Each of these neurons is itself a perceptron, with one incoming weight and one bias. This gives us:
The second function maps the output of the three neurons to the three specified classes. This post-processing stage often takes the form of an argmax or softmax function, which we omit.
And thus the resulting function that defines our neural network model is:
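A self-contained sketch of the resulting model (the weights and the argmax-style post-processing are illustrative placeholders):

```ocaml
type iris_class = Setosa | Versicolor | Virginica

let relu x = if x < 0.0 then 0.0 else x

(* One neuron per class, each with a single incoming weight and a bias *)
let neuron w b x = relu (w *. x +. b)

(* The extra layer of three neurons (placeholder weights) *)
let layer_1 x = (neuron 1.0 0.0 x, neuron 2.0 0.0 x, neuron 0.5 0.0 x)

(* Map the three outputs to a class: a simple argmax stand-in *)
let classify (o1, o2, o3) =
  if o1 >= o2 && o1 >= o3 then Setosa
  else if o2 >= o3 then Versicolor
  else Virginica

let model x = classify (layer_1 x)
```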
Although our naive formalisation has some features that we desired from the start, i.e. it defines a neural network as a composition of functions, it is too inflexible to work with arbitrary compositions of layers. In neural networks with hundreds of weights in every layer, this manual approach will quickly become infeasible (as well as error-prone). So, let us generalise this attempt from the level of individual neurons to the level of matrix operations.
The composition of many perceptrons is often called a multi-layer perceptron (MLP). An MLP consists of an input vector (also called the input layer in the literature), multiple hidden layers and an output layer, each layer made of perceptrons with weighted connections to the previous layer’s outputs. The weights and biases of all the neurons in a layer can be represented by two matrices, denoted $W$ and $B$. Adapting Equation 1 to this matrix notation, a layer’s output can be defined as:
$$L(X) = a(W \cdot X + B) \quad (2)$$
where the operator $\cdot$ denotes the dot product between $X$ and each row of $W$, $X$ is the layer’s input and $a$ is the activation function shared by all nodes in the layer. As the dot product multiplies pointwise all inputs by all weights, such layers are often called fully connected.
By denoting $a^k$, $W^k$ and $B^k$ the activation function, weights and biases of the $k$-th layer respectively, an MLP $N$ with $n$ layers is traditionally defined as:
$$N = L_n \circ L_{n-1} \circ \cdots \circ L_1, \quad \text{where } L_k(X) = a^k(W^k \cdot X + B^k) \quad (3)$$
At this stage, we are firmly committed to using matrices and matrix operations. And we have three key choices:
1. represent matrices as lists of lists (and take advantage of the inductive datatype List),
2. define matrices as functions from indices to matrix elements, or
3. take advantage of record types, and define matrices as records with maps.
The first choice was taken in [10] (in the context of dependent types in Coq), in [14] (in the context of refinement types of F*) and in [9] (for sparse matrix encodings in Haskell). The difference between the first and second approaches was discussed in [22] (in Agda, but with no neural network application in mind). The third method was taken in [16] using Coq (records were used there to encode individual neurons).
In the next three sections, we will systematise these three approaches using the same formalism and the same language, and trace the influence of these choices on neural network verification.
3 Matrices as Lists of Lists
We start by reusing Imandra’s List library. Lists are defined as inductive data structures:
Imandra holds a comprehensive library of list operations covering a large part of OCaml’s standard List library, which we reuse in the definitions below. We start with defining vectors as lists, and matrices as lists of vectors.
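A sketch of these definitions (the inductive list type is shown for illustration under the name `lst`; OCaml’s built-in list, used for `vector` and `matrix`, is equivalent, and Imandra would use `real` rather than `float`):

```ocaml
(* Lists as an inductive datatype: empty, or an element consed onto a list *)
type 'a lst = Nil | Cons of 'a * 'a lst

(* Vectors as lists of scalars; matrices as lists of row vectors *)
type vector = float list
type matrix = vector list
```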
It is possible to extend this formalisation by using dependent [10] or refinement [14] types to check matrix sizes, e.g. when performing matrix multiplication. But in Imandra this facility is not directly available, and we will need to use error tracking (implemented via the monadic result type) to facilitate checking of matrix sizes.
As there is no built-in type available for matrices equivalent to List for vectors, the Matrix module implements a number of functions for basic operations needed throughout the implementation. For instance, map2 takes as inputs a function $f$ and two matrices $A$ and $B$ of the same dimensions, and outputs a new matrix where each element $c_{ij}$ is the result of $f\,a_{ij}\,b_{ij}$:
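A self-contained sketch of such a map2, with dimension mismatches reported through OCaml’s result type (the library’s actual error type and helper names may differ):

```ocaml
(* Pointwise combination of two rows; Error if the rows differ in length *)
let rec map2_vec f v1 v2 =
  match v1, v2 with
  | [], [] -> Ok []
  | x :: xs, y :: ys -> Result.map (fun rest -> f x y :: rest) (map2_vec f xs ys)
  | _ -> Error "dimension mismatch"

(* Pointwise combination of two matrices, row by row *)
let rec map2 f m1 m2 =
  match m1, m2 with
  | [], [] -> Ok []
  | r1 :: rs1, r2 :: rs2 ->
    (match map2_vec f r1 r2, map2 f rs1 rs2 with
     | Ok row, Ok rest -> Ok (row :: rest)
     | Error e, _ -> Error e
     | _, Error e -> Error e)
  | _ -> Error "dimension mismatch"
```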
This implementation allows us to define other useful functions concisely. For instance, the dot product of two matrices is defined as:
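A vector-level sketch of this definition (the library’s version works over its matrix types; helper names are ours): the pointwise product is computed first, and Result.map applies the summation to a successful result.

```ocaml
(* Pointwise product of two vectors; Error on dimension mismatch *)
let rec map2 f v1 v2 =
  match v1, v2 with
  | [], [] -> Ok []
  | x :: xs, y :: ys -> Result.map (fun rest -> f x y :: rest) (map2 f xs ys)
  | _ -> Error "dimension mismatch"

let sum = List.fold_left ( +. ) 0.0

(* Dot product: multiply pointwise, then apply sum via Result.map *)
let dot_product v1 v2 = Result.map sum (map2 ( *. ) v1 v2)
```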
Note that since the output of the function map2 is wrapped in the monadic result type, we must use Result.map to apply sum. Similarly, we use standard monadic operations for the result monad such as bind or lift.
A fully connected layer is then defined as a function fc that takes as parameters an activation function, a 2-dimensional matrix of the layer’s weights and an input vector:
Note that each row of the weights matrix represents the weights for one of the layer’s nodes. The bias for each node is the first value of the weights vector, and is prepended to the input vector when computing the dotproduct of weights and input to account for that.
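Under these conventions, fc can be sketched as follows (a simplified version using option instead of the library’s result monad; names are ours):

```ocaml
(* Dot product of two vectors; None on dimension mismatch *)
let rec dot v1 v2 =
  match v1, v2 with
  | [], [] -> Some 0.0
  | x :: xs, y :: ys -> Option.map (fun s -> x *. y +. s) (dot xs ys)
  | _ -> None

(* fc: each weights row is [bias; w1; ...; wn]; prepend 1.0 to the input so
   that the bias is picked up by the dot product *)
let fc activation weights input =
  let input' = 1.0 :: input in
  List.fold_right
    (fun row acc ->
       match dot row input', acc with
       | Some d, Some out -> Some (activation d :: out)
       | _ -> None)
    weights (Some [])
```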
It is now easy to see that our desired modular approach to composing layers works as stated. We may define the layers using the syntax: let layer_i = fc a weights, where i stands for 0,1,2,3, and a stands for any chosen activation function.
Although natural, this formalisation of layers and networks suffers from two problems. Firstly, it lacks the matrix dimension checks that were readily provided by refinement types in [14]. This is because Imandra is based on a computational fragment of HOL, and has no refinement or dependent types. To mitigate this, our library performs explicit dimension checking via a result monad, which clutters the code and adds computational overhead. Secondly, defining matrices via the list datatype makes verification of neural networks very inefficient. This general effect has already been reported in [14], but it may be instructive to look at the problem from the Imandra perspective.
Robustness of neural networks [5] is best amenable to proofs by arithmetic manipulation. This explains the interest of the SMT-solving community in the topic, which started with using Z3 directly [11], and has resulted in highly efficient SMT solvers specialised in robustness proofs for neural networks [12, 13]. Imandra’s waterfall method [17] defines a default flow for the proof search, which starts with unrolling inductive definitions, simplification and rewriting. As a result, proofs of neural network robustness, or proofs as in the ACAS Xu challenge [12, 13], which do not actually need induction, are not efficiently tackled by Imandra’s inductive waterfall: the proofs simply do not terminate.
There is another verification approach available in Imandra which is better suited to this type of problem: blast, a tactic for SAT-based symbolic execution modulo higher-order recursive functions. Blast is an internal custom SAT/SMT solver that can be called explicitly to discharge an Imandra verification goal. However, blast currently does not support real arithmetic. This requires us to quantise the neural networks we use (i.e. convert them to integer weights), and results in a quantised NN implementation [6]. However, even with quantisation and the use of blast, while we succeed on many smaller benchmarks, Imandra fails to scale ‘out of the box’ to the ACAS Xu challenge, let alone larger neural networks used in computer vision.
This also does not come as a surprise: as [12] points out, general-purpose SMT solvers do not scale to NN verification challenges. This is why the Reluplex algorithm was introduced in [12] as an additional heuristic to SMT solver algorithms; Reluplex has since given rise to the domain-specific solver Marabou [13]. Connecting Imandra to Marabou may be a promising future direction.
However, this method of matrix formalisation can still bring benefits. When we formulate verification properties that genuinely require induction, formalising matrices as lists does result in more natural, and easily automatable, proofs. For example, De Maria et al. [16] formalise in Coq “neuronal archetypes” for biological neurons. Each archetype is a specialised kind of perceptron, in which additional functions are added to amplify or inhibit the perceptron’s outputs. It is out of the scope of this paper to formalise the neuronal archetypes in Imandra, but we take methodological insight from [16]. In particular, De Maria et al. show that there are natural higher-order properties that one may want to verify.
To make a direct comparison, modern neural network verifiers [12, 20] deal with verification tasks of the form “given a trained neural network $N$ and a property $P$ on its inputs, verify that a property $Q$ holds for $N$’s outputs”. However, the formalisation in [16] considers properties of the form “any neural network that satisfies a property $P$ also satisfies a property $Q$”. Unsurprisingly, the former kind of property can be resolved by simplification and arithmetic, whereas the latter kind requires induction on the structure of the network (as well as possibly nested induction on the parameters of $P$).
Another distinguishing consequence of this approach is that it is orthogonal to the community’s competition for scaling proofs to large networks: usually the property $P$ does not restrict the size of the neural network, but rather points to its structural properties. Thus, implicitly, we quantify over neural networks of any size.
To emulate a property à la De Maria et al., in [6] we defined a general network monotonicity property: any fully connected network with positive weights is monotone, in the sense that, given increasing positive inputs, its outputs will also increase. There has been some interest in monotone networks in the literature [19, 21]. Our experiments show that Imandra can prove such properties by induction on the networks’ structure almost automatically (with the help of a handful of auxiliary lemmas). And the proofs easily go through for both quantised and real-valued neural networks.¹

¹ Note that in these experiments, the implementation of weight matrices as lists of lists is implicit – we redefine matrix manipulation functions that are less general but more convenient for proofs by induction.
4 Matrices as Functions
We now return to the verification challenge of ACAS Xu, which we failed to conquer with the inductive matrix representation of the last section. This time we ask whether representing matrices as functions and leveraging Imandra’s default proof automation can help.
With this in mind, we redefine matrices as functions from indices to values, which gives constant-time (recursion-free) access to matrix elements:
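This representation can be sketched as follows (an integer-valued simplification matching the quantised setting of this section; the library’s actual arg type is richer):

```ocaml
(* A matrix is a function answering queries: its dimensions, or a value at (i, j) *)
type arg = Rows | Cols | Value of int * int
type matrix = arg -> int

(* All matrices are total: outside their stated bounds they default to 0 *)
let const_matrix rows cols v : matrix = function
  | Rows -> rows
  | Cols -> cols
  | Value (i, j) ->
    if 0 <= i && i < rows && 0 <= j && j < cols then v else 0
```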
Note the use of the arg type, which treats a matrix as a function evaluating “queries” (e.g., “how many rows does this matrix have?” or “what is the value at index $(i, j)$?”). This formalisation technique is used because Imandra’s logic does not allow function values inside algebraic data types. We thus recover some of the functionality given by refinement types in [14].
Furthermore, we can map functions over a matrix or a pair of matrices (using map2), transpose a matrix, construct a diagonal matrix etc. without any recursion, since we work pointwise on the elements. At the same time, we remove the need for error tracking to ensure matrices are of the correct size: because our matrices are total functions, they are defined everywhere (even outside of their stated dimensions), and we can make the convention that all matrices we build are valid and sparse by construction (with default 0 outside of their dimension bounds).
The resulting function definitions are much more succinct than with lists of lists; take for instance map2:
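A sketch in this style (over an integer-valued function representation, matching the quantised setting; the types are repeated so the fragment is self-contained):

```ocaml
type arg = Rows | Cols | Value of int * int
type matrix = arg -> int

(* Pointwise combination: no recursion, just function composition *)
let map2 f (m1 : matrix) (m2 : matrix) : matrix = function
  | Rows -> m1 Rows
  | Cols -> m1 Cols
  | Value (i, j) -> f (m1 (Value (i, j))) (m2 (Value (i, j)))

(* Two tiny example matrices *)
let m1 : matrix = function Rows -> 1 | Cols -> 1 | Value _ -> 2
let m2 : matrix = function Rows -> 1 | Cols -> 1 | Value _ -> 3
```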
This allows us to define fully connected layers:
As the biases are included in the weights matrix, add_weight_coeff prepends a column of coefficients 1 to the input so that the biases are taken into account.
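add_weight_coeff itself can be sketched in the same recursion-free style (our reconstruction, over the integer-valued representation):

```ocaml
type arg = Rows | Cols | Value of int * int
type matrix = arg -> int

(* Shift every column right by one and put the constant coefficient 1 in
   column 0, so that the bias column of the weights multiplies 1 *)
let add_weight_coeff (input : matrix) : matrix = function
  | Rows -> input Rows
  | Cols -> input Cols + 1
  | Value (i, j) -> if j = 0 then 1 else input (Value (i, j - 1))

(* A tiny example input *)
let input : matrix = function Rows -> 1 | Cols -> 2 | Value (_, j) -> 10 + j
```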
For full definitions of matrix operations and layers, the reader is referred to [6], but we give some definitions here, mainly to convey the general style (and simplicity!) of the code. Working with the ACAS Xu networks [12], a script transforms the original networks into a sparse functional matrix representation. For example, layer 5 of one of the networks we used is defined as follows:
The sparsity effect is achieved by pruning the network, i.e. removing the weights that have the smallest impact on the network’s performance. The weights’ magnitudes are used to select those to be pruned. This method, though rudimentary, is considered a reasonable pruning technique [15]. We do this mainly in order to reduce the amount of computation Imandra needs to perform, and to make the verification problem amenable to Imandra.
With this representation, we are able to verify the properties described in [12] on some of the pruned networks (see Table 1). This is a considerable improvement on the previous section, where the implementation did not allow us to verify even pruned networks. It is especially impressive that it comes “for free”, by simply changing the underlying matrix representation.
Several factors played a role in automating the proof. Firstly, by using maps for the large matrices, we eliminate all recursion (and large case-splits) except for matrix folds (which now come in only via the dot product), which allows Imandra to expand the recursive matrix computations on demand. Secondly, Imandra’s native simplifier contributed to the success. It works on a DAG representation of terms and speculatively expands instances of recursive functions, only as they are (heuristically seen to be) needed. Incremental congruence closure and simplex data structures are shared across DAG nodes, and symbolic execution results are memoised. The underlying Map.t components of the functions are reasoned about using a decision procedure for the theory of arrays. Informally speaking, Imandra works by lazily expanding out the linear algebra as it is needed, and eagerly sharing information over the DAG. Contrast this approach with that of Reluplex which, informally, starts with the linear algebra fully expanded, and then works to derive laziness and sharing.
Although Imandra’s simplifier-based automation could give us results which blast could not deliver for the same networks, it still did not scale to the original non-quantised (dense) ACAS Xu networks. Contrast this with domain-specific verifiers such as Marabou, which are able to scale (modulo potential floating-point imprecision) to the full ACAS Xu. We are encouraged that the results of this section were achieved without tuning Imandra’s generic proof automation strategies, and hopeful that the development of neural-network-specific tactics will help Imandra scale to such networks in the future.
                        CheckINN: Pruned Networks    Reluplex: Full ACAS Xu Networks
Property   Result       Quantity     Time (s)        Quantity     Time (s)
           SAT          20           13387           0
           UNSAT        0                            41           394517
           TIMEOUT      24                           4

           SAT          7            2098            35           82419
           UNSAT        2            896             1            463
           TIMEOUT      26                           4

           SAT          39           10453           35           82419
           UNSAT        0                            1            463
           TIMEOUT      2                            4

           SAT          36           21533           0
           UNSAT        0                            32           12475
           TIMEOUT      5                            0

           SAT          1            98              0
           UNSAT        0                            1            19355

           SAT          1            98              0
           UNSAT        0                            1            180288
           TIMEOUT      1                            1

           SAT          0                            1            40102
           TIMEOUT      1                            0

           SAT          1            109             0
           UNSAT        0                            1            99634

           SAT          0                            0
           UNSAT        1            637             1            19944
           TIMEOUT      0                            0
5 Real-valued Matrices; Records and Arrays
It is time we turn to the question of incorporating real values into matrices. Section 3 defined matrices as lists of lists, and that definition in principle worked for both integer- and real-valued matrices. However, we could not use [@@blast] to automate proofs when real values were involved; this meant we were restricted to verifying integer-valued networks. The matrix-as-function implementation, on the other hand, does extend to proofs with real-valued matrices, but the extension is not trivial. In the functional implementation, the matrix’s values must be of the same type as its dimensions (Listing 2). Thus, if the matrix elements are real-valued, then in this representation the matrix dimensions will be real-valued as well. This, it turns out, is not trivial to deal with for functions which recurse along matrix dimensions.
To simplify the code and the proofs, three potential solutions were considered:
1. Use an algebraic data type for the results of matrix queries: this introduces pattern matching in the implementation of matrix operations, which Section 3 taught us to avoid.
2. Define a matrix type with real-valued dimensions and values: this poses the problem of proving function termination when matrix dimensions are used in recursion termination conditions.
3. Use records to provide polymorphism, allowing matrices to use integer dimensions and real values.
This section focuses on these three alternatives.
5.1 Algebraic Data Types for Real-valued Matrices
The first alternative is to introduce an algebraic data type that allows the matrix functions to return either reals or integers.
This allows a form of polymorphism, but it also introduces pattern matching each time we query a value from the matrix. For instance, in order to use dimensions as indices to access a matrix element we have to implement the following nth_res function:
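A sketch of the two pieces (the value type and an nth_res in its style; `float` stands in for Imandra’s `real`, and the error messages are ours):

```ocaml
(* A matrix query result is either an integer (a dimension) or a real (a value) *)
type value = Int of int | Real of float

(* Indexing with a `value` forces pattern matching at every access *)
let rec nth_res (n : value) (l : value list) : (value, string) result =
  match n, l with
  | _, [] -> Error "index out of bounds"
  | Int 0, x :: _ -> Ok x
  | Int k, _ :: rest when k > 0 -> nth_res (Int (k - 1)) rest
  | _, _ -> Error "index must be a non-negative integer"
```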
The simplicity and efficiency of the functional implementation is lost.
5.2 Real-valued Matrix Indices
We then turn to using real numbers to encode matrix dimensions. The implementation is symmetric to the one using integers (Listing 2):
A problem arises in recursive functions where matrix dimensions are used as decrementors in stopping conditions, for instance in the fold_rec function used in the implementation of the folding operation.
Imandra only accepts definitions of functions for which it can prove termination. The dimensions being real numbers prevents Imandra from proving termination without being given a custom measure. In order to define this measure, we need to connect the continuous world of reals with the discrete world of integers (and ultimately ordinals), for which we have induction principles. We chose to develop a floor function that allows Imandra to prove termination with reals.
To prove termination of our fold_rec function recursing along reals, we define an int_of_real : real -> int function in Imandra, using a subsidiary floor : real -> int -> int which computes an integer floor of a real by “counting up” using its integer argument. In fact, as matrices have non-negative dimensions, it suffices to consider this conversion only for non-negative reals, and we formalise only this. We then have to prove some subsidiary lemmas about the arithmetic of real-to-integer conversion, such as:
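A plain-OCaml sketch of this conversion, together with the kind of fact such lemmas state (in Imandra the property would be a theorem; here it is rendered as a runtime check, and `float` stands in for `real`):

```ocaml
(* Integer floor of a non-negative real, "counting up" from n *)
let rec floor (x : float) (n : int) : int =
  if float_of_int (n + 1) > x then n else floor x (n + 1)

let int_of_real (x : float) : int =
  if x < 0.0 then 0 else floor x 0

(* The kind of arithmetic lemma required, e.g.:
   for 0.0 <= x, int_of_real x <= x < int_of_real x + 1 *)
let floor_bounds x =
  let n = int_of_real x in
  float_of_int n <= x && x < float_of_int (n + 1)
```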
Armed with these results, we can then prove termination of fold_rec and admit it into Imandra’s logic via the ordinal pair measure below:
Extending the functional matrix implementation to reals was not trivial, but it did have a real payoff: using this representation, we were able to verify real-valued versions of the pruned ACAS Xu networks! In both the integer- and real-valued cases, we pruned the networks to 10% of their original size, so we still do not scale to the full ACAS Xu challenge. However, the positive news is that the real-valued version of the proofs uses the same waterfall proof tactic of Imandra, and requires no extra effort from the programmer to complete the proofs. This result is significant, bearing in mind that many functional and higher-order theorem provers are known to have significant drawbacks when switching to real numbers.
From the functional programming point of view, one may claim that this approach is not elegant enough, because it does not provide true polymorphism: it encodes matrix dimensions as reals. This motivates us to try the third alternative, using records with maps to achieve polymorphism.
5.3 Records
Standard OCaml records are available in Imandra, though they do not support functions as fields. This is because all records are data values which must support a computable equality relation, and in general one cannot compute equality on functions. Internally in the logic, records correspond to algebraic data types with a single constructor and the record fields to named constructor arguments. Like product types, records allow us to group together values of different types, but with convenient accessors and update syntax based on field names, rather than position. This offers the possibility of polymorphism for our matrix type.
The approach here is similar to the one in Section 4: matrices are stored as mappings between indices and values, which allows for constant-time access to the elements. However, instead of implementing the mapping as a function, here we implement it as a Map, i.e. an unordered collection of (key, value) pairs where each key is unique, so that this “payload” can be included as the field of a record.
We can then use a convenient syntax to create a record of this type. For instance, a weights matrix from one of the ACAS Xu networks can be implemented as:
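A sketch of such a record (using OCaml’s standard Map module; Imandra’s built-in Map.t plays the same role). The weight values here are hypothetical placeholders, not actual ACAS Xu weights:

```ocaml
module IntPairMap = Map.Make (struct
  type t = int * int
  let compare = compare
end)

(* Integer dimensions, real-valued entries: polymorphism via a record *)
type matrix = { rows : int; cols : int; vals : float IntPairMap.t }

(* A tiny stand-in for a layer's weights; absent keys default to 0 (sparsity) *)
let layer_weights =
  { rows = 2; cols = 3;
    vals =
      IntPairMap.of_seq
        (List.to_seq [ ((0, 0), 0.1); ((0, 1), -0.2); ((1, 2), 0.5) ]) }
```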
Note that the matrix dimensions (and the underlying map’s keys) are indeed encoded as integers, whereas the weights’ values are reals.
Similarly to the previous implementations, we define a number of useful matrix operations which will be used to define general neural network layer functions. For instance, the map2 function is defined thus:
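A self-contained sketch of such a map2 (our reconstruction): it must recurse over all index pairs to rebuild the map, in contrast to the recursion-free functional version.

```ocaml
module IntPairMap = Map.Make (struct
  type t = int * int
  let compare = compare
end)

type matrix = { rows : int; cols : int; vals : float IntPairMap.t }

(* Sparse lookup: absent entries read as 0 *)
let get m i j =
  match IntPairMap.find_opt (i, j) m.vals with Some v -> v | None -> 0.0

(* Recurse over every (i, j) pair, rebuilding the map entry by entry *)
let map2 f m1 m2 =
  let rec go i j acc =
    if i >= m1.rows then acc
    else if j >= m1.cols then go (i + 1) 0 acc
    else go i (j + 1) (IntPairMap.add (i, j) (f (get m1 i j) (get m2 i j)) acc)
  in
  { rows = m1.rows; cols = m1.cols; vals = go 0 0 IntPairMap.empty }

(* Two tiny example matrices *)
let a = { rows = 1; cols = 2; vals = IntPairMap.of_seq (List.to_seq [ ((0, 0), 1.0) ]) }
let b = { rows = 1; cols = 2;
          vals = IntPairMap.of_seq (List.to_seq [ ((0, 0), 2.0); ((0, 1), 3.0) ]) }
```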
Compared to the list implementation, this implementation has the benefit of providing constant-time access to matrix elements. However, compared to the implementation of matrices as functions, it uses recursion to iterate over matrix values, which results in a high number of case-splits and, in turn, lower scalability. Compared to the previous section’s results, none of the verification tests on the pruned ACAS Xu benchmarks that terminated with the functional matrix implementation terminated with the records implementation. Moreover, as the above function definition shows, we lose considerable conciseness and readability.
In the end, the main interest of this implementation is that it offers polymorphism. In all other regards, the functional implementation seems preferable.
6 Conclusions
Functional programming languages that are tightly coupled with automated reasoning capabilities, like Imandra, offer the possibility to verify and perform inference with neural networks, which is what the CheckINN library aims to do. To that end, implementing matrices and matrix operations is important. We have shown different implementations of matrices and how each implementation influences verification in Imandra.
This study has three positive conclusions:
1. Imandra’s language is sufficiently flexible to support several choices of matrix implementation in the CheckINN library. Its proof heuristics adapt smoothly to these different implementations, with very few hints needed to find the appropriate proof strategy (induction, waterfall or SAT/SMT proving).
2. This flexibility bears benefits when it comes to diversifying the range of NN properties we verify: matrices as lists made possible proofs of higher-order properties by induction, whereas matrices as functions were more amenable to automated proofs in SAT/SMT solving style.
3. The transition from integer-valued to real-valued NNs is possible in Imandra. This transition itself opens several choices of matrix representation. However, if the matrix representation is optimal for the task at hand, Imandra takes care of completing the proofs with reals and adapts its inductive waterfall method to the new data type automatically. This is a positive lesson, as this is not a given in functional theorem provers.
The main drawback is our failure to scale to the full ACAS Xu problem, regardless of the matrix implementation choice in CheckINN [6]. However, this may not come as a great surprise, as general-purpose SMT solvers do not scale to the problem either. It took domain-specific algorithms such as Reluplex and special-purpose solvers such as Marabou to overcome the scaling problem [12, 13]. This suggests future solutions that are somewhat orthogonal to the choice of matrix representation:
1. interface with Marabou or other specialised NN solvers in order to scale;
2. work on a set of Imandra’s native proof heuristics and tactics, tailored specifically to Imandra’s NN formalisations.
In addition, evaluating CheckINN against other benchmarks would allow us to assess its scalability on different problems more accurately, e.g. robustness verification of image classification networks on the MNIST dataset [1] or range analysis of randomly generated networks [7]. We leave these as future work.
These conclusions provide a strong foundation to further develop the CheckINN library, as its aim is to offer verification of a wide array of neural network properties, and we have shown that the choice of matrix implementation influences the range of verifiable properties.
Finally, we believe that the methods we described could be useful in other theorem provers (both first- and higher-order) that combine functional programming and automated proof methods, such as ACL2, PVS, Isabelle/HOL and Coq. For example, in all these systems, functions defining matrix operations (e.g., convolution) over lists are often more complex than their counterparts over matrices represented as functions, which can benefit from non-recursive definitions. Overall, as these various prominent theorem proving systems, like Imandra, ultimately work with functional programs over algebraic datatypes, our core observations carry over to them in a natural way.
References
 [1] VNN 2022, https://sites.google.com/view/vnn2022
 [2] Ayers, E.W., Eiras, F., Hawasly, M., Whiteside, I.: PaRoT: A practical framework for robust deep neural network training. In: NASA Formal Methods – 12th International Symposium, NFM 2020, Moffett Field, CA, USA, May 11–15, 2020, Proceedings. LNCS, vol. 12229, pp. 63–84. Springer (2020)
 [3] Bagnall, A., Stewart, G.: Certifying true error: Machine learning in Coq with verified generalisation guarantees. AAAI (2019)
 [4] Boyer, R.S., Moore, J.S.: A Computational Logic. ACM Monograph Series. Academic Press, New York (1979)
 [5] Casadio, M., Komendantskaya, E., Daggitt, M.L., Kokke, W., Katz, G., Amir, G., Refaeli, I.: Neural network robustness as a verification property: A principled case study. In: Computer Aided Verification (CAV 2022). Lecture Notes in Computer Science, Springer (2022)
 [6] Desmartin, R., Passmore, G., Komendantskaya, E., Daggitt, M.L.: CNN library in Imandra. https://github.com/aisecprivate/ImandraNN (2022)
 [7] Dutta, S., Jha, S., Sankaranarayanan, S., Tiwari, A.: Output Range Analysis for Deep Feedforward Neural Networks. In: Dutle, A., Muñoz, C., Narkawicz, A. (eds.) NASA Formal Methods. pp. 121–138. Lecture Notes in Computer Science, Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-77935-5_9
 [8] Gehr, T., Mirman, M., DrachslerCohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.T.: AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In: S&P (2018)
 [9] Grant, P.W., Sharp, J.A., Webster, M.F., Zhang, X.: Sparse matrix representations in a functional language. Journal of Functional Programming 6(1), 143–170 (Jan 1996). https://doi.org/10.1017/S095679680000160X. Publisher: Cambridge University Press
 [10] Heras, J., Poza, M., Dénès, M., Rideau, L.: Incidence Simplicial Matrices Formalized in Coq/SSReflect. In: Davenport, J.H., Farmer, W.M., Urban, J., Rabe, F. (eds.) Intelligent Computer Mathematics. pp. 30–44. Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22673-1_3
 [11] Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: Computer Aided Verification – 29th International Conference, CAV 2017, Heidelberg, Germany, July 24–28, 2017, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10426, pp. 3–29 (2017)
 [12] Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. In: CAV (2017)
 [13] Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljic, A., Dill, D.L., Kochenderfer, M.J., Barrett, C.W.: The Marabou framework for verification and analysis of deep neural networks. In: CAV 2019, Part I. LNCS, vol. 11561, pp. 443–452. Springer (2019)
 [14] Kokke, W., Komendantskaya, E., Kienitz, D., Atkey, R., Aspinall, D.: Neural networks, secure by construction – an exploration of refinement types. In: Programming Languages and Systems – 18th Asian Symposium, APLAS 2020, Fukuoka, Japan, November 30 – December 2, 2020, Proceedings. Lecture Notes in Computer Science, vol. 12470, pp. 67–85. Springer (2020)
 [15] LeCun, Y., Denker, J., Solla, S.: Optimal Brain Damage. In: Advances in Neural Information Processing Systems, vol. 2. Morgan-Kaufmann (1989), https://papers.nips.cc/paper/1989/hash/6c9882bbac1c7093bd25041881277658-Abstract.html
 [16] Maria, E.D., Bahrami, A., L’Yvonnet, T., Felty, A.P., Gaffé, D., Ressouche, A., Grammont, F.: On the use of formal methods to model and verify neuronal archetypes. Frontiers Comput. Sci. 16(3), 163404 (2022)
 [17] Passmore, G.O., Cruanes, S., Ignatovich, D., Aitken, D., Bray, M., Kagan, E., Kanishev, K., Maclean, E., Mometto, N.: The Imandra automated reasoning system (system description). In: Automated Reasoning – 10th International Joint Conference, IJCAR 2020, Paris, France, July 1–4, 2020, Proceedings, Part II. vol. 12167, pp. 464–471. Springer (2020)
 [18] Passmore, G.O.: Some lessons learned in the industrialization of formal methods for financial algorithms. In: Formal Methods – 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings. Lecture Notes in Computer Science, vol. 13047, pp. 717–721. Springer (2021)
 [19] Sill, J.: Monotonic Networks. California Institute of Technology (1998)
 [20] Singh, G., Gehr, T., Püschel, M., Vechev, M.T.: An abstract domain for certifying neural networks. PACMPL 3(POPL), 41:1–41:30 (2019). https://doi.org/10.1145/3290354
 [21] Wehenkel, A., Louppe, G.: Unconstrained monotonic neural networks. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada. pp. 1543–1553 (2019)
 [22] Wood, J.: Vectors and Matrices in Agda (Aug 2019), https://personal.cis.strath.ac.uk/james.wood.100/blog/html/VecMat.html