1 Introduction
Artificial neural networks have been widely used in machine learning systems. Although neural networks have shown effectiveness and powerful ability in solving complex problems, they are confined to systems that comply with only the lowest safety integrity levels, since, most of the time, a neural network is viewed as a
black box without effective methods to assure safety specifications for its outputs. Neural networks are trained over a finite number of input and output data, and are expected to generalize to produce desirable outputs for given inputs, including previously unseen inputs. However, in many practical applications, the number of inputs is essentially infinite; this means it is impossible to check all possible inputs by performing experiments alone. Moreover, it has been observed that neural networks can react in unexpected and incorrect ways to even slight perturbations of their inputs [1], which could result in unsafe systems. Hence, methods that provide formal guarantees are in great demand for verifying specifications or properties of neural networks. Verifying neural networks is a hard problem; even simple properties about them have been proven NP-complete [2]. The difficulties mainly come from the presence of activation functions and complex structures, which make neural networks large-scale, nonlinear, nonconvex and thus incomprehensible to humans. The importance of formal guarantee methods for neural networks has been well recognized in the literature. There exist a number of results for the verification of feedforward neural networks, especially for Rectified Linear Unit (ReLU) neural networks, and a few results are devoted to neural networks with broad classes of activation functions. Motivated by the general class of neural networks considered in
[3], our key contribution in this paper is to develop a specification-guided method for safety verification of feedforward neural networks. First, we formulate the safety verification problem in the framework of interval arithmetic and provide a computationally efficient formula to compute output interval sets; the developed formula is able to calculate the output intervals in a fast manner. Then, analogous to other state-of-the-art verification methods, such as counterexample-guided abstraction refinement (CEGAR) [4] and property directed reachability (PDR) [5], and inspired by the Moore-Skelboe algorithm [6], a specification-guided algorithm is developed. Briefly speaking, the safety specification is utilized to examine the existence of intersections between output intervals and unsafe regions, and thereby determine the bisection actions in the verification algorithm. By making use of the information in the safety specification, the computational cost can be reduced significantly. We provide experimental evidence of the advantages of the specification-guided approach, showing that our approach needs only about 3%–7% of the computational cost of the method proposed in [3] to solve the same safety verification problem.

2 Related Work
Many recent works focus on ReLU neural networks. In [2], an SMT solver named Reluplex is proposed for a special class of neural networks with ReLU activation functions. Reluplex extends the well-known Simplex algorithm from linear functions to ReLU functions by making use of the piecewise linear feature of ReLU functions. In [7], a layer-by-layer approach is developed for the output reachable set computation of ReLU neural networks; the computation is formulated as a set of manipulations on a union of polyhedra. A verification engine for ReLU neural networks called AI2 was proposed in [8]. In their approach, the authors abstract perturbed inputs and safety specifications as zonotopes, and reason about their behavior using operations on zonotopes. A Linear Programming (LP)-based method is proposed in
[9], and in [10] the authors encode the constraints of ReLU functions as a Mixed-Integer Linear Programming (MILP) problem. Combined with output specifications expressed in terms of LP, the verification problem for the output set eventually reduces to an MILP feasibility problem. In [11, 12], an MILP-based verification engine called Sherlock is proposed, which performs an output range analysis of ReLU feedforward neural networks; a combined local and global search is developed to solve the MILP more efficiently.

Besides the results for ReLU neural networks, there are a few other results for neural networks with general activation functions. In [13, 14], a piecewise linearization of the nonlinear activation functions is used to reason about their behaviors. In this framework, the authors replace the activation functions with piecewise constant approximations and use the bounded model checker hybrid satisfiability (HySAT) [15] to analyze various properties. In their papers, the authors highlight the difficulty of scaling this technique and, currently, are only able to tackle small networks with at most 20 hidden nodes. In [16], the authors propose a framework for verifying the safety of neural network image classification decisions by searching for adversarial examples within a specified region. An adaptive nested optimization framework is proposed for the reachability problem of neural networks in [17]. In [3]
, a simulation-based approach was developed, which uses a finite number of simulations/computations to estimate the reachable set of multilayer neural networks in a general form. Despite this success, the approach lacks the ability to resolve the reachable set computation problem for neural networks that are large-scale, nonconvex, and nonlinear. Still, simulation-based approaches, like the one developed in [3], present a plausibly practical and efficient way of reasoning about neural network behaviors. The critical step in improving simulation-based approaches is bridging the gap between finitely many simulations and the essentially infinite number of inputs in the continuous input set. Sometimes, the simulation-based approach requires a large number of simulations to obtain a tight reachable set estimation, which is computationally costly in practice. In this paper, our aim is to reduce the computational cost by avoiding unnecessary computations with the aid of a specification-guided method.

3 Background
3.1 Feedforward Neural Networks
Generally speaking, a neural network consists of a number of interconnected neurons, and each neuron is a simple processing element that responds to the weighted inputs it receives from other neurons. In this paper, we consider feedforward neural networks, which generally consist of one input layer, multiple hidden layers and one output layer. The action of a neuron depends on its activation function, which is of the form
(1) $y_i = f\Big(\sum_{j=1}^{n} \omega_{ij} x_j + \theta_i\Big)$

where $x_j$ is the $j$th input of the $i$th neuron, $\omega_{ij}$ is the weight from the $j$th input to the $i$th neuron, $\theta_i$ is called the bias of the $i$th neuron, $y_i$ is the output of the $i$th neuron, and $f(\cdot)$ is the activation function. The activation function is generally a nonlinear continuous function describing the reaction of the $i$th neuron to its inputs $x_j$, $j = 1, \ldots, n$. Typical activation functions include, for instance, ReLU, logistic, tanh, exponential linear unit, and linear functions. In this work, our approach aims at being capable of dealing with activation functions regardless of their specific forms.
A feedforward neural network has multiple layers, and each layer $\ell$, $\ell \in \{1, \ldots, L\}$, has $n_\ell$ neurons. In particular, layer $0$ is used to denote the input layer and $n_0$ stands for the number of inputs in the rest of this paper. For layer $\ell$, the corresponding input vector is denoted by $x_{\ell-1}$ and the weight matrix is

(2) $W_\ell = [\omega_{\ell,1}, \ldots, \omega_{\ell,n_\ell}]^{T}$

where $\omega_{\ell,i}$ is the weight vector of the $i$th neuron in layer $\ell$. The bias vector for layer $\ell$ is

(3) $\theta_\ell = [\theta_{\ell,1}, \ldots, \theta_{\ell,n_\ell}]^{T}.$

The output vector of layer $\ell$ can be expressed as

(4) $x_\ell = f_\ell(W_\ell x_{\ell-1} + \theta_\ell)$

where $f_\ell$ is the activation function of layer $\ell$.

The output of layer $\ell$ is the input of layer $\ell+1$, and the mapping from the input $x_0$ of the input layer to the output $x_L$ of the output layer stands for the input-output relation of the neural network, denoted by

(5) $y = \Phi(x)$

where $x = x_0$, $y = x_L$, and $\Phi \triangleq f_L \circ f_{L-1} \circ \cdots \circ f_1$.
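To make the layer-by-layer mapping (1)-(5) concrete, the forward pass can be sketched in a few lines of Python. The weights, biases, and network size below are illustrative placeholders, not values from the paper:

```python
import math

def sigmoid(z):
    # Logistic activation, one of the monotone activations discussed above.
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, theta, x, f):
    # Computes f(W x + theta) for one layer, cf. equation (4).
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + th)
            for row, th in zip(W, theta)]

def network(layers, x):
    # Composes the layers, cf. equation (5): y = Phi(x) = f_L(...f_1(x)).
    for W, theta, f in layers:
        x = layer(W, theta, x, f)
    return x

# Toy 2-input, 2-output network with one hidden layer (made-up weights).
layers = [
    ([[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1], sigmoid),  # hidden layer
    ([[1.0, -1.0], [0.3, 0.4]], [0.0, 0.0], sigmoid),  # output layer
]
y = network(layers, [1.0, -1.0])
```

Note that each layer is a composition of an affine map with an elementwise nonlinearity; it is exactly this composition that the interval analysis in Section 4 propagates bounds through.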
3.2 Problem Formulation
We start by defining the neural network output set that will be of interest throughout the rest of this paper.
Definition 3.1
Given an input set $\mathcal{X}$, the output set of neural network (5) is defined as

(6) $\mathcal{Y} = \{y \mid y = \Phi(x),\ x \in \mathcal{X}\}.$

The safety specification of a neural network is expressed by a set defined in the output space, describing the safety requirement.

Definition 3.2
A safety specification $\mathcal{S}$ formalizes the safety requirements for the output $y$ of neural network (5), and is a predicate over the output $y$ of neural network (5). The neural network (5) is safe if and only if the following condition is satisfied:

(7) $\mathcal{Y} \cap \neg\mathcal{S} = \emptyset$

where $\mathcal{Y}$ is the output set defined by (6), and $\neg$ is the symbol for logical negation.
The safety verification problem for the neural network (5) is stated as follows.
Problem 3.1
Given a neural network in the form of (5), an input set $\mathcal{X}$ and a safety specification $\mathcal{S}$, check whether condition (7) is satisfied.

The key to solving the safety verification problem, Problem 3.1, is computing the output set $\mathcal{Y}$. However, since neural networks are often nonlinear and nonconvex, it is extremely difficult to compute the exact output set $\mathcal{Y}$. Rather than directly computing the exact output set for a neural network, a more practical and feasible way for safety verification is to derive an overapproximation of $\mathcal{Y}$.
Definition 3.3
A set $\tilde{\mathcal{Y}}$ is an overapproximation of $\mathcal{Y}$ if $\mathcal{Y} \subseteq \tilde{\mathcal{Y}}$ holds.
The following lemma shows that it is sufficient to use the overapproximated output set $\tilde{\mathcal{Y}}$ for the safety verification of a neural network.

Lemma 3.1
Consider a neural network in the form of (5) and a safety specification $\mathcal{S}$. The neural network is safe if the following condition is satisfied:

(8) $\tilde{\mathcal{Y}} \cap \neg\mathcal{S} = \emptyset$

where $\mathcal{Y} \subseteq \tilde{\mathcal{Y}}$.

Proof. Due to $\mathcal{Y} \subseteq \tilde{\mathcal{Y}}$, (8) implies $\mathcal{Y} \cap \neg\mathcal{S} = \emptyset$. $\square$
From Lemma 3.1, the problem becomes how to construct an appropriate overapproximation $\tilde{\mathcal{Y}}$. One natural way, as in the method developed in [3], is to find a set $\tilde{\mathcal{Y}}$ that is as small as possible, tightly overapproximating the output set $\mathcal{Y}$, and then perform safety verification. However, this idea can be computationally expensive, and in fact most of the computation is unnecessary for safety verification. In the following, a specification-guided approach is developed in which the overapproximation of the output set is computed adaptively with respect to a given safety specification.
4 Safety Verification
4.1 Preliminaries and Notation
Let $[x] = [\underline{x}, \overline{x}]$ and $[y] = [\underline{y}, \overline{y}]$ be real compact intervals and let $\circ$ be one of the basic operations of addition, subtraction, multiplication and division, respectively, for real numbers, that is $\circ \in \{+, -, \cdot, /\}$, where it is assumed that $0 \notin [y]$ in the case of division. We define these operations for intervals $[x]$ and $[y]$ by $[x] \circ [y] = \{x \circ y \mid x \in [x],\, y \in [y]\}$. The width of an interval $[x]$ is defined and denoted by $w([x]) = \overline{x} - \underline{x}$. The set of compact intervals in $\mathbb{R}$ is denoted by $\mathbb{IR}$. We say $[f]$ is an interval extension of a function $f$ if, for any degenerate interval argument, $[f]$ agrees with $f$, such that $[f]([x, x]) = f(x)$. In order to consider multidimensional problems where $x \in \mathbb{R}^n$ is taken into account, we denote $[x] = [x_1] \times \cdots \times [x_n] \in \mathbb{IR}^n$, where $\mathbb{IR}^n$ denotes the set of compact intervals in $\mathbb{R}^n$. The width of an interval vector $[x]$ is the largest of the widths of any of its component intervals, $w([x]) = \max_i w([x_i])$. A mapping $[f] : \mathbb{IR}^n \to \mathbb{IR}^m$ denotes the interval extension of a function $f : \mathbb{R}^n \to \mathbb{R}^m$. An interval extension $[f]$ is inclusion monotonic if, for any $[x], [y] \in \mathbb{IR}^n$, $[x] \subseteq [y]$ implies $[f]([x]) \subseteq [f]([y])$. A fundamental property of inclusion monotonic interval extensions is that $f([x]) \subseteq [f]([x])$, which means the value of $f(x)$ is contained in the interval $[f]([x])$ for every $x$ in $[x]$.
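As a small illustration, the basic interval operations and the inclusion-monotonicity property can be sketched in Python. The function names are ours, not from [18]:

```python
def i_add(a, b):
    # [a] + [b] = [a_lo + b_lo, a_hi + b_hi]
    return (a[0] + b[0], a[1] + b[1])

def i_mul(a, b):
    # [a] * [b]: take min/max over the four endpoint products.
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def width(a):
    # w([a]) = a_hi - a_lo
    return a[1] - a[0]

# Inclusion monotonicity on a sample: [x] subset of [y] implies
# [f]([x]) subset of [f]([y]) for the natural extension of f(t) = t*t.
x, y = (1.0, 2.0), (0.0, 3.0)
fx = i_mul(x, x)   # -> (1.0, 4.0)
fy = i_mul(y, y)   # -> (0.0, 9.0)
contains = fy[0] <= fx[0] and fx[1] <= fy[1]
```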
Several useful definitions and lemmas are presented.
Definition 4.1
[18] Piecewise monotone functions, including exponential, logarithm, rational power, absolute value, and trigonometric functions, constitute the set of standard functions.
Lemma 4.1
[18] A function composed of finitely many elementary operations and standard functions is inclusion monotonic.
Definition 4.2
[18] An interval extension $[f]$ is said to be Lipschitz in $[x_0]$ if there is a constant $K$ such that $w([f]([x])) \leq K\, w([x])$ for every $[x] \subseteq [x_0]$.
Lemma 4.2
[18] If a function $f(x)$ satisfies an ordinary Lipschitz condition in $[x_0]$,

(9) $|f(x) - f(y)| \leq K\, |x - y|, \quad x, y \in [x_0],$

then the natural interval extension $[f]$ is a Lipschitz interval extension in $[x_0]$,

(10) $w([f]([x])) \leq K\, w([x]), \quad [x] \subseteq [x_0].$
The following mild assumption is imposed on the activation functions.
Assumption 4.1
The activation functions considered in this paper are composed of finitely many elementary operations and standard functions.
Based on Assumption 4.1, the following result can be obtained for a feedforward neural network.
Theorem 4.1
The interval extension $[\Phi]$ of a neural network whose activation functions satisfy Assumption 4.1 is inclusion monotonic and Lipschitz, such that

(11) $w([\Phi]([x])) \leq K\, w([x])$

where $K$ is a Lipschitz constant for all activation functions in $\Phi$.
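As a concrete instance of the Lipschitz property behind (11): the logistic function is Lipschitz with constant $1/4$, so under monotonicity its interval extension $[f]([x]) = [f(\underline{x}), f(\overline{x})]$ satisfies $w([f]([x])) \leq \tfrac{1}{4}\, w([x])$. A quick numeric check (our own sketch, not code from the paper):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_interval(lo, hi):
    # Monotone activation: the interval extension is [f(lo), f(hi)].
    return sigmoid(lo), sigmoid(hi)

# Verify w([f]([x])) <= (1/4) * w([x]) on a few sample intervals;
# 1/4 is the maximum of the sigmoid's derivative, attained at 0.
checks = []
for lo, hi in [(-2.0, 1.0), (0.0, 0.5), (-10.0, 10.0)]:
    f_lo, f_hi = sigmoid_interval(lo, hi)
    checks.append(f_hi - f_lo <= 0.25 * (hi - lo))
```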
4.2 Interval Analysis
First, we consider a single layer $\ell$. Given an interval input $[x_{\ell-1}]$, the interval extension is $[f_\ell]([x_{\ell-1}]) = [\underline{x}_\ell, \overline{x}_\ell]$, where

(12) $\underline{x}_\ell = \min\{ f_\ell(W_\ell x + \theta_\ell) \mid x \in [x_{\ell-1}] \}$

(13) $\overline{x}_\ell = \max\{ f_\ell(W_\ell x + \theta_\ell) \mid x \in [x_{\ell-1}] \}$

with the minimum and maximum taken elementwise. To compute the interval extension $[f_\ell]$, we need to compute the minimum and maximum values of the output of the nonlinear function $f_\ell$. For general nonlinear functions, these optimization problems are still challenging. Typical activation functions, such as ReLU, logistic, tanh, exponential linear unit, and linear functions, satisfy the following monotonicity assumption.

Assumption 4.2
For any two scalars $x \leq y$, the activation function satisfies $f(x) \leq f(y)$.

Assumption 4.2 is a common property satisfied by a variety of activation functions. For example, it is easy to verify that the most commonly used activation functions, such as logistic, tanh and ReLU, all satisfy Assumption 4.2. Taking advantage of the monotonicity of $f$, the interval extension of a scalar interval $[x] = [\underline{x}, \overline{x}]$ is $[f]([x]) = [f(\underline{x}), f(\overline{x})]$. Therefore, $\underline{x}_\ell$ and $\overline{x}_\ell$ in (12) and (13) can be explicitly written out, for each neuron $i$, as

(14) $\underline{x}_{\ell,i} = f_\ell(p_{\ell,i})$

(15) $\overline{x}_{\ell,i} = f_\ell(q_{\ell,i})$

with $p_{\ell,i}$ and $q_{\ell,i}$ defined by

(16) $p_{\ell,i} = \sum_{j=1}^{n_{\ell-1}} \min\{\omega_{\ell,ij}\,\underline{x}_{\ell-1,j},\; \omega_{\ell,ij}\,\overline{x}_{\ell-1,j}\} + \theta_{\ell,i}$

(17) $q_{\ell,i} = \sum_{j=1}^{n_{\ell-1}} \max\{\omega_{\ell,ij}\,\underline{x}_{\ell-1,j},\; \omega_{\ell,ij}\,\overline{x}_{\ell-1,j}\} + \theta_{\ell,i}$
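Expressions of the form (14)-(17) translate directly into code. The following minimal Python sketch, with made-up weights, propagates an input box through one layer with a monotone activation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_interval(W, theta, lo, hi, f):
    """Propagate the box [lo, hi] through f(W x + theta).

    For each neuron, p collects the minimal pre-activation and q the
    maximal one over the box; monotonicity of f then yields the output
    interval of the layer, cf. equations (14)-(17).
    """
    out_lo, out_hi = [], []
    for row, th in zip(W, theta):
        p = th + sum(min(w * l, w * u) for w, l, u in zip(row, lo, hi))
        q = th + sum(max(w * l, w * u) for w, l, u in zip(row, lo, hi))
        out_lo.append(f(p))
        out_hi.append(f(q))
    return out_lo, out_hi

# One layer with two neurons over the input box [0,1] x [-1,1]
# (illustrative weights, not from the paper).
W = [[1.0, -2.0], [0.5, 0.5]]
theta = [0.0, -0.25]
lo, hi = layer_interval(W, theta, [0.0, -1.0], [1.0, 1.0], sigmoid)
```

A useful sanity check on such an implementation is the degenerate case: when the input box collapses to a point, the output interval must collapse to the exact forward pass.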
From (14)–(17), the output interval of a single layer can be efficiently computed with these explicit expressions. We then consider a feedforward neural network with multiple layers; the interval extension $[\Phi]$ can be computed by the following layer-by-layer computation.
Theorem 4.2
Consider feedforward neural network (5) with an interval input $[x_0]$. Let $\eta_0 = [x_0]$ and $\eta_\ell = [f_\ell](\eta_{\ell-1})$, $\ell = 1, \ldots, L$, where each layer extension $[f_\ell]$ is evaluated via (14)–(17). Then the interval extension of the network satisfies

(18) $[\Phi]([x_0]) = \eta_L.$

Proof. We denote $\eta_\ell = [f_\ell](\eta_{\ell-1})$. For a feedforward neural network, it essentially has $\Phi = f_L \circ f_{L-1} \circ \cdots \circ f_1$, which leads to (18). Then, for each layer, the interval extension can be obtained directly from (14)–(17). $\square$
We denote the set image of the neural network as

(23) $\Phi([x_0]) = \{\Phi(x) \mid x \in [x_0]\}.$

Since $[\Phi]$ is inclusion monotonic according to Theorem 4.1, one has $\Phi([x_0]) \subseteq [\Phi]([x_0])$. Thus, by Lemma 3.1, it is sufficient to claim the neural network is safe if $[\Phi]([x_0]) \cap \neg\mathcal{S} = \emptyset$ holds.
According to the explicit expressions (14)–(18), the computation of the interval extension is fast. Next, we discuss the conservativeness of the outcome of (18). We have $[\Phi]([x]) = \Phi([x]) + E([x])$ for some interval-valued function $E([x])$ that accounts for the overestimation.

Definition 4.3
We call $w(E([x]))$ the excess width of the interval extension $[\Phi]$ of neural network (5).
Explicitly, the excess width measures the conservativeness of the interval extension $[\Phi]$ with respect to its corresponding function $\Phi$. The following theorem gives an upper bound on the excess width $w(E([x]))$.

Theorem 4.3
Consider feedforward neural network (5) with an interval input $[x]$; the excess width satisfies

(24) $w(E([x])) \leq \gamma\, w([x])$

where $\gamma$ is a constant determined by the weight matrices and the Lipschitz constants of the activation functions.
Given a neural network, which means the weights and thus $\gamma$ are fixed, Theorem 4.3 implies that a less conservative result can only be obtained by reducing the width of the input interval $[x]$. On the other hand, a smaller $w([x])$ means more subdivisions of an input interval, which brings more computational cost. Therefore, how to generate appropriate subdivisions of an input interval is the key to safety verification of neural networks in the framework of interval analysis. In the next section, an efficient specification-guided method is proposed to address this problem.
4.3 SpecificationGuided Safety Verification
Inspired by the Moore-Skelboe algorithm [6], we propose a specification-guided algorithm, which generates fine subdivisions particularly with respect to the specification and avoids unnecessary subdivisions of the input interval during safety verification; see Algorithm 1.
The implementation of the specification-guided algorithm shown in Algorithm 1 checks that the intersection between the output set and the unsafe region is empty, within a predefined tolerance $\varepsilon$. This is accomplished by dividing the initial input interval into increasingly smaller subintervals and checking each of them.

Initialization. Set a tolerance $\varepsilon > 0$. Since our approach is based on interval analysis, convert the input set $\mathcal{X}$ to an interval $[x]$ such that $\mathcal{X} \subseteq [x]$. Compute the initial output interval $[\Phi]([x])$. Initialize the working set $\mathcal{M} = \{[x]\}$.

Specification-guided bisection. This is the key step of the algorithm. Select an element $[x']$ from $\mathcal{M}$ for specification-guided bisection. If the output interval of subinterval $[x']$ has no intersection with the unsafe region, we can discard this subinterval from subsequent dividing and checking, since it has been proven safe. Otherwise, the bisection action is activated to produce finer subdivisions, which are added to $\mathcal{M}$ for subsequent checking. The bisection process is guided by the given safety specification, since the activation of bisection actions is totally determined by the non-emptiness of the intersection between output interval sets and the given unsafe region. This distinguishing feature leads to finer subdivisions where the output set gets close to the unsafe region; on the other hand, coarse subdivisions are sufficient for safety verification where the output set is far away from the unsafe area. Therefore, unnecessary computational cost can be avoided. In the experiments section, it will be clearly observed how the bisection actions are guided by the safety specification in a numerical example.

Termination. The specification-guided bisection procedure continues until $\mathcal{M} = \emptyset$, which means all subintervals have been proven safe, or until the width of the subdivisions becomes less than the predefined tolerance $\varepsilon$, which leads to an uncertain conclusion about safety. Finally, when Algorithm 1 outputs an uncertain verification result, we can select a smaller tolerance $\varepsilon$ and rerun the safety verification.
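The three steps above can be sketched as follows. This is a minimal Python rendering of Algorithm 1 under our own naming: `output_interval` stands in for the layer-by-layer computation (18), and the one-dimensional map and unsafe region at the bottom are illustrative stand-ins, not the paper's networks:

```python
def bisect(box):
    # Split the box along its widest dimension (the bisection action).
    i = max(range(len(box)), key=lambda k: box[k][1] - box[k][0])
    lo, hi = box[i]
    mid = 0.5 * (lo + hi)
    left, right = list(box), list(box)
    left[i], right[i] = (lo, mid), (mid, hi)
    return left, right

def box_width(box):
    return max(hi - lo for lo, hi in box)

def verify(box, output_interval, intersects_unsafe, eps):
    """Specification-guided verification: returns 'safe' or 'uncertain'."""
    queue = [box]                         # the working set M
    while queue:
        b = queue.pop()
        if not intersects_unsafe(output_interval(b)):
            continue                      # subinterval proven safe, discard
        if box_width(b) < eps:
            return "uncertain"            # tolerance reached
        queue.extend(bisect(b))           # specification-guided bisection
    return "safe"

def phi_interval(box):
    # Natural interval extension of the toy map Phi(x) = x*x - x.
    lo, hi = box[0]
    sq_lo = min(lo * lo, lo * hi, hi * hi)
    sq_hi = max(lo * lo, lo * hi, hi * hi)
    return (sq_lo - hi, sq_hi - lo)       # [x]^2 - [x]

# Unsafe region {y < -1}; the true range of x^2 - x on [-1.5, 1.5] is
# [-0.25, 3.75], so the property holds, but the initial interval
# (-3.75, 3.75) is too coarse to show it without bisection.
unsafe = lambda y: y[0] < -1.0
result = verify([(-1.5, 1.5)], phi_interval, unsafe, eps=1e-3)
```

Note how the queue only grows where the output interval still touches the unsafe region; boxes whose image is clearly safe are discarded immediately, which is exactly what produces the nonuniform partitions seen in the experiments.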
5 Experiments
5.1 Random Neural Network
To demonstrate how the specification-guided idea works in safety verification, we consider a neural network with two inputs and two outputs. The neural network has 5 hidden layers, and each layer contains 10 neurons. The weight matrices and bias vectors are randomly generated. The input set is assumed to be an interval $[x]$ in the input space, and the unsafe region $\neg\mathcal{S}$ is a prescribed set in the output space.
Table 1: Comparison with the uniform-partition approach of [3].

Method                   Intervals    Computational Time
Algorithm 1              4095         21.45 s
Xiang et al. 2018 [3]    111556       294.37 s
We execute Algorithm 1 with termination parameter $\varepsilon$; safety can be guaranteed by partitioning the input interval into 4095 interval sets. The specification-guided partition of the input space is shown in Figure 1: a nonuniform input space partition is generated by the specification-guided scheme, and an obvious specification-guided effect can be observed. The specification-guided method requires much less computation than the approach in [3], which uses a uniform partition of the input space; a comparison is listed in Table 1. The computation is carried out using Matlab 2017 on a personal computer with Windows 7, Intel Core i5-4200U, 1.6 GHz, 4 GB RAM. It can be seen that the number of interval sets and the computational time are significantly reduced, to 3.67% and 7.28%, respectively, of those needed in [3]. Figure 2 illustrates the union of the 4095 output interval sets, which has no intersection with the unsafe region, showing that the safety specification is verified. Figure 2 also shows that the output interval estimation is guided to be tight when it comes close to the unsafe region, while a coarse estimation is sufficient to verify safety far away from the unsafe area.
5.2 Robotic Arm Model
In [3], the forward kinematics of a robotic arm model with two joints is learned, as shown in Figure 3. The learning task is to use a feedforward neural network to predict the position of the end effector given the joint angles. The input space of joint angles is classified into three zones of operation: a normal working zone, a buffering zone and a forbidden zone. The detailed formulation of this robotic arm model and of the neural network training can be found in [3].

The safety specification for the position is a prescribed set $\mathcal{S}$ in the output space. The input set of the robotic arm is the union of the normal working and buffering zones. From the safety point of view, we need to verify that all outputs produced by inputs in the normal working zone and buffering zone satisfy safety specification $\mathcal{S}$. In [3], a uniform partition of the input space is used, and 729 intervals are produced to verify the safety property. Using our specification-guided approach, safety can be guaranteed by partitioning the input space into only 15 intervals; see Figure 4 and Figure 5. Due to the small number of intervals involved in the verification process, the computational time of the specification-guided approach is only 0.27 seconds.
5.3 Handwriting Image Recognition
In this handwriting image recognition task, we use 5000 training examples of handwritten digits, a subset of the MNIST handwritten digit dataset (http://yann.lecun.com/exdb/mnist/); examples from the dataset are shown in Figure 6. Each training example is a 20 pixel by 20 pixel grayscale image of a digit, and each pixel is represented by a floating point number indicating the grayscale intensity at that location. We first train a neural network with 400 inputs, one hidden layer with 25 neurons, and 10 output units corresponding to the 10 digits. The activation functions for both the hidden and output layers are sigmoid functions. A trained neural network with about 97.5% accuracy is obtained.
Under adversarial perturbations, the neural network may produce a wrong prediction. For example, for the image in Figure 7(a), the label predicted by the neural network changes when a perturbation from a certain class attacks the top-left corner of the image. With our developed verification method, we wish to prove that the neural network is robust to certain classes of perturbations, that is, no perturbation belonging to those classes can alter the prediction of the neural network for a perturbed image. Since there exists an adversarial example among the perturbations at the top-left corner in Figure 7(a), the image is not robust to that class of perturbations. We therefore consider another class of perturbations at the top-left corner; see Figure 7(b). Using Algorithm 1, the neural network can be proved to be robust to all perturbations in this class located at the top-left corner of the image, after 512 bisections.
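In interval terms, a class of perturbations over a pixel region is simply an input box for Algorithm 1: the perturbed pixels range over an interval, while all remaining pixels are degenerate intervals. A hypothetical sketch follows; the image size, region, and bound are made up for illustration (the paper's images are 20 by 20):

```python
def perturbation_box(image, region, delta):
    """Build the input interval for the verifier: pixels in `region`
    may vary by up to +/- delta (clipped to [0, 1]); all other pixels
    are fixed (degenerate intervals)."""
    box = []
    for idx, pixel in enumerate(image):
        if idx in region:
            box.append((max(0.0, pixel - delta), min(1.0, pixel + delta)))
        else:
            box.append((pixel, pixel))  # pixel unchanged
    return box

# Hypothetical 4x4 image; perturb the 2x2 top-left corner by up to 0.1.
image = [0.5] * 16
region = {0, 1, 4, 5}
box = perturbation_box(image, region, 0.1)
```

Robustness then amounts to checking that, over this box, the output interval of the correct class cannot be overtaken by any other class, which is exactly the unsafe-region test used by the bisection algorithm.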
Moreover, applying Algorithm 1 to all 5000 images with perturbations belonging to this class and located at the top-left corner, it can be verified that the neural network is robust to this class of perturbations for all images. This means that this class of perturbations does not affect the prediction accuracy of the neural network: the network maintains its 97.5% accuracy even when subject to any perturbation belonging to this class.
6 Conclusion and Future Work
In this paper, we introduce a specification-guided approach for safety verification of feedforward neural networks with general activation functions. By formulating the safety verification problem in the framework of interval analysis, a fast computation formula for calculating output intervals of feedforward neural networks is developed. Then, a specification-guided safety verification algorithm is developed; the algorithm is specification-guided since the activation of bisection actions is totally determined by the existence of intersections between the computed output intervals and the unsafe sets. This distinguishing feature enables the specification-guided approach to avoid unnecessary computations and significantly reduce the computational cost. Several experiments are presented to show the advantages of our approach.
Though our approach is general in the sense that it is not tailored to specific activation functions, the specification-guided idea has the potential to be further applied to other methods dealing with specific activation functions, such as ReLU neural networks, to enhance their scalability. Moreover, since our approach can compute the output intervals of a neural network, it can be incorporated with other reachable set estimation methods to analyze dynamical system models with neural network components, such as neural network models of nonlinear dynamic systems [19] and closed-loop systems with neural network controllers [20].
References
 [1] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” in International Conference on Learning Representations, 2014.
 [2] G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer, “Reluplex: An efficient SMT solver for verifying deep neural networks,” in International Conference on Computer Aided Verification, pp. 97–117, Springer, 2017.
 [3] W. Xiang, H.D. Tran, and T. T. Johnson, “Output reachable set estimation and verification for multilayer neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 11, pp. 5777–5783, 2018.
 [4] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith, “Counterexampleguided abstraction refinement,” in International Conference on Computer Aided Verification, pp. 154–169, Springer, 2000.
 [5] N. Een, A. Mishchenko, and R. Brayton, “Efficient implementation of property directed reachability,” in Proceedings of the International Conference on Formal Methods in ComputerAided Design, pp. 125–134, FMCAD Inc, 2011.
 [6] S. Skelboe, “Computation of rational interval functions,” BIT Numerical Mathematics, vol. 14, no. 1, pp. 87–95, 1974.
 [7] W. Xiang, H.D. Tran, and T. T. Johnson, “Reachable set computation and safety verification for neural networks with ReLU activations,” arXiv preprint arXiv:1712.08163, 2017.
 [8] T. Gehr, M. Mirman, D. DrachslerCohen, P. Tsankov, S. Chaudhuri, and M. Vechev, “AI2: Safety and robustness certification of neural networks with abstract interpretation,” in Security and Privacy (SP), 2018 IEEE Symposium on, 2018.
 [9] R. Ehlers, “Formal verification of piecewise linear feedforward neural networks,” in International Symposium on Automated Technology for Verification and Analysis, pp. 269–286, Springer, 2017.
 [10] A. Lomuscio and L. Maganti, “An approach to reachability analysis for feedforward ReLU neural networks,” arXiv preprint arXiv:1706.07351, 2017.
 [11] S. Dutta, S. Jha, S. Sankaranarayanan, and A. Tiwari, “Output range analysis for deep neural networks,” arXiv preprint arXiv:1709.09130, 2017.
 [12] S. Dutta, S. Jha, S. Sankaranarayanan, and A. Tiwari, “Output range analysis for deep feedforward neural networks,” in NASA Formal Methods Symposium, pp. 121–138, Springer, 2018.
 [13] L. Pulina and A. Tacchella, “An abstractionrefinement approach to verification of artificial neural networks,” in International Conference on Computer Aided Verification, pp. 243–257, Springer, 2010.
 [14] L. Pulina and A. Tacchella, “Challenging SMT solvers to verify neural networks,” AI Communications, vol. 25, no. 2, pp. 117–135, 2012.
 [15] M. Fränzle and C. Herde, “HySAT: An efficient proof engine for bounded model checking of hybrid systems,” Formal Methods in System Design, vol. 30, no. 3, pp. 179–198, 2007.
 [16] X. Huang, M. Kwiatkowska, S. Wang, and M. Wu, “Safety verification of deep neural networks,” in International Conference on Computer Aided Verification, pp. 3–29, Springer, 2017.
 [17] W. Ruan, X. Huang, and M. Kwiatkowska, “Reachability analysis of deep neural networks with provable guarantees,” arXiv preprint arXiv:1805.02242, 2018.
 [18] R. E. Moore, R. B. Kearfott, and M. J. Cloud, Introduction to Interval Analysis, vol. 110. SIAM, 2009.
 [19] W. Xiang, D. M. Lopez, P. Musau, and T. T. Johnson, “Reachable set estimation and verification for neural network models of nonlinear dynamic systems,” in Safe, Autonomous and Intelligent Vehicles, pp. 123–144, Springer, 2019.
 [20] W. Xiang, H. Tran, J. A. Rosenfeld, and T. T. Johnson, “Reachable set estimation and safety verification for piecewise linear systems with neural network controllers,” in 2018 Annual American Control Conference (ACC), pp. 1574–1579, June 2018.