1 Introduction
Artificial neural networks have been widely used in machine learning systems. Applications include adaptive control [1, 2, 3, 4, 5, 6, 7, 8], game playing [9], autonomous vehicles [10], and many others. Although neural networks have shown their effectiveness and power in solving complex problems, they are confined to systems that comply only with the lowest safety integrity levels, since, most of the time, a neural network is viewed as a black box without effective methods to assure safety specifications for its outputs. Neural networks are trained over a finite number of input and output data, and are expected to generalize, producing desirable outputs even for previously unseen inputs. However, in many practical applications the number of possible inputs is essentially infinite, which means it is impossible to check all of them by experiments alone; moreover, it has been observed that neural networks can react in unexpected and incorrect ways to even slight perturbations of their inputs [11], which can result in unsafe systems. Hence, methods that provide formal guarantees are in great demand for verifying specifications or properties of neural networks. Verifying neural networks is a hard problem: even simple properties about them have been proven NP-complete [12]. The difficulties mainly come from the presence of activation functions and from complex network structures, which make neural networks large-scale, nonlinear, nonconvex and thus incomprehensible to humans. Until now, only a few results have been reported on verifying neural networks. The verification of feedforward multilayer neural networks is investigated based on Satisfiability Modulo Theories (SMT) in [13, 14]. In [15], an abstraction-refinement approach is proposed for the verification of specific networks known as Multi-Layer Perceptrons
(MLPs). In [12], a specific kind of activation function called the Rectified Linear Unit (ReLU) is considered for the verification of neural networks. A simulation-based approach is developed in [16], which turns the reachable set estimation problem into a neural network maximal-sensitivity computation problem, described in terms of a chain of convex optimization problems. Additionally, some recent reachable set estimation results are reported for neural networks
[17, 18, 19]. These results, which are based on Lyapunov functions, certainly have the potential to be further extended to safety verification.
A neural network is comprised of a set of layers of neurons; each neuron forms a linear combination of the values from nodes in the preceding layer and applies an activation function to the result. These activation functions are usually nonlinear. In this work, we focus on ReLU activation functions [20], which are widely used in many neural networks [21, 22, 23, 13]. A ReLU function is piecewise linear: it returns zero when its input is negative, meaning the node is inactive, and returns the value unchanged when the node is active with a positive input. This piecewise linearity gives ReLU networks several advantages, such as a faster training process and avoidance of the vanishing gradient problem. For the output reachable set computation and verification problems addressed in this paper, this piecewise linearity also plays a fundamental role in the computing procedure.
The main contribution of this work is to develop an approach for computing the output reachable set of ReLU neural networks and to further apply it to safety verification problems. Under the assumption that the initial set is described by a union of polyhedra, the output reachable set is computed layer-by-layer via a set of manipulations of polyhedra. For a ReLU function, there are three cases from the point of view of the output vectors:

Case 1: All the elements in the input vector are positive, thus the output is exactly equivalent to the input;

Case 2: All the elements are nonpositive, so the ReLU function produces a zero vector according to its definition;

Case 3: The input vector has both positive and nonpositive elements. This is a much more intricate case, for which it will be proved later that the outputs belong to a union of polyhedra, which is essentially nonconvex.
The above three cases fully characterize the output behaviors of a ReLU function and form the basic idea of computing the output reachable set for neural networks comprised of ReLU neurons. With this classification and a complete reachability analysis for ReLU functions, the output reachable set of a ReLU layer can be obtained case by case, expressed as a union of polyhedra. Then, the approach is generalized from a single layer to a neural network consisting of multiple layers for the output reachable set computation of neural networks. Finally, safety verification can be performed by checking whether there is a nonempty intersection between the output reachable set and the unsafe regions. Since the output reachable set computed in this work is exact with respect to an input set, the verification results are sound for both safe and unsafe conclusions. The main benefit of our approach is that all the computation processes are formulated in terms of operations on polyhedra, which can be efficiently carried out by existing tools for manipulating polyhedra.
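As a small illustration of how these three cases act on concrete input vectors, the following sketch classifies inputs and applies the ReLU elementwise (the helper names `relu` and `relu_case` are ours, for illustration only):

```python
import numpy as np

def relu(x):
    """Elementwise ReLU: max(0, x)."""
    return np.maximum(0.0, x)

def relu_case(x):
    """Classify an input vector into the three cases described above."""
    if np.all(x > 0):
        return 1   # Case 1: output equals input
    if np.all(x <= 0):
        return 2   # Case 2: output is the zero vector
    return 3       # Case 3: mixed signs, some coordinates are zeroed

x1 = np.array([0.5, 2.0])    # all positive
x2 = np.array([-1.0, 0.0])   # all nonpositive
x3 = np.array([-1.0, 3.0])   # mixed
print(relu_case(x1), relu(x1))  # case 1, output [0.5, 2.0]
print(relu_case(x2), relu(x2))  # case 2, output [0.0, 0.0]
print(relu_case(x3), relu(x3))  # case 3, output [0.0, 3.0]
```

Case 3 is the source of nonconvexity: the input set is split across sign patterns, and each pattern maps to a different polyhedral piece.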
The remainder of the paper is organized as follows. The preliminaries for ReLU neural networks and problem formulations are given in Section II. The output reachability analysis for ReLU functions is studied in Section III. The main results, reachable set computation and verification for ReLU neural networks, are presented in Section IV. A numerical example is given in Section V to illustrate our approach, and we conclude in Section VI.
Notations: $\mathbb{R}$ denotes the field of real numbers, $\mathbb{R}^{n}$ stands for the vector space of all $n$-tuples of real numbers, and $\mathbb{R}^{n \times m}$ is the space of $n \times m$ matrices with real entries. $\mathrm{diag}\{A_1, \ldots, A_m\}$ denotes a block-diagonal matrix with blocks $A_1, \ldots, A_m$. $\left\| x \right\|_{\infty}$ stands for the infinity norm of a vector $x \in \mathbb{R}^{n}$, defined as $\left\| x \right\|_{\infty} = \max_{i} \left| x_i \right|$. $A^{\top}$ denotes the transpose of matrix $A$.
2 Preliminaries and Problem Formulation
This section presents the mathematical model of neural networks with ReLU activations considered in this paper and formulates the problems to be studied.
2.1 Neural Networks with ReLU Activations
A neural network consists of a number of interconnected neurons. Each neuron is a simple processing element that responds to the weighted inputs it receives from other neurons. In this paper, we consider the most popular and general class of feedforward neural networks, the Multi-Layer Perceptron (MLP). Generally, an MLP consists of three typical classes of layers: an input layer, which serves to pass the input vector to the network; hidden layers of computation neurons; and an output layer, composed of at least one computation neuron, which produces the output vector.
The action of a neuron depends on its activation function, which is described as
$$y_i = f\left( \sum_{j=1}^{n} \omega_{ij} x_j + \theta_i \right) \quad (1)$$
where $x_j$ is the $j$th input of the $i$th neuron, $\omega_{ij}$ is the weight from the $j$th input to the $i$th neuron, $\theta_i$ is called the bias of the $i$th neuron, $y_i$ is the output of the $i$th neuron, and $f(\cdot)$ is the activation function. The activation function is generally a nonlinear function describing the reaction of the $i$th neuron to the inputs $x_j$, $j = 1, \ldots, n$. Typical activation functions include the rectified linear unit, logistic, tanh, exponential linear unit, and linear functions, among others.
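The neuron response in (1) can be written directly in code (an illustrative sketch; the helper `neuron_output` and the numeric values are ours):

```python
import numpy as np

def neuron_output(weights, inputs, bias, activation):
    """Response of one neuron: activation of the weighted sum plus bias, as in (1)."""
    return activation(np.dot(weights, inputs) + bias)

relu = lambda z: max(0.0, z)  # scalar ReLU activation

# Weighted sum: 0.5*2.0 + (-1.0)*1.0 + 0.5 = 0.5, then relu(0.5) = 0.5
y = neuron_output(np.array([0.5, -1.0]), np.array([2.0, 1.0]), 0.5, relu)
print(y)  # 0.5
```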
An MLP has multiple layers; each layer $l$, $l \in \{1, \ldots, L\}$, has $n^{(l)}$ neurons. In particular, layer $0$ is used to denote the input layer and $n^{(0)}$ stands for the number of inputs in the rest of this paper, and layer $L$ stands for the last layer, that is the output layer. For a neuron $i$, $1 \le i \le n^{(l)}$, in layer $l$, the corresponding input vector is denoted by $x^{(l)}$ and the weight matrix is
$$W^{(l)} = \left[ \omega_{1}^{(l)}, \ldots, \omega_{n^{(l)}}^{(l)} \right]^{\top}$$
where $\omega_{i}^{(l)}$ is the weight vector of neuron $i$. The bias vector for layer $l$ is
$$\theta^{(l)} = \left[ \theta_{1}^{(l)}, \ldots, \theta_{n^{(l)}}^{(l)} \right]^{\top}.$$
The output vector of layer $l$ can be expressed as
$$y^{(l)} = f_{l}\left( W^{(l)} x^{(l)} + \theta^{(l)} \right)$$
where $f_{l}$ is the activation function for layer $l$.
For an MLP, the output of layer $l-1$ is the input of layer $l$, and the mapping from the input $x^{(0)}$ of the input layer to the output $y^{(L)}$ of the output layer stands for the input-output relation of the MLP, denoted by
$$y^{(L)} = F\left( x^{(0)} \right) \quad (2)$$
where $F(\cdot) = f_L \circ f_{L-1} \circ \cdots \circ f_1(\cdot)$.
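The layer-by-layer input-output mapping described above can be sketched as a short forward pass (an illustrative snippet; the helper `mlp_forward` and the example weights are ours, not from the paper):

```python
import numpy as np

def mlp_forward(x, layers):
    """Forward pass y = F(x): each layer applies an affine map, then its activation."""
    for W, theta, f in layers:
        x = f(W @ x + theta)
    return x

relu = lambda z: np.maximum(0.0, z)
identity = lambda z: z

layers = [
    (np.array([[1.0, -1.0], [0.0, 2.0]]), np.array([0.0, -1.0]), relu),  # hidden ReLU layer
    (np.array([[1.0, 1.0]]), np.array([0.5]), identity),                 # linear output layer
]
# Hidden: relu([1-2, 4-1]) = relu([-1, 3]) = [0, 3]; output: 0 + 3 + 0.5 = 3.5
y = mlp_forward(np.array([1.0, 2.0]), layers)
print(y)  # [3.5]
```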
In this work, we aim at a class of activation functions called the Rectified Linear Unit (ReLU), which is expressed as below:
$$\mathrm{relu}(x) = \max\{0,\, x\}. \quad (3)$$
Thus, the output of a neuron considered in (1) can be rewritten as
$$y_i = \mathrm{relu}\left( \sum_{j=1}^{n} \omega_{ij} x_j + \theta_i \right) \quad (4)$$
and the corresponding output vector of layer $l$ becomes
$$y^{(l)} = \mathrm{relu}\left( W^{(l)} x^{(l)} + \theta^{(l)} \right) \quad (5)$$
in which the ReLU function is applied elementwise.
In most real applications, an MLP is viewed as a black box that generates a desirable output with respect to a given input. However, regarding property verification such as safety verification, it has been observed that even a well-trained neural network can react in unexpected and incorrect ways to slight perturbations of its inputs, which can result in unsafe systems. Thus, the output reachable set estimation of an MLP, which covers all possible values of the outputs, is necessary for the safety verification of an MLP and for drawing a safe or unsafe conclusion about it.
2.2 Problem Formulation
Given an input set $\mathcal{X}$, the reachable set of neural network (2) is stated by the following definition.
Definition 1
Given input set $\mathcal{X}$, the output reachable set of neural network (2) is defined by
$$\mathcal{Y} = \left\{ y^{(L)} \mid y^{(L)} = F\left( x^{(0)} \right),\ x^{(0)} \in \mathcal{X} \right\}. \quad (6)$$
In our work, the input set is considered to be a union of polyhedra, expressed as $\mathcal{X} = \bigcup_{s=1}^{N} \mathcal{X}_s$, where the $\mathcal{X}_s$, $s = 1, \ldots, N$, are described by
$$\mathcal{X}_s = \left\{ x \mid H_s x \le b_s \right\}. \quad (7)$$
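An input polyhedron of the form (7) is just a matrix-vector pair, and membership is a set of linear inequality checks. A minimal sketch (the helper `in_polyhedron` and the unit-box instance are ours):

```python
import numpy as np

def in_polyhedron(H, b, x, tol=1e-9):
    """Membership test for the H-representation {x | H x <= b} used in (7)."""
    return bool(np.all(H @ x <= b + tol))

# Unit box [0,1]^2 written as H x <= b (an illustrative instance).
H = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
print(in_polyhedron(H, b, np.array([0.5, 0.5])))  # True
print(in_polyhedron(H, b, np.array([1.5, 0.5])))  # False
```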
With respect to input set (7), the reachable set computation problem for neural network (2) with ReLU activations is given as below.
Problem 1
Given input set $\mathcal{X}$ in the form of (7), how does one compute the output reachable set $\mathcal{Y}$ of Definition 1 for neural network (2) with ReLU activations?
Then, we will focus on the safety verification for neural networks. The safety specification for the output is expressed by a set defined in the output space, describing the safety requirement. For example, in accordance with the input set, the safety region can also be considered as a union of polyhedra defined in the output space as $\mathcal{S} = \bigcup_{s=1}^{M} \mathcal{S}_s$, where the $\mathcal{S}_s$, $s = 1, \ldots, M$, are given by
$$\mathcal{S}_s = \left\{ y \mid C_s y \le d_s \right\}. \quad (8)$$
The safety region in the form of (8) formalizes the safety requirements for the output. If the output belongs to safety region $\mathcal{S}$, we say the neural network is safe; otherwise, it is called unsafe.
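Checking whether a given output lies in a safety region of the form (8) amounts to testing membership in at least one of the polyhedra. A minimal sketch (the helper `in_safety_region` and the example region are ours):

```python
import numpy as np

def in_safety_region(polys, y, tol=1e-9):
    """Check whether output y lies in a safety region given as a union of polyhedra.

    `polys` is a list of (C, d) pairs, each encoding one polyhedron {y | C y <= d}
    as in (8); membership in any single polyhedron suffices.
    """
    return any(bool(np.all(C @ y <= d + tol)) for C, d in polys)

# Illustrative safety region: the union of two unit boxes along the first axis.
C = np.vstack([np.eye(2), -np.eye(2)])
polys = [(C, np.array([1.0, 1.0, 0.0, 0.0])),    # box [0,1] x [0,1]
         (C, np.array([3.0, 1.0, -2.0, 0.0]))]   # box [2,3] x [0,1]
print(in_safety_region(polys, np.array([1.5, 0.5])))  # False: falls in the gap
```

Note the union is nonconvex: a point between the two boxes is unsafe even though it lies in their convex hull.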
Definition 2
Consider neural network (2) with input set $\mathcal{X}$ and safety region $\mathcal{S}$ of the form (8). The neural network is said to be safe if and only if $\mathcal{Y} \subseteq \mathcal{S}$, where $\mathcal{Y}$ is the output reachable set of Definition 1; otherwise, it is called unsafe.
Therefore, the safety verification problem for MLP with ReLU activations can be stated as follows.
Problem 2
Given input set $\mathcal{X}$ in the form of (7) and safety region $\mathcal{S}$ in the form of (8), how does one check whether neural network (2) with ReLU activations is safe in the sense of Definition 2?
The above two linked problems are the main concerns to be addressed in the rest of this paper. The crucial step is to find an efficient way to compute the output reachable set for a ReLU neural network with a given input set. In the next sections, the main results will be presented for the output reachable set computation and safety verification for ReLU neural networks.
3 Output Reach Set Computation of ReLU Functions
In this section, we consider the output reachable set of a single ReLU function $y = \mathrm{relu}(x)$, $x \in \mathbb{R}^{n}$, with an input set $\mathcal{X} \subseteq \mathbb{R}^{n}$. Before presenting the result, an indicator vector $v = [v_1, \ldots, v_n]^{\top}$, $v_i \in \{0, 1\}$, is introduced for the following derivation. In the indicator vector $v$, the element $v_i$ is valuated as below:
$$v_i = \begin{cases} 1, & x_i > 0 \\ 0, & x_i \le 0 \end{cases} \quad (9)$$
Considering all the valuations of the elements of $v$, there are $2^n$ possible valuations in total, which are indexed as $v_1, \ldots, v_{2^n}$ (with $v_1$ the all-zeros vector and $v_{2^n}$ the all-ones vector).
Furthermore, each indicator vector from $v_1$ to $v_{2^n}$ is diagonalized and denoted as $M_k = \mathrm{diag}(v_k)$, $k = 1, \ldots, 2^n$.
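The enumeration of indicator vectors and their diagonalizations can be sketched as follows (illustrative helper names; note that the 2^n growth restricts explicit enumeration to low dimensions):

```python
import numpy as np
from itertools import product

def indicator_vectors(n):
    """Enumerate all 2^n indicator vectors in {0, 1}^n."""
    return [np.array(bits) for bits in product([0, 1], repeat=n)]

def diagonalize(v):
    """M = diag(v); multiplying by M zeroes the coordinates with v_i = 0."""
    return np.diag(v)

vs = indicator_vectors(2)        # [0,0], [0,1], [1,0], [1,1]
M = diagonalize(np.array([1, 0]))
y = M @ np.array([5.0, 7.0])     # second coordinate is zeroed
print(y)  # [5. 0.]
```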
Now, we are ready to compute the output reachable set of the ReLU function with an input set $\mathcal{X}$. For the input set, we have the three cases listed below:

Case 1: All the elements of the input $x \in \mathcal{X}$ are positive, that is $x_i > 0$, $i = 1, \ldots, n$, corresponding to the indicator vector $v_{2^n}$. According to the definition of the ReLU function, the output set should be
$$\mathcal{Y}_1 = \left\{ x \in \mathcal{X} \mid x_i > 0,\ i = 1, \ldots, n \right\}. \quad (10)$$
Case 2: All the elements of the input are nonpositive, which means $x_i \le 0$, $i = 1, \ldots, n$, corresponding to the indicator vector $v_1$. By the definition of ReLU, it directly leads to
$$\mathcal{Y}_2 = \{ 0 \}, \quad \text{provided } \mathcal{X} \cap \left\{ x \mid x_i \le 0,\ i = 1, \ldots, n \right\} \ne \emptyset. \quad (11)$$
Case 3: The input has both positive and nonpositive elements, which corresponds to the indicator vectors $v_k$, $k \in \{2, \ldots, 2^n - 1\}$. Note that, for each such $v_k$, the element $[v_k]_i = 0$ indicates $x_i \le 0$ and $[v_k]_i = 1$ indicates $x_i > 0$, due to (9). With respect to each $v_k$, we define the set
$$\mathcal{X}_k = \left\{ x \in \mathcal{X} \mid x_i > 0 \text{ if } [v_k]_i = 1,\ x_i \le 0 \text{ if } [v_k]_i = 0 \right\}.$$
Due to the ReLU function, when $x_i \le 0$ the corresponding output element is set to $0$; thus the output for $\mathcal{X}_k$ should be $\left\{ M_k x \mid x \in \mathcal{X}_k \right\}$, where $M_k = \mathrm{diag}(v_k)$. Again, due to the ReLU function, the final value should be nonnegative, that is $y \ge 0$; thus this additional constraint has to be added, to have
$$\mathcal{Y}_{3,k} = \left\{ y \mid y = M_k x,\ x \in \mathcal{X}_k,\ y \ge 0 \right\}.$$
As a result, the output reachable set for this case is
$$\mathcal{Y}_3 = \bigcup_{k=2}^{2^n - 1} \mathcal{Y}_{3,k}. \quad (12)$$
An illustration of the above three cases is shown in Figure 1 with a two-dimensional input space. In Figure 1, (a) is for Case 1, (b) is for Case 2, and (c) and (d) are for Case 3. Summarizing the three cases for a ReLU function $y = \mathrm{relu}(x)$, the following theorem can be obtained.
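Case 3 is where the computation branches: each mixed sign pattern contributes one polyhedral piece. The sketch below enumerates the sign patterns of an input polyhedron and keeps the nonempty ones, checked with a feasibility LP. It relaxes the strict inequalities x_i > 0 to x_i >= 0, a common implementation simplification, and the helper names are ours:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

def relu_output_pieces(H, b):
    """For input polyhedron {x | H x <= b}, list the nonempty sign-pattern pieces.

    Each piece is (v, H_v, b_v): inputs in {x | H_v x <= b_v} have sign pattern v,
    and their ReLU image is obtained by applying diag(v).
    """
    n = H.shape[1]
    pieces = []
    for v in product([0, 1], repeat=n):
        rows, rhs = [H], [b]
        for i, vi in enumerate(v):
            e = np.zeros(n)
            e[i] = -1.0 if vi else 1.0  # vi=1: -x_i <= 0 (x_i >= 0); vi=0: x_i <= 0
            rows.append(e.reshape(1, -1))
            rhs.append(np.array([0.0]))
        Hv, bv = np.vstack(rows), np.concatenate(rhs)
        # Feasibility LP with zero objective: is the piece nonempty?
        res = linprog(np.zeros(n), A_ub=Hv, b_ub=bv, bounds=[(None, None)] * n)
        if res.status == 0:
            pieces.append((np.array(v), Hv, bv))
    return pieces

# The box [-1, 1]^2 crosses both axes, so all four sign patterns are nonempty.
H = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
pieces = relu_output_pieces(H, b)
print(len(pieces))  # 4
```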
Theorem 1
Given a ReLU function $y = \mathrm{relu}(x)$ with input set $\mathcal{X}$, its output reachable set is
$$\mathcal{Y} = \mathcal{Y}_1 \cup \mathcal{Y}_2 \cup \mathcal{Y}_3 \quad (13)$$
where $\mathcal{Y}_1$, $\mathcal{Y}_2$, $\mathcal{Y}_3$ are given by (10)-(12).
Proof. The proof is straightforwardly obtained from the derivation of the above three cases. The three cases completely characterize the behaviors of the ReLU function, since every input falls into exactly one of them. The case of $v_{2^n}$ produces $\mathcal{Y}_1$, and the case of $v_1$ leads to $\mathcal{Y}_2$. As to $v_k$, $k \in \{2, \ldots, 2^n - 1\}$, the output reachable set is $\mathcal{Y}_3$. Thus, the output reachable set is the union of the output sets of the three cases, that is $\mathcal{Y} = \mathcal{Y}_1 \cup \mathcal{Y}_2 \cup \mathcal{Y}_3$.
Theorem 1 gives a general result for the output reachable set of a ReLU function, since no restriction is imposed on the input set $\mathcal{X}$. In the following, we consider the input set as a union of polyhedra described by $\mathcal{X} = \bigcup_{s=1}^{N} \mathcal{X}_s$, where the $\mathcal{X}_s$ are given as
$$\mathcal{X}_s = \left\{ x \mid H_s x \le b_s \right\}. \quad (14)$$
Based on Theorem 1, the following result can be derived for input sets described by a union of polyhedra.
Theorem 2
Given a ReLU function $y = \mathrm{relu}(x)$ with an input set $\mathcal{X} = \bigcup_{s=1}^{N} \mathcal{X}_s$ in which each $\mathcal{X}_s$, $s = 1, \ldots, N$, is defined by (14), its output reachable set is
$$\mathcal{Y} = \bigcup_{s=1}^{N} \left( \mathcal{Y}_{1,s} \cup \mathcal{Y}_{2,s} \cup \mathcal{Y}_{3,s} \right) \quad (15)$$
where $\mathcal{Y}_{1,s}$, $\mathcal{Y}_{2,s}$, $\mathcal{Y}_{3,s}$ are the output sets of Cases 1-3 in Theorem 1 computed for the polyhedron $\mathcal{X}_s$, made explicit in the proof.
Proof. First, when all the elements of the input are positive, it has $\mathrm{relu}(x) = x$; thus the output set equals the input region, which is
$$\mathcal{Y}_{1,s} = \left\{ x \mid H_s x \le b_s,\ x_i > 0,\ i = 1, \ldots, n \right\}.$$
Then, if all the elements are nonpositive, the input belongs to the set defined by
$$\left\{ x \mid H_s x \le b_s,\ x_i \le 0,\ i = 1, \ldots, n \right\}.$$
According to the definition of ReLU functions, it directly shows that the output set is $\mathcal{Y}_{2,s} = \{0\}$ whenever this set is nonempty.
Finally, we consider the mixed indicator vectors $v_k$, $k \in \{2, \ldots, 2^n - 1\}$; the corresponding input set can be expressed by
$$\mathcal{X}_{k,s} = \left\{ x \mid H_s x \le b_s,\ x_i > 0 \text{ if } [v_k]_i = 1,\ x_i \le 0 \text{ if } [v_k]_i = 0 \right\}.$$
Adding the additional constraint that $y \ge 0$, the set $\mathcal{Y}_{3,k,s}$ is expressed as
$$\mathcal{Y}_{3,k,s} = \left\{ y \mid y = M_k x,\ x \in \mathcal{X}_{k,s},\ y \ge 0 \right\}$$
with $M_k = \mathrm{diag}(v_k)$, and $\mathcal{Y}_{3,s} = \bigcup_{k=2}^{2^n - 1} \mathcal{Y}_{3,k,s}$.
Thus, based on Theorem 1, the output reachable set for input set $\mathcal{X}_s$ is $\mathcal{Y}_s = \mathcal{Y}_{1,s} \cup \mathcal{Y}_{2,s} \cup \mathcal{Y}_{3,s}$. Moreover, for input set $\mathcal{X} = \bigcup_{s=1}^{N} \mathcal{X}_s$, the output reachable set is $\mathcal{Y} = \bigcup_{s=1}^{N} \mathcal{Y}_s$, which implies that (15) holds.
According to the result in Theorem 2, the following useful corollary can be derived before ending this subsection.
Corollary 1
Given a ReLU function , if the input set is a union of polyhedra, then the output reachable set is also a union of polyhedra.
Proof. By Theorem 2, the sets $\mathcal{Y}_{1,s}$, $\mathcal{Y}_{2,s}$, and $\mathcal{Y}_{3,k,s}$, $s = 1, \ldots, N$, $k \in \{2, \ldots, 2^n - 1\}$, are all defined as polyhedra when the input set is a union of polyhedra; thus $\mathcal{Y}$ is a union of polyhedra. The proof is complete.
4 Reach Set Computation and Verification for ReLU Neural Networks
Based on the reachability analysis for ReLU activation functions in the previous section, we are ready for the main problems of this paper, the reachable set computation and verification problems for ReLU neural networks. First, ReLU neural networks can be expressed recursively in a layer-by-layer form as
$$\mathcal{Y}^{(l)} = \left\{ \mathrm{relu}\left( W^{(l)} x^{(l)} + \theta^{(l)} \right) \mid x^{(l)} \in \mathcal{X}^{(l)} \right\},\quad \mathcal{X}^{(l+1)} = \mathcal{Y}^{(l)},\quad l = 1, \ldots, L \quad (16)$$
where $\mathcal{X}^{(1)} = \mathcal{X}$ is the input set defined by a union of polyhedra as in (7). The input set and output set of layer $l$ are denoted as $\mathcal{X}^{(l)}$ and $\mathcal{Y}^{(l)}$, respectively.
Lemma 1
Consider neural network (16) with input set $\mathcal{X}$ given by (7); the output sets $\mathcal{Y}^{(l)}$, $l = 1, \ldots, L$, are all unions of polyhedra.
Proof. Consider the input set $\mathcal{X}^{(l)}$ of layer $l$, which is a union of polyhedra; the following set, an affine map of $\mathcal{X}^{(l)}$, must also be a union of polyhedra:
$$\left\{ W^{(l)} x + \theta^{(l)} \mid x \in \mathcal{X}^{(l)} \right\}.$$
Then, using Corollary 1, the output reachable set $\mathcal{Y}^{(l)}$ is a union of polyhedra.
Also from (16), $\mathcal{X}^{(l+1)} = \mathcal{Y}^{(l)}$, so the input set of the next layer is again a union of polyhedra.
The above procedure can be iterated from $l = 1$ to $l = L$ to claim that $\mathcal{Y}^{(l)}$, $l = 1, \ldots, L$, are all defined by unions of polyhedra.
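The layer-by-layer structure behind Lemma 1 can also be mimicked numerically: the sketch below pushes a finite sample of the input set through the layers. This only under-approximates the reachable set, unlike the exact polyhedral computation developed here; it is an illustration, not the paper's algorithm, and all names are ours:

```python
import numpy as np

def propagate_samples(samples, layers):
    """Push a finite sample of the input set through each ReLU layer in turn.

    Returns the per-layer output samples, mirroring the sequence Y^(1), ..., Y^(L).
    A sample-based pass only under-approximates the true reachable set.
    """
    outputs = []
    x = samples  # shape: (num_points, input_dim)
    for W, theta in layers:
        x = np.maximum(0.0, x @ W.T + theta)  # affine map, then elementwise ReLU
        outputs.append(x)
    return outputs

rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(1000, 2))   # sample the box [-1, 1]^2
layers = [(np.array([[1.0, 1.0], [1.0, -1.0]]), np.zeros(2))]
outs = propagate_samples(samples, layers)
```

Every propagated point is nonnegative, as expected after a ReLU layer, and plotting `outs[-1]` would visualize (a subset of) the nonconvex output region.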
By Lemma 1, the output set of each layer is a union of polyhedra, and due to $\mathcal{X}^{(l+1)} = \mathcal{Y}^{(l)}$, the input set of each layer can be represented as $\mathcal{X}^{(l)} = \bigcup_{s=1}^{N_l} \mathcal{X}_s^{(l)}$, in which $\mathcal{X}_s^{(l)}$ is
$$\mathcal{X}_s^{(l)} = \left\{ x \mid H_s^{(l)} x \le b_s^{(l)} \right\}. \quad (17)$$
With regard to the input set of layer $l$ described by (17), the output reachable set can be obtained by the following theorem, which is the main result of this paper.
Theorem 3
Consider layer $l$ of ReLU neural network (16) with input set $\mathcal{X}^{(l)}$ defined by (17). Its output reachable set is
$$\mathcal{Y}^{(l)} = \bigcup_{s=1}^{N_l} \left( \mathcal{Y}_{1,s}^{(l)} \cup \mathcal{Y}_{2,s}^{(l)} \cup \mathcal{Y}_{3,s}^{(l)} \right) \quad (18)$$
where $\mathcal{Y}_{1,s}^{(l)}$, $\mathcal{Y}_{2,s}^{(l)}$, $\mathcal{Y}_{3,s}^{(l)}$ are obtained as in Theorem 2, applied to the affine image $W^{(l)} x + \theta^{(l)}$ of $\mathcal{X}_s^{(l)}$, and are made explicit in the proof.
Proof. The proof is presented briefly, since it follows the same lines as the proof of Theorem 2. Write $z = W^{(l)} x + \theta^{(l)}$ for $x \in \mathcal{X}_s^{(l)}$, so that $y = \mathrm{relu}(z)$.
As in the proof of Theorem 2, when all the elements of $z$ are positive, combining $H_s^{(l)} x \le b_s^{(l)}$ with $W^{(l)} x + \theta^{(l)} > 0$ defines a set whose image under the affine map $W^{(l)} x + \theta^{(l)}$ is exactly the output reachable set $\mathcal{Y}_{1,s}^{(l)}$.
Similarly, if all the elements of $z$ are nonpositive, we have the set defined by $H_s^{(l)} x \le b_s^{(l)}$ and $W^{(l)} x + \theta^{(l)} \le 0$; using the ReLU function, we get $\mathcal{Y}_{2,s}^{(l)} = \{0\}$ whenever this set is nonempty.
Lastly, we consider the case where $z$ has both positive and nonpositive elements, that is the mixed indicator vectors $v_k$; the corresponding input set is expressed by the constraints $H_s^{(l)} x \le b_s^{(l)}$ together with the sign constraints on $W^{(l)} x + \theta^{(l)}$ induced by $v_k$. Furthermore, due to $y = M_k \left( W^{(l)} x + \theta^{(l)} \right)$ with $M_k = \mathrm{diag}(v_k)$, the additional constraint $y \ge 0$ should be added to obtain $\mathcal{Y}_{3,s}^{(l)}$ as the union of these pieces over $k$.
Then, following the guidelines of Theorem 2, the output reachable set of the form (18) can be established.
As for linear activations, which are commonly used in the output layer, the output reachable set can be computed in the same manner as for ReLU, but without the constraint $y \ge 0$. The following corollary is given for linear layers.
Corollary 2
Consider a linear layer $l$ with input set $\mathcal{X}^{(l)}$ defined by (17); the output reachable set of the linear layer is
$$\mathcal{Y}^{(l)} = \bigcup_{s=1}^{N_l} \mathcal{Y}_s^{(l)} \quad (19)$$
where $\mathcal{Y}_s^{(l)} = \left\{ y \mid y = W^{(l)} x + \theta^{(l)},\ x \in \mathcal{X}_s^{(l)} \right\}$.
Proof. For an input $x \in \mathcal{X}_s^{(l)}$, the linear relation $y = W^{(l)} x + \theta^{(l)}$ implies that the output reachable set is $\mathcal{Y}_s^{(l)}$. Moreover, because $\mathcal{X}^{(l)} = \bigcup_{s=1}^{N_l} \mathcal{X}_s^{(l)}$, it directly leads to (19).
With the output reachable set computation results for both ReLU layers and linear layers, we are now ready to present the output reachable set computation results along with safety verification results summarized as functions OutputReLU, OutputReLUNetwork and VeriReLUNetwork presented in Algorithms 1, 2 and 3, respectively.
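The verification step ultimately reduces to emptiness checks between the computed output polyhedra and the unsafe regions. A minimal sketch of one such check via a feasibility LP (the helper name `polyhedra_intersect` and the example sets are ours):

```python
import numpy as np
from scipy.optimize import linprog

def polyhedra_intersect(H1, b1, H2, b2):
    """True iff {y | H1 y <= b1} and {y | H2 y <= b2} have a common point."""
    H = np.vstack([H1, H2])
    b = np.concatenate([b1, b2])
    n = H.shape[1]
    # Feasibility LP: zero objective, stacked constraints of both polyhedra.
    res = linprog(np.zeros(n), A_ub=H, b_ub=b, bounds=[(None, None)] * n)
    return res.status == 0  # status 0 means a feasible point was found

# Reachable-set piece: unit box [0,1]^2; unsafe region: half-plane y1 >= 2.
H1 = np.vstack([np.eye(2), -np.eye(2)])
b1 = np.array([1.0, 1.0, 0.0, 0.0])
H2 = np.array([[-1.0, 0.0]])
b2 = np.array([-2.0])
safe = not polyhedra_intersect(H1, b1, H2, b2)
print(safe)  # True: the box never reaches y1 >= 2
```

Applying this check to every polyhedron in the output reachable set against every unsafe polyhedron yields a sound safe/unsafe verdict, since the reachable set computed here is exact.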