Towards Safety Verification of Direct Perception Neural Networks

04/09/2019 ∙ by Chih-Hong Cheng, et al. ∙ Audi ∙ fortiss

We study the problem of safety verification of direct perception neural networks, which take camera images as inputs and produce high-level features for autonomous vehicles to make control decisions. Formal verification of direct perception neural networks is extremely challenging, as it is difficult to formulate the specification that requires characterizing input conditions, while the number of neurons in such a network can reach millions. We approach the specification problem by learning an input property characterizer which carefully extends a direct perception neural network at close-to-output layers, and address the scalability problem by only analyzing networks starting from shared neurons without losing soundness. The presented workflow is used to understand a direct perception neural network (developed by Audi) which computes the next waypoint and orientation for autonomous vehicles to follow.




1 Introduction

Deep neural networks have become the de facto choice for developing visual object detection functions in automated driving. In the autonomous driving workflow, however, neural networks can also be used more extensively. An example is direct perception [2]: it trains a neural network to read high-dimensional inputs (such as images from a camera or point clouds from a lidar) and to directly produce low-dimensional information called affordances (e.g., safe maneuver regions or the next waypoint to follow), which can be used to program a controller for the autonomous vehicle. One may use direct perception as a hot standby system for a classical mediated perception system that extracts objects and identifies lane markings before affordances are produced.

In this paper, we study the safety verification problem for a neural network implementing direct perception, where the goal is to ensure that under certain input conditions, undesired output values never occur. An example is “If the road strongly bends to the right in the input image, the output of the neural network should never suggest steering strongly to the left”. Overall, the safety verification problem is fundamentally challenging due to two factors:

  • (Specification) To perform safety verification, one premise is to have the undesired property formally specified. Nevertheless, it is practically impossible to characterize input specifications from images such as “road strongly bends to the right” and represent them as constraints over input variables.

  • (Scalability) Neural networks for direct perception often take images with millions of pixels, and the internal structure of the network can have many layers. This makes any formal analysis framework fundamentally challenging.

Figure 1: High-level illustration of how to perform safety verification while tackling specification and scalability issues.

To address these issues, we present a workflow for safety verification of direct perception neural networks that considers both the specification and the scalability problem. For ease of understanding, we use Figure 1 to explain the concept. First, we address the specification problem by learning an input property characterizer network, whose input is connected to close-to-output layer neurons of the original direct perception network. In Figure 1, the input property characterizer takes the output values of neurons in one such close-to-output layer of the original direct perception network. For the previously mentioned specification, the input property characterizer outputs true if an input image has “road strongly bending to the right”. In this way, the characterization of input features is aggregated into a single output of a neural network. Subsequently, the safety verification problem is approached by asking whether it is possible for the input-characterizing network to output true while the output of the direct perception network shows undesired values. As the direct perception network and the input-characterizing network share neuron values, safety verification can be performed by analyzing only the close-to-output layers, without losing soundness. In Figure 1, safety verification only analyzes the sub-network colored gray and examines whether any assignment of the neurons at the start of that sub-network leads to undesired output. The bounds of these neurons are decided either by static analysis (which yields a sound but often overly conservative bound) or by creating an outer polyhedron that aggregates all visited neuron values. The latter method is particularly suitable when static analysis fails to prove the absence of error. However, it may lead to under-approximation, so one needs to monitor at runtime whether the computed neuron values fall outside the polyhedron. In Figure 1, the bound of such a neuron used in verification is obtained from the minimum and the maximum of all its visited values, and it shall be monitored in operation.

The rest of the paper is organized as follows. Section 2 presents the required definitions as well as the workflow for verification. Section 3 discusses extensions to a statistical setup when the input property characterizer is not perfect. Lastly, we summarize related work in Section 4 and conclude with our preliminary evaluation in Section 5.

2 Verification Workflow

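A deep neural network is comprised of $L$ layers where, operationally, the $l$-th layer for $l \in \{1, \ldots, L\}$ of the network is a function $g^{(l)} : \mathbb{R}^{d_{l-1}} \to \mathbb{R}^{d_l}$, with $d_l$ being the dimension of layer $l$. Given an input $\mathit{in} \in \mathbb{R}^{d_0}$, the output of the $l$-th layer of the neural network, $f^{(l)}(\mathit{in})$, is given by the functional composition of the $l$-th layer and the previous layers: $f^{(l)}(\mathit{in}) := g^{(l)}(g^{(l-1)}(\ldots g^{(2)}(g^{(1)}(\mathit{in}))))$.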
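The layer-wise view above can be made concrete with a minimal sketch. It assumes fully connected ReLU layers in pure Python; the helper names `dense_relu`, `forward_to`, and `forward_from` are hypothetical and not taken from the paper. The point is that running the first layers and then the remaining layers gives the same result as running the whole network, which is what later allows analysis to start from an intermediate layer.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Sequence[float]], Vector]


def dense_relu(weights: List[List[float]], biases: List[float]) -> Layer:
    """Build one layer x -> ReLU(W x + b) (an assumed layer type)."""
    def layer(x: Sequence[float]) -> Vector:
        return [
            max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]
    return layer


def forward_to(layers: List[Layer], x: Sequence[float], l: int) -> Vector:
    """Output of the l-th layer: compose the first l layers on input x."""
    out = list(x)
    for g in layers[:l]:
        out = g(out)
    return out


def forward_from(layers: List[Layer], n_l: Sequence[float], l: int) -> Vector:
    """Network output obtained by running layers l+1..L on layer-l values."""
    out = list(n_l)
    for g in layers[l:]:
        out = g(out)
    return out


# Two-layer toy network: splitting at layer 1 and recomposing agrees
# with evaluating the full network.
toy = [dense_relu([[1.0, -1.0]], [0.0]), dense_relu([[2.0]], [1.0])]
hidden = forward_to(toy, [3.0, 1.0], 1)
full = forward_to(toy, [3.0, 1.0], 2)
resumed = forward_from(toy, hidden, 1)
```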

2.1 Characterizing Input Specification from Examples

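Let $\mathit{In}_\phi \subseteq \mathbb{R}^{d_0}$ be the set of inputs of a neural network that satisfy the property $\phi$. We assume that both $\phi$ and $\mathit{In}_\phi$ are unknown (e.g., the road is bending left in an image), but there exists an oracle (e.g., a human) that can answer, for a given input $\mathit{in} \in \mathbb{R}^{d_0}$, whether $\mathit{in} \in \mathit{In}_\phi$.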

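Let $\{(\mathit{in}, \mathit{lb})\}$ be the list of training data and their associated labels (generated by the oracle) related to the input property $\phi$, where for every $(\mathit{in}, \mathit{lb})$, $\mathit{in} \in \mathbb{R}^{d_0}$, $\mathit{lb} \in \{0, 1\}$, we have $\mathit{lb} = 1$ if $\mathit{in} \in \mathit{In}_\phi$ and $\mathit{lb} = 0$ if $\mathit{in} \notin \mathit{In}_\phi$. The perfect input property characterizer extending the $l$-th layer is a function $h^l_\phi : \mathbb{R}^{d_l} \to \{0, 1\}$ which guarantees that for every $(\mathit{in}, \mathit{lb})$ in the training data, $h^l_\phi(f^{(l)}(\mathit{in})) = \mathit{lb}$. The generation of $h^l_\phi$ can be done by training a neural network as a binary classifier, with a 100% success rate on the training data. The following assumption states that as long as the function $h^l_\phi$ performs perfectly on the training data, it will also generalize perfectly to the complete input space. In other words, we can use $h^l_\phi$ to characterize $\mathit{In}_\phi$.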

Assumption 1 (Perfect Generalization)

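Assume that the characterizer $h^l_\phi$ also perfectly characterizes $\mathit{In}_\phi$, i.e., $\forall\, \mathit{in} \in \mathbb{R}^{d_0}$: $h^l_\phi(f^{(l)}(\mathit{in})) = 1$ iff $\mathit{in} \in \mathit{In}_\phi$.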
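As an illustration of how a characterizer could be obtained from oracle-labeled examples, the following minimal sketch trains a classifier on neuron values taken from a close-to-output layer and checks that it is perfect on the training data. It substitutes a single perceptron for the neural network classifier described above, purely to stay dependency-free; the function names and the toy feature vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def characterize(w: Sequence[float], b: float, n_l: Sequence[float]) -> int:
    """Characterizer decision (0 or 1) on layer-l neuron values n_l."""
    return 1 if sum(wi * xi for wi, xi in zip(w, n_l)) + b > 0 else 0


def train_characterizer(
    features: List[List[float]], labels: List[int], epochs: int = 100
) -> Tuple[List[float], float]:
    """Perceptron training on (layer-l values, oracle label) pairs.

    A stand-in for the binary classifier in the text; it only reaches
    zero training error when the features are linearly separable.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, lb in zip(features, labels):
            if characterize(w, b, x) != lb:
                step = 1.0 if lb == 1 else -1.0
                w = [wi + step * xi for wi, xi in zip(w, x)]
                b += step
                mistakes += 1
        if mistakes == 0:  # perfect on the training data
            break
    return w, b


def is_perfect_on_training(w, b, features, labels) -> bool:
    """Check the 'perfect training' condition on every sample."""
    return all(characterize(w, b, x) == lb for x, lb in zip(features, labels))


# Hypothetical layer-l values with oracle labels (1 = property holds).
feats = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labs = [1, 1, 0, 0]
weights, bias = train_characterizer(feats, labs)
```

Perfect accuracy on the training data is checkable, as shown; perfect generalization to the whole input space (Assumption 1) is not, which is why it is stated as an assumption.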

Definition 1 (Safety Verification)

The safety verification problem asks if there exists an input $\mathit{in} \in \mathit{In}_\phi$ such that $f^{(L)}(\mathit{in})$ satisfies $\rho$, where the risk condition $\rho$ is a conjunction of linear inequalities over the output of the neural network. If no such input $\mathit{in}$ exists, we say that the neural network is safe under the input constraint $\phi$ and the output risk constraint $\rho$.

When Assumption 1 holds, for safety verification it is equivalent to ask whether there exists an input $\mathit{in} \in \mathbb{R}^{d_0}$ such that $h^l_\phi(f^{(l)}(\mathit{in})) = 1$ and $f^{(L)}(\mathit{in})$ satisfies $\rho$. From now on, unless explicitly specified, we consider only situations where Assumption 1 holds.

2.2 Practical Safety Verification

Abstraction by omitting neurons before the $l$-th layer.

The following result states that one can retain soundness for safety verification by considering all possible neuron values that can appear in the $l$-th layer.

Lemma 1 (Verification by Layer Abstraction)

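If there exists no $n_l \in \mathbb{R}^{d_l}$ such that $g^{(L)}(g^{(L-1)}(\ldots g^{(l+1)}(n_l)))$ satisfies $\rho$ and $h^l_\phi(n_l) = 1$, then the neural network is safe under input constraint $\phi$ and output risk constraint $\rho$.

Proof. The lemma holds because for every input $\mathit{in} \in \mathbb{R}^{d_0}$ of the network, $f^{(l)}(\mathit{in}) \in \mathbb{R}^{d_l}$. Hence any input that witnesses a violation would yield a violating value $n_l = f^{(l)}(\mathit{in})$. ∎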

Obviously, the use of $\mathbb{R}^{d_l}$ in Lemma 1 is overly conservative, and we can strengthen Lemma 1 without losing soundness if we find a set $S \subseteq \mathbb{R}^{d_l}$ which guarantees that $f^{(l)}(\mathit{in}) \in S$ for every input $\mathit{in} \in \mathbb{R}^{d_0}$ of the network. Obtaining such a set $S$ can be achieved by abstract interpretation techniques [6, 20], which perform symbolic reasoning over the neural network in a layer-wise manner.

Lemma 2 (Abstraction via Input Over-approximation)

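Let $S \subseteq \mathbb{R}^{d_l}$ guarantee that $f^{(l)}(\mathit{in}) \in S$ for every input $\mathit{in} \in \mathbb{R}^{d_0}$ of the network. If there exists no $n_l \in S$ such that $g^{(L)}(g^{(L-1)}(\ldots g^{(l+1)}(n_l)))$ satisfies $\rho$ and $h^l_\phi(n_l) = 1$, then the neural network is safe under input constraint $\phi$ and output risk constraint $\rho$.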
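To illustrate how a bounded set of neuron values at an intermediate layer can be used to rule out a risk condition, the sketch below propagates a box (per-neuron interval) through the remaining layers and tests whether the risk condition can be met. It makes several assumptions not taken from the paper: the set is a box, the remaining layers are affine maps optionally followed by ReLU, and the risk condition is a list of inequalities of the form `coeffs · y <= bound`. It also ignores the characterizer constraint, which only makes the check more conservative. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[List[float], List[float]]  # (lower bounds, upper bounds)


def affine_bounds(lo, hi, weights, biases) -> Box:
    """Sound interval bounds of W x + b for x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, biases):
        out_lo.append(b + sum(w * (l if w >= 0 else h)
                              for w, l, h in zip(row, lo, hi)))
        out_hi.append(b + sum(w * (h if w >= 0 else l)
                              for w, l, h in zip(row, lo, hi)))
    return out_lo, out_hi


def tail_output_bounds(lo, hi, tail) -> Box:
    """Propagate a box through the remaining layers.

    tail is a list of (weights, biases, apply_relu) triples.
    """
    for weights, biases, apply_relu in tail:
        lo, hi = affine_bounds(lo, hi, weights, biases)
        if apply_relu:
            lo, hi = [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]
    return lo, hi


def risk_excluded(out_lo, out_hi, risk) -> bool:
    """True if the conjunction of inequalities cannot hold in the box.

    risk is a list of (coeffs, bound) meaning coeffs . y <= bound.
    One unsatisfiable inequality makes the whole conjunction
    unsatisfiable. False means 'inconclusive', not 'unsafe'.
    """
    for coeffs, bound in risk:
        smallest = sum(c * (l if c >= 0 else h)
                       for c, l, h in zip(coeffs, out_lo, out_hi))
        if smallest > bound:
            return True
    return False


# Toy tail network and a box for the two neurons where analysis starts.
tail = [([[1.0, 1.0]], [0.0], True), ([[1.0]], [-1.0], False)]
out_lo, out_hi = tail_output_bounds([0.0, 0.0], [1.0, 1.0], tail)
```

A box check of this kind is sound but incomplete; a `False` answer would call for a more precise method, such as the MILP encoding used in the evaluation.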

Assume-guarantee Verification via Monitoring.

If the computed $S$, due to over-approximation, is too coarse to prove safety, one practical alternative is to generate a set $S' \subseteq \mathbb{R}^{d_l}$ which only guarantees $f^{(l)}(\mathit{in}) \in S'$ for every input $\mathit{in}$ in the training data. In other words, $S'$ over-approximates the neuron values computed from the samples in the training data.

If using $S'$ is sufficient to prove safety and if, for any input $\mathit{in}$, checking whether $f^{(l)}(\mathit{in}) \in S'$ can be computed efficiently, one can conditionally accept the proof by designing a run-time monitor which raises a warning whenever the assumption used in the proof is violated, i.e., whenever $f^{(l)}(\mathit{in}) \notin S'$. Admittedly, $S'$ can be an under-approximation of $\{f^{(l)}(\mathit{in}) \mid \mathit{in} \in \mathbb{R}^{d_0}\}$, but in practice creating an over-approximation based only on the training data is useful and can avoid unstructured input such as noise, which is allowed when using $S$.
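A minimal sketch of the monitoring idea follows, assuming the data-derived set is a box formed by the per-neuron minimum and maximum over the training samples (the simplest instance of an outer polyhedron). The function names and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[List[float], List[float]]


def build_box(visited: List[List[float]]) -> Box:
    """Per-neuron [min, max] over all layer-l values seen in training."""
    lo = [min(column) for column in zip(*visited)]
    hi = [max(column) for column in zip(*visited)]
    return lo, hi


def in_box(box: Box, n_l: Sequence[float]) -> bool:
    """Membership test; linear in the number of monitored neurons."""
    lo, hi = box
    return all(l <= v <= h for l, v, h in zip(lo, n_l, hi))


def monitor(box: Box, n_l: Sequence[float]) -> bool:
    """Return True when a warning should be raised at run time.

    A warning means the assumption behind the conditional proof
    (neuron values stay inside the box) does not hold for this input.
    """
    return not in_box(box, n_l)


# Layer-l values recorded for three hypothetical training images.
box = build_box([[0.1, 2.0], [0.4, 1.5], [0.3, 3.0]])
warn_inside = monitor(box, [0.2, 2.5])   # within the box, no warning
warn_outside = monitor(box, [0.5, 2.0])  # first neuron exceeds its max
```

The box keeps the run-time check cheap; a tighter polyhedron would give a stronger proof assumption at the cost of a more expensive membership test.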

3 Towards Statistical Reasoning

The results in Section 2 are based on two assumptions of perfection, namely

  • (perfect training) the input property characterizer perfectly decides whether property $\phi$ holds, for each sample in the training data, and

  • (perfect generalization) the input property characterizer generalizes its decision (whether property $\phi$ holds) also perfectly to every data point in the complete input space.

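One important question arises when the above two assumptions do not hold, meaning that the input property characterizer may make mistakes. By considering all four possibilities in Table 1, one realizes that even when a safety proof is established by considering all inputs where $h^l_\phi(f^{(l)}(\mathit{in})) = 1$, there exists a probability $\gamma$ that an input $\mathit{in}$ should be analyzed but is omitted in the proof process due to $h^l_\phi$ being 0 (i.e., $h^l_\phi(f^{(l)}(\mathit{in})) = 0$ and $\mathit{in} \in \mathit{In}_\phi$). Therefore, one can only establish a statistical guarantee with probability $1 - \gamma$ over the correctness claim, provided that all data points used in training are also safe. (Note that for parts where $h^l_\phi(f^{(l)}(\mathit{in})) = 1$ and $\mathit{in} \notin \mathit{In}_\phi$, no problem occurs, as the safety analysis guarantees the desired property whenever $h^l_\phi(f^{(l)}(\mathit{in})) = 1$. Training data being safe means that for every $(\mathit{in}, \mathit{lb})$ in the training data, if $\mathit{lb} = 1$ and $h^l_\phi(f^{(l)}(\mathit{in})) = 0$, then $f^{(L)}(\mathit{in})$ does not satisfy $\rho$.)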

Table 1: Probability by considering all possible cases due to decisions made by the input characterizer (whether $h^l_\phi(f^{(l)}(\mathit{in})) = 1$) and the ground truth (whether $\mathit{in} \in \mathit{In}_\phi$).
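The four cases can be tallied empirically on held-out labeled data, as in the minimal sketch below. It assumes that frequencies observed on such a sample are an acceptable estimate of the underlying probabilities, which the text above does not claim; the function names and the sample decisions are hypothetical. The quantity of interest is the share of inputs for which the characterizer outputs 0 although the property actually holds, since these inputs are skipped by the proof.

```python
from typing import Dict, List, Tuple


def confusion_rates(decisions: List[int],
                    truths: List[int]) -> Dict[Tuple[int, int], float]:
    """Empirical frequency of each (characterizer output, ground truth) case."""
    counts = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for d, t in zip(decisions, truths):
        counts[(d, t)] += 1
    n = len(decisions)
    return {case: c / n for case, c in counts.items()}


def guarantee_level(decisions: List[int], truths: List[int]) -> float:
    """One minus the estimated rate of inputs missed by the proof.

    Missed inputs are those with characterizer output 0 and ground truth 1.
    Case (1, 0) is harmless: those inputs are still covered by the proof.
    """
    return 1.0 - confusion_rates(decisions, truths)[(0, 1)]


# Ten hypothetical held-out samples: one missed input, one harmless error.
h_out = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
oracle = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
level = guarantee_level(h_out, oracle)
```

With few samples such an estimate is noisy, so a confidence interval around the missed-input rate would be needed before quoting a guarantee level.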

4 Related Work

Formal verification of neural networks has drawn considerable attention, with many results available [13, 8, 3, 5, 9, 11, 6, 4, 1, 14, 19, 20]. Although specifications used in formal verification of neural networks are discussed in recent reviews [7, 15], the specification problem over images is not addressed, so research results largely use inherent properties of a neural network such as local robustness (as output invariance) or output ranges, where one does not need to characterize properties of desired inputs. Not being able to properly characterize input conditions (one possibility is to simply consider every input to be bounded by its admissible value range) makes it difficult for formal static analysis to achieve any useful results on direct perception networks, regardless of the type of abstract domain being used (box, octagon, or zonotope). Lastly, our work is motivated by zero-shot learning [12], which trains additional features apart from a standard neural network. The feature detector is commonly created by extending the network from close-to-output layers.

5 Evaluation and Concluding Remarks

We have applied this methodology to examine a direct perception neural network developed by Audi. The network acts as a hot standby system and computes the next waypoint and orientation for autonomous vehicles to follow. As the close-to-output layers of the network are either ReLU or Batch Normalization, and as $\rho$ is a conjunction of linear constraints over the output, it is feasible to use MILP-based approaches [3] as the underlying verification method. We developed a variation of nn-dependability-kit to read models from TensorFlow and to perform formal verification. Using assume-guarantee based techniques that take an over-approximation from neuron values produced by the training data (the data is taken from a particular segment of the German A9 highway, considering variations such as weather and the current lane), it is possible to conditionally prove some properties such as “impossibility to suggest steering to the far left, when the road image is bending to the right”. However, under the current setup, it is still impossible to prove intriguing properties such as “impossibility to suggest steering straight, when the road image is bending to the right”. We suspect that the main reason is the inherent limitation of the neural network under analysis.

In our experiment, we also found that for some input properties, such as traffic participants in adjacent lanes, it is very difficult to construct the corresponding input property characterizers by taking neuron values from close-to-output layers (i.e., the trained classifier acts almost like a fair coin flip). Based on the theory of the information bottleneck for neural networks [18, 16], a neural network from high-dimensional input to low-dimensional output naturally eliminates unrelated information in close-to-output layers. Therefore, the input property can be unrelated to the output of the network. Although we are unable to prove that the output of the network is safe under these input constraints, it should be possible to construct a counterexample either by capturing more data or by using adversarial perturbation techniques [17, 10].

Overall, our initial result demonstrates the potential of using formal methods even on very complex neural networks, while it provides a clear path for engineers to resolve the problem of how to characterize input conditions for verification (by also applying machine learning techniques). Our approach of looking at close-to-output layers can be viewed as an abstraction which can, in future work, lead to layer-wise incremental abstraction-refinement techniques. Although our practical motivation is to verify direct perception networks, the presented technique is equally applicable to any deep network for vision systems where input conditions are hard to characterize. It opens a new research direction of using learning to assist practical verification of learning systems.

5.0.1 Acknowledgement

The research work is supported by the following projects: “Audi Verifiable AI” from Audi AG, Germany and “Dependable AI for automotive systems” from DENSO Corporation, Japan.