1 Introduction
Autonomy is increasingly prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, that require strong safety assurance guarantees. However, this is difficult to achieve, since autonomous systems are large, complex systems that operate in uncertain environment conditions and often use data-driven, machine-learning algorithms. Machine-learning techniques such as deep neural networks (DNNs), widely used today, are inherently unpredictable and lack the theoretical foundations to provide the assurance guarantees needed by safety-critical applications. Current assurance approaches involve design and testing procedures that are expensive and inadequate, as they have been developed mostly for human-in-the-loop systems and do not apply to systems with advanced autonomy.
We propose a compositional approach for the scalable verification of learning-enabled autonomous systems to achieve design-time assurance guarantees. The approach is illustrated in Figure 1. The input to the framework is the design model of an autonomous system (this could be given as, e.g., Simulink/Stateflow models or a prototype implementation). As the verification of the system as a whole is likely intractable, we advocate the use of compositional assume-guarantee verification, whereby formally defined contracts allow the designer to model and reason about learning-enabled components working side by side with the other components in the system. These contracts encode the properties guaranteed by the component and the environment assumptions under which these guarantees hold. The framework then uses compositional reasoning to decompose the verification of large systems into the more manageable verification of individual components, which are formally checked against their respective assume-guarantee contracts. The approach enables separate component verification with specialized tools (e.g. one can use software model checking for a discrete-time controller but hybrid model checking for the plant component in an autonomous system) and seamless integration of DNN analysis results.
For DNN analysis, we propose to use clustering techniques to automatically discover safe regions where the networks behave in a predictable way. The evidence obtained from this analysis is conditional, subject to constraints defined by the safe regions, and is encoded in the assume-guarantee contracts. The contracts allow us to relate the DNN behavior to the validity of the system-level requirements, using compositional model checking. We illustrate the approach on an example of an autonomous vehicle that uses a DNN in the perception module.
2 Compositional Verification
Formal methods provide a rigorous way of obtaining strong assurance guarantees for computing systems. There are several challenges to formally modeling and verifying autonomous systems. Firstly, such systems comprise many heterogeneous components, each with different implementations and requirements, which are best addressed with different verification models and techniques. Secondly, the state space of such systems is very large. Suppose we could model all the components of such a system as formally specified (hybrid) models; even ignoring the learning aspect, their composition would likely be intractable. The DNN components make the scalability problem even more serious: for example, the feature space of RGB 1000x600px pictures for an image classifier used in the perception module of an autonomous vehicle contains 256^(1000x600x3) elements. Last but not least, it is not clear how to formally reason about the DNN components, as there is no clear consensus in the research community on a formal definition of correctness for the underlying machine-learning algorithms.
We propose a compositional assume-guarantee verification approach for the scalable verification of autonomous systems where DNN components work side by side with the other components. Compositional verification frameworks have been proposed before to improve the reliability and predictability of CPS [1, 17, 4, 5], but none of these works address systems that include DNN components. Recent work [6] proposes a compositional framework for the analysis of autonomous systems with DNN components. However, that approach addresses falsification in such systems and, while that is very useful for debugging, it is not clear how it can be used to provide assurance guarantees.
Assume-guarantee reasoning attempts to break up the verification of a large system into the local verification of individual components, using assumptions about the rest of the system. The simplest assume-guarantee rule first checks that a component M1 satisfies a property P under an assumption A (this can be written as ⟨A⟩ M1 ⟨P⟩). If the "environment" M2 of M1 (i.e., the rest of the system in which M1 operates) satisfies A (written as ⟨true⟩ M2 ⟨A⟩), then we can prove that the whole system composed of M1 and M2 satisfies P. Thus we can decompose the global property P into two local assume-guarantee properties (i.e., contracts), ⟨A⟩ M1 ⟨P⟩ and ⟨true⟩ M2 ⟨A⟩, that are expected to hold on M1 and M2, respectively. Other, more involved, rules allow reasoning about the circular dependencies between components, where the assumption for one component is used as the guarantee of the other component and vice versa; if the conjunction of the assumptions implies the specification, then the overall system guarantees the system-level requirement. Rules that involve circular reasoning use inductive arguments, over time, formulas to be checked, or both, to ensure soundness. Furthermore, the rules can be naturally generalized to reasoning about more than two components and to different notions of property satisfaction, such as trace inclusion or refinement checking.
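As a toy illustration of the rule (not part of the framework itself), components can be modeled as finite sets of traces and the two premises and the conclusion checked by enumeration; all names and the trace encoding below are hypothetical:

```python
# Toy illustration of the simplest assume-guarantee rule. A trace of the
# controller M1 is a tuple of (signal, action) pairs; a trace of the
# environment M2 is a tuple of signals.

def satisfies(traces, prop):
    """<true> M <prop>: every trace of M satisfies prop."""
    return all(prop(t) for t in traces)

def satisfies_under(traces, assumption, prop):
    """<A> M <P>: every trace of M allowed by A satisfies P."""
    return all(prop(t) for t in traces if assumption(t))

def compose(m1, m2):
    """M1 || M2: keep M1 traces whose signal projection M2 can produce."""
    return {t for t in m1 if tuple(sig for sig, _ in t) in m2}

# M1 stops on 'red' but misbehaves after the unexpected signal 'blue',
# so it satisfies P only under assumption A.
m1 = {
    (('red', 'stop'), ('green', 'go')),
    (('green', 'go'), ('red', 'stop')),
    (('blue', 'go'), ('red', 'go')),   # violates P, outside A
}
m2 = {('red', 'green'), ('green', 'red')}  # never emits 'blue'

A = lambda t: all(sig in ('red', 'green') for sig, _ in t)
P = lambda t: all(act == 'stop' for sig, act in t if sig == 'red')
A_on_signals = lambda sigs: all(s in ('red', 'green') for s in sigs)

assert satisfies_under(m1, A, P)       # premise 1: <A> M1 <P>
assert satisfies(m2, A_on_signals)     # premise 2: <true> M2 <A>
assert satisfies(compose(m1, m2), P)   # conclusion: <true> M1 || M2 <P>
```

Note that M1 alone does not satisfy P (the 'blue' trace violates it); the assumption discharged by M2 is what makes the conclusion sound.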
The main challenge with assume-guarantee reasoning techniques is to come up with assumptions and guarantees that can be suitably used in the assume-guarantee rules. This is typically a difficult, manual process. Progress has been made on automating assume-guarantee reasoning using learning and abstraction-refinement techniques for the iterative building of the necessary assumptions [19]. The original work was done in the context of systems expressed as finite-state automata, but progress has since been made on automated compositional verification for probabilistic and hybrid systems [14, 2], which can be used to model autonomous systems.
Assume-guarantee reasoning can be used for the verification of autonomous systems either by replacing a component with its assume-guarantee specification in the compositional proofs or by using an assume-guarantee rule such as the one above to decompose the verification of the system into the verification of its components. Furthermore, the assume-guarantee specifications can be used to drive component-based testing and runtime monitoring in the cases where design-time formal analysis is not possible, either because the components are too large or because they are adaptive, i.e. the component behavior changes at runtime (using e.g. reinforcement learning).
3 Analysis for Deep Neural Network Components
Deep neural networks (DNNs) are computing systems inspired by the biological neural networks that constitute animal brains. They consist of neurons (i.e. computational units) organized in many layers. These systems are capable of learning various tasks from labeled examples without requiring task-specific programming. DNNs have achieved impressive results in computer vision, autonomous transport, speech recognition, social network filtering, bioinformatics and many other domains, and there is increased interest in using them in safety-critical applications that require strong assurance guarantees. However, it is difficult to provide such guarantees, since it is known that these networks can be easily fooled by adversarial perturbations: minimal changes to correctly-classified inputs that cause the network to misclassify them. For instance, in image-recognition networks it is possible to add a small amount of noise (undetectable by the human eye) to an image and change how it is classified by the network.
This phenomenon represents a safety concern, but it is currently unclear how to measure a network's robustness against it. To date, researchers have mostly focused on efficiently finding adversarial perturbations around select individual input points. The goal is to find an input x' as close as possible to a known input x such that x and x' are labeled differently. Finding the optimal solution for this problem is computationally difficult, and so various approximation approaches have been proposed. Some approaches are gradient based [20, 8, 7], whereas others use optimization techniques [3]. These approaches have successfully demonstrated the weaknesses of many state-of-the-art networks; however, they operate on individual input points, and it is unclear how to apply them to large input domains, unless one does a brute-force enumeration of all input values, which is infeasible for most practical purposes. Furthermore, because they are inherently incomplete, these techniques cannot provide guarantees even around the few selected individual points. Recent approaches tackle neural network verification [10, 13] by casting it as an SMT solving problem. Still, these techniques operate best when applied to individual points, and they do not have a well-defined rationale for selecting meaningful regions around inputs within which the network is expected to behave consistently.
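To make the point-wise search concrete, here is a minimal sketch of a naive random-search attack around a single input. The two-dimensional "network" is a made-up stand-in, not one of the cited methods, and its incompleteness (failure to find a perturbation proves nothing) is exactly the weakness noted above:

```python
import random

# Hypothetical stand-in "network": classifies a 2-D input by the sign of
# a slightly nonlinear score, just to make the search meaningful.
def net(x):
    score = x[0] - x[1] + 0.3 * (x[0] * x[1])
    return 'A' if score >= 0 else 'B'

def find_adversarial(x, eps, tries=10000, seed=0):
    """Randomly sample the L_inf eps-ball around x looking for an input
    that the network labels differently. Returns the perturbed input, or
    None if the (incomplete) search fails."""
    rng = random.Random(seed)
    base = net(x)
    for _ in range(tries):
        xp = [xi + rng.uniform(-eps, eps) for xi in x]
        if net(xp) != base:
            return xp
    return None
```

A point near the decision boundary, such as (0.05, 0.0), readily yields an adversarial perturbation for eps = 0.1, while a point far from the boundary, such as (2.0, 0.0), does not; the latter outcome carries no guarantee.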
In [9], we developed a DNN analysis to automatically discover input regions that are likely to be robust to adversarial perturbations, i.e. to have the same true label, akin to finding likely invariants in program analysis. The technique takes inputs with known true labels from the training set and iteratively applies a clustering algorithm [12] to obtain small groups of inputs that are close to each other (with respect to different distance metrics) and share the same true label. Each cluster defines a region in the input space (characterized by the centroid and radius of the cluster). Our hypothesis is that, for regions formed from dense clusters, the DNN is well-trained, and we expect that all the other inputs in the region (not just the training inputs) should have the same true label. We formulate this as a safety check and verify it using off-the-shelf solvers such as Reluplex [13]. If a region is found to be safe, we provide guarantees w.r.t. all points within that region, not just for individual points as in previous techniques.
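A minimal sketch of the region-discovery step, using a naive distance-chaining clusterer as a stand-in for the k-means variant of [12]; all names and thresholds are illustrative:

```python
from collections import defaultdict
import math

def cluster_regions(points, labels, link=0.5, min_size=3):
    """Group same-label points that are chained together by pairwise
    distances <= link, then return candidate safe regions as
    (centroid, radius, label) triples; clusters smaller than min_size
    are discarded as not dense enough to trust."""
    by_label = defaultdict(list)
    for p, l in zip(points, labels):
        by_label[l].append(p)
    regions = []
    for label, pts in by_label.items():
        unseen = set(range(len(pts)))
        while unseen:
            stack, cluster = [unseen.pop()], []
            while stack:                       # grow one cluster by chaining
                i = stack.pop()
                cluster.append(pts[i])
                near = [j for j in unseen if math.dist(pts[i], pts[j]) <= link]
                for j in near:
                    unseen.remove(j)
                    stack.append(j)
            if len(cluster) < min_size:
                continue
            centroid = tuple(sum(c) / len(cluster) for c in zip(*cluster))
            radius = max(math.dist(centroid, p) for p in cluster)
            regions.append((centroid, radius, label))
    return regions
```

Each returned (centroid, radius, label) triple is then a candidate region to be checked for safety by a solver such as Reluplex; only regions that pass the check become contracts.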
As the usual notion of safety might be too strong for many DNNs, we introduce the concept of targeted safety, analogous to targeted adversarial perturbations [20, 8, 7]. The verification checks targeted safety, which, given a specific incorrect label, guarantees that no input in the region is mapped by the DNN to that label. Therefore, even if the DNN is not completely robust against adversarial perturbations in that region, we give guarantees that it is safe against specific targeted attacks.
As an example, consider a DNN used for perception in an autonomous car that classifies the images of a semaphore as red, green or yellow. We may want to guarantee that the DNN will never classify the image of a green light as a red light and vice versa but it may be tolerable to misclassify a green light as yellow, while still avoiding traffic violations.
The safe regions discovered by our technique enable characterizing the input-output behavior of the network over partitions of the input space, which can be encoded in the assume-guarantee specifications for the DNN components. The regions define the conditions (assumptions), and the guarantees are that all the points within a region will be assigned the same label. The regions could be characterized as geometric shapes in Euclidean space with centroids and radii. The conditions would then be standard distance-metric constraints on the input attributes. For instance, all inputs within a given Euclidean distance from the centroid of the region would be assigned the same label by the network.
Note that the verification of even simple neural networks is an NP-complete problem and is very difficult in practice. Focusing on clusters means that verification can be applied to small input domains, making it more feasible and rendering the approach as a whole more scalable. Further, the verification of separate clusters can be done in parallel, increasing scalability even further.
In [9] we applied the technique to the MNIST dataset [16] and to a neural network implementation of a controller for the next-generation Airborne Collision Avoidance System for unmanned aircraft (ACAS Xu) [11], where we used Reluplex for the safety checks. For these networks, our approach identified multiple regions that were completely safe, as well as some that were safe only for specific labels. It also discovered adversarial examples, which were confirmed by domain experts. We discuss the ACAS Xu experiments in more detail below.
3.1 ACAS Xu case study
ACAS X is a family of collision avoidance systems for aircraft which is currently under development by the Federal Aviation Administration (FAA) [11]. ACAS Xu is the version for unmanned aircraft control. It is intended to be airborne and receive sensor information regarding the drone (the ownship) and any nearby intruder drones, and then issue horizontal turning advisories aimed at preventing collisions. The input sensor data includes:

- ρ: distance from ownship to intruder;
- θ: angle of intruder relative to ownship heading direction;
- ψ: heading angle of intruder relative to ownship heading direction;
- v_own: speed of ownship;
- v_int: speed of intruder;
- τ: time until loss of vertical separation; and
- a_prev: previous advisory.
The five possible output actions are as follows: Clear-of-Conflict (COC), Weak Right, Weak Left, Strong Right, and Strong Left. Each advisory is assigned a score, with the lowest score corresponding to the best action. The FAA is currently exploring an implementation of ACAS Xu that uses an array of 45 deep neural networks. These networks were obtained by discretizing the two parameters τ and a_prev, and so each network contains five input dimensions and treats τ and a_prev as constants. Each network has 6 hidden layers and a total of 300 hidden ReLU activation nodes. We were supplied a set of cut points, representing valid important values for each dimension, by the domain experts [11]. We generated a set of 2,662,704 inputs (the Cartesian product of the values for all the dimensions). The network was executed on these inputs and the output advisories (labels) were recorded. These were considered as the inputs with known labels for our experiments.
We were able to prove safety for 177 regions in total (125 regions where the network was completely safe against misclassification to any label and 52 regions where the network was safe against specific target labels). An example of the safety guarantee is as follows:
∀x: ‖x − x₀‖₁ ≤ 0.28 ⇒ NN(x) = COC, where x₀ = {0.19, 0.31, 0.28, 0.33, 0.33}    (1)

Here {0.19, 0.31, 0.28, 0.33, 0.33} are the normalized values of the 5 input attributes (ρ, θ, ψ, v_own, v_int) corresponding to the centroid of the region, and 0.28 is the radius. The distance is in the Manhattan distance metric (L1). The contract states that, under the condition that an input lies within a distance of 0.28 from the input vector {0.19, 0.31, 0.28, 0.33, 0.33}, the network is guaranteed to mark the action for it as COC, which is the desired output.
Our analysis also discovered adversarial examples of interest, which were validated by the developers. Fig. 2 illustrates such an example for ACAS Xu.
The safety contracts obtained with the region analysis can be used in the compositional verification of the overall autonomous system, which can be performed with standard model checkers.
4 Example
We illustrate our compositional approach on an example of an autonomous vehicle. The platform includes learning components that allow it to detect other vehicles and drive according to traffic regulations; the platform also includes reinforcement learning components to evolve and refine its behavior in order to learn how to avoid obstacles in a new environment.
We focus on a subsystem, namely an automatic emergency braking system, illustrated in Figure 3. It has three components: the BrakingSystem, the Vehicle (which, to simplify the presentation, we assume includes both the autonomous vehicle and its environment) and a perception module implemented with a DNN; there may be other sensors (radar, LIDAR, GPS) that we abstract away here for simplicity. The braking system sends signals to the vehicle to regulate acceleration and braking, based on the vehicle velocity, the distance to obstacles and traffic signals. The velocity information is provided as feedback from the plant, the distance information is obtained from sensors, while the information about traffic lights is obtained from the perception module. The perception module acts as a classifier over images captured with a camera. Such systems are already employed today in semi-autonomous vehicles, where adaptive cruise controllers or lane keeping assist systems rely on image classifiers providing input to the software controlling electrical and mechanical subsystems [6]. Suppose we want to check that the system satisfies the following safety property: the vehicle will not enter an intersection if the traffic light at the intersection turns red.
We write the system as the composition: System = BrakingSystem ‖ Vehicle ‖ NN. Each component has an interface that specifies its input and output variables (ports), and their parallel composition is formed by connecting components via ports. We write the property as follows (using Linear Temporal Logic, LTL, assuming discrete time): globally (G), if the semaphore (input image x) is red then eventually (F), within 3 seconds, the velocity v becomes 0:

P = G ((x = red) ⇒ F≤3 (v = 0))
In practice, we would also need to encode in the assumption that the distance to the traffic light is less than some threshold, but we simplify here to ease the presentation. We are thus interested in checking that the system satisfies property P, written as System ⊨ P. We decompose the system into two subsystems, M1 = BrakingSystem ‖ Vehicle and M2 = NN, and define two assume-guarantee contracts, ⟨A⟩ M1 ⟨P⟩ and ⟨true⟩ M2 ⟨A⟩, for the two subsystems. Suppose (part of) the contract for M1 is:

G ((Class = red) ⇒ F≤3 (v = 0))
The contract states that, assuming the input (Class) to the subsystem is red, the vehicle is guaranteed to stop in at most 3 time units. We can further decompose the verification of this subsystem into the separate verification of its components using additional contracts and perform component-wise verification. It remains to formally characterize the input-output behavior of the DNN in a contract that can be used in the compositional proofs. This is a difficult problem because DNNs are known to be vulnerable to adversarial perturbations [20, 15]: a small perturbation added to an image that shows a red semaphore might lead to the NN misclassifying it, e.g. as showing a green light.
To address the problem, we use clustering over the training set (see Section 3) to automatically find regions where the network is likely to be robust to adversarial perturbations. The result is a finite set of well-defined regions, where a region is characterized by a pair (c, r); c is the centroid and r is the radius of the region. We then use a verification tool (such as Reluplex) to check that the NN classifies all inputs within each region to the same label as that of the known inputs (and of c):

∀x: ‖x − c‖ ≤ r ⇒ NN(x) = NN(c)
The training data available and the amount of noise could impact the validity of the check. In such cases we may need to refine the contracts to include Bayesian estimates of uncertainty [18]. Let u(x) denote the uncertainty in the output of the NN for an input x. We can then refine the contract to check that the label is as expected and that the uncertainty level is below a threshold t, i.e. u(x) ≤ t. The DNN's safety contract could then be the union of all the constraints of this form that are proved valid.
We are now ready to perform the compositional proof: if ⟨A⟩ M1 ⟨P⟩ and ⟨true⟩ M2 ⟨A⟩, and furthermore System = M1 ‖ M2, it follows that System ⊨ P; thus we prove that the whole system satisfies the property without composing its (large) state space. This proof can be performed with standard model checkers.
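The system-level guarantee of M1 can itself be checked on finite traces of the closed loop. The following sketch assumes a simple hypothetical braking model (full braking sheds 10 speed units per step), not the actual plant dynamics:

```python
def holds_property(trace):
    """Bounded check of G((Class = red) => F_{<=3}(v = 0)) over a
    finite trace of (classification, velocity) pairs, one per step."""
    for i, (cls, _) in enumerate(trace):
        if cls == 'red':
            window = trace[i:i + 4]               # steps i .. i+3
            if not any(v == 0 for _, v in window):
                return False
    return True

def step(cls, v):
    """Hypothetical BrakingSystem || Vehicle dynamics: brake hard on red."""
    return max(0, v - 10) if cls == 'red' else v

def run(classes, v0):
    """Drive the closed loop over a sequence of classifications."""
    trace, v = [], v0
    for cls in classes:
        v = step(cls, v)
        trace.append((cls, v))
    return trace
```

For an initial speed of 25 and a sustained red signal the property holds; for an initial speed of 100 the vehicle cannot stop within 3 steps and the check returns a violation, i.e. a counterexample to the contract.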
4.1 Runtime Monitoring and Control
We note that the evidence we obtain from the analysis is conditional; we can only prove that the property holds for the region contracts that we found to be safe. The information encoded in the contract assumptions will need to be used to synthesize runtime guards that monitor for inputs that fall outside the conditions and instruct the system to take appropriate fail-safe actions. Note also that this compositional approach enables separate verification of individual components: we can thus replace some of the verification tasks for individual components with testing or simulation, which will increase scalability but will give only empirical guarantees.
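One possible shape for such a runtime guard, sketched under the assumption that regions use the L1 metric and that the fail-safe action is system-specific; all names are illustrative:

```python
def make_runtime_guard(safe_regions, network, failsafe):
    """Wrap the network with a monitor: an input inside some verified
    (centroid, radius, label) region gets the network's output; an input
    outside every contract assumption triggers the fail-safe action."""
    def guarded(x):
        for centroid, radius, _label in safe_regions:
            if sum(abs(a - b) for a, b in zip(x, centroid)) <= radius:
                return network(x)
        return failsafe
    return guarded

# Illustrative use: a single verified region around the origin.
regions = [((0.0, 0.0), 1.0, 'green')]
guard = make_runtime_guard(regions, lambda x: 'green', 'BRAKE')
```

Inputs covered by a verified region are handled normally; anything outside falls back to the conservative action, keeping the system within the conditions under which the design-time proofs hold.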
Furthermore, if the system contains adaptive components, the verification of those components can be done at runtime, whereas the static components only need to be checked once, at design time. Adaptive learning-enabled components pose additional challenges over time. We can again use model uncertainty to identify situations in which the adaptive learning-enabled system is not confident about its decisions, and take appropriate actions in such cases.
5 Conclusion
We presented a compositional approach for the verification of autonomous systems. The approach uses assume-guarantee reasoning for scalable verification and can naturally integrate reasoning about the learning-enabled components in the system. We are working on evaluating the proposed approach on various simulated and real autonomous platforms, including self-driving cars (discussed briefly in Section 4), autonomous quadcopters and airplanes. These case studies cover perception, decision making, control and actuation of autonomous systems, and they include safety-critical cyber-physical components as well as DNN components.
References
 [1] S. Bak and S. Chaki. Verifying cyber-physical systems by combining software model checking with hybrid systems reachability. In 2016 International Conference on Embedded Software, EMSOFT 2016, Pittsburgh, Pennsylvania, USA, October 1-7, 2016, pages 10:1–10:10, 2016.
 [2] S. Bogomolov, G. Frehse, M. Greitschus, R. Grosu, C. S. Pasareanu, A. Podelski, and T. Strump. Assume-guarantee abstraction refinement meets hybrid systems. In Hardware and Software: Verification and Testing - 10th International Haifa Verification Conference, HVC 2014, Haifa, Israel, November 18-20, 2014. Proceedings, pages 116–131, 2014.
 [3] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In Proc. 38th IEEE Symposium on Security and Privacy, 2017.
 [4] C. Chilton, B. Jonsson, and M. Z. Kwiatkowska. An algebraic theory of interface automata. Theor. Comput. Sci., 549:146–174, 2014.
 [5] C. Chilton, B. Jonsson, and M. Z. Kwiatkowska. Compositional assume-guarantee reasoning for input/output component theories. Sci. Comput. Program., 91:115–137, 2014.
 [6] T. Dreossi, A. Donzé, and S. A. Seshia. Compositional falsification of cyber-physical systems with machine learning components. In NASA Formal Methods - 9th International Symposium, NFM 2017, Moffett Field, CA, USA, May 16-18, 2017, Proceedings, pages 357–372, 2017.
 [7] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner. Detecting adversarial samples from artifacts, 2017. Technical Report. http://arxiv.org/abs/1703.00410.
 [8] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples, 2014. Technical Report. http://arxiv.org/abs/1412.6572.
 [9] D. Gopinath, G. Katz, C. S. Pasareanu, and C. Barrett. DeepSafe: A data-driven approach for checking adversarial robustness in neural networks. 2017.
 [10] X. Huang, M. Kwiatkowska, S. Wang, and M. Wu. Safety verification of deep neural networks. In Proc. 29th Int. Conf. on Computer Aided Verification (CAV), pages 3–29, 2017.
 [11] K. Julian, J. Lopez, J. Brush, M. Owen, and M. Kochenderfer. Policy compression for aircraft collision avoidance systems. In Digital Avionics Systems Conf. (DASC), pages 1–10, 2016.
 [12] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):881–892, 2002.
 [13] G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In Proc. 29th Int. Conf. on Computer Aided Verification (CAV), pages 97–117, 2017.
 [14] A. Komuravelli, C. S. Pasareanu, and E. M. Clarke. Assume-guarantee abstraction refinement for probabilistic systems. In Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012, Proceedings, pages 310–326, 2012.
 [15] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial Examples in the Physical World, 2016. Technical Report. http://arxiv.org/abs/1607.02533.
 [16] Y. LeCun, C. Cortes, and C. J. C. Burges. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.
 [17] J. Li, P. Nuzzo, A. L. Sangiovanni-Vincentelli, Y. Xi, and D. Li. Stochastic assume-guarantee contracts for cyber-physical system design under probabilistic requirements. CoRR, abs/1705.09316, 2017.
 [18] Y. Li and Y. Gal. Dropout inference in bayesian neural networks with alphadivergences. In ICML, pages 2052–2061, 2017.
 [19] C. S. Pasareanu, D. Giannakopoulou, M. G. Bobaru, J. M. Cobleigh, and H. Barringer. Learning to divide and conquer: applying the L* algorithm to automate assume-guarantee reasoning. Formal Methods in System Design, 32(3):175–205, 2008.
 [20] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks, 2013. Technical Report. http://arxiv.org/abs/1312.6199.