1 Introduction
Reachability analysis has been a central topic and the key for the verification of safetycritical autonomous systems. The majority of existing reachability approaches either compute worstcase reachable sets [chen2018hamilton, devonport2020data, chen2013flow, tran2020nnv, bansal2020deepreach, liu2019algorithms], or use Monte Carlo simulation to estimate the reachable states with probabilistic guarantees [devonport2020estimating, liebenwein2018sampling, lew2020sampling, devonport2020data]. There are several obvious disadvantages of such methods. 1). Worstcase methods, especially those for nonlinear systems, often overapproximate the reachable sets and produce very conservative results (due to convex set representations [kurzhanski2000ellipsoidal, girard2005reachability, duggirala2016parsimonious, meyer2019tira], wrap effects [bak2014reducing, althoff2008reachability], loworder approximation [chen2013flow, cyranka2017lagrangian]
, etc.), not to say they are usually very computationally expensive (also known as the curse of dimensionality). 2). Existing methods do not care about reachable state concentration and produce the same reachable state estimations for different distributions of initial states if those distributions share the same support
^{1}^{1}1The probabilistic guarantees of the samplingbased methods do rely on the form of the initial states distribution. However, the final reachable sets estimate is the same for different distributions with the same support.. That is, current approaches do not compute which states are more likely to be reached. For example, in Fig. 1, the state density distribution of a Van der Pol oscillator evolves over time and concentrates on certain states (the highlighted part in the figure), even if the initial states are uniformly distributed.
If one uses an existing worstcase reachability algorithm, most likely the results in Fig. 1 (b)(c) will show almost the entire black region inside the highlighted part is reachable, as those methods often use convex sets to represent the reachable sets.In this paper, we aim to tackle the above problems and propose a learningbased method that can compute the density distribution of the reachable states from given initial distributions. The state density is much more powerful than worstcase reachability and can better quantify risks. Our proposed method is based on the Liouville theorem [ehrendorfer1994liouville, nakamura2019density, chen2020optimal], which is from classical Hamiltonian mechanics and asserts that the state density distribution function (which is the measurement of state concentration) is constant along the trajectories of the system. Given an autonomous system that is locally Lipschitz continuous, the evolution of the state density (i.e. the density of state at time ) is characterized by a Liouville partial differential equation (PDE). We learn the state density as a neural network (NN) while respecting the laws of physics described by the nonlinear Liouville PDE, in a way that is similar to the physicsinformed NN [raissi2019physics].
Furthermore, we make two major improvements to make the learned density NN suitable for verification. 1). Instead of learning for a fixed initial distribution , we learn the density concentration function which specifies the multiplicative change of density from any . 2). We use a single NN to jointly learn both the reachable state and its corresponding density. Moreover, we use Reachable Polyhedral Marching (RPM) [vincent2020reachable]
—an exact ReLU NN reachability tool—to parse our learned NN as linear mappings from input polyhedrons to output polyhedrons. Using such parsed polyhedrons, we can perform online forward and backward reachability analysis and get the range of density bounds
for each output polyhedron. Together, our method can perform online safety verification by computing the probability of safety (instead of a single yes or no answer) under various initial conditions, even with unknown system dynamics (i.e. a blackbox system where we only have access to a simulator of ). In this way, it also has the potential to collect environmental data in runtime and update its distribution for online safety verification.We conduct experiments on 10 different benchmarks covering systems from low dimension academic examples to high dimension blackbox simulators equipped with either handcrafted or NN controllers. Surprisingly, without using any groundtruth density data in the learning process, our approach can achieve up to 99.89% of error reduction in KL divergence with respect to the groundtruth value,
when compared to samplingbased methods like kernel density estimation and Gaussian Processes
. Moreover, our learned density concentration function can also be used for reachability distribution analysis. We show that in several systems, more than 90% of the states can actually just reside in a small region (less than 10% of the volume of the convex hull for those states) in the state space, which also points out the conservativeness of the worstcase reachability analysis methods in terms of quantifying risks. We also show that different initial distributions can lead to a drastic change in the safety probability, which can help in cases when unsafe is inevitable but can be designed to happen with a very low probability.Our major contributions are: (1)
we are the first to provide an explicit probability density function for reachable states of dynamical systems characterized as ordinary differential equations; this density function can be used for online safety analysis
, (2) we propose the first datadriven method to learn the state density evolution and give accurate state density estimation for arbitrary (bounded support) initial density conditions without retraining or finetuning, and (3) we use a variety of examples to show the necessity to perform reachability distribution analysis instead of pure worstcase reachability, to flexibly and less conservatively quantify safety risks.Related work
Reachability analysis, especially worstcase reachability using samplingbased methods or set propagation, has been studied for decades. The literature on reachability has been extensively studied in many surveys [chen2018hamilton, liu2019algorithms, arch, agha2018survey]. Here we only discuss a few closely related works.
Hamilton Jacobian PDE has been used to derive the exact reachable sets in [mitchell2005time, chen2018hamilton, bansal2020deepreach]. However, the HJPDE does not provide the density information. Many datadriven approaches can compute probabilistic reachable sets using scenario optimization [devonport2020estimating, xue2020pac], convex shapes [liebenwein2018sampling, lew2020sampling, berndt2021data]
[allen2014machine, rasmussen2017approximation], kernel embedding [thorpe2020data][chakrabarty2020active], and Gaussian process [devonport2020data]. However, the probabilistic guarantees they provide are usually in the form that with enough samples, instead of the state density distribution. [fridovich2020confidence] estimates human state distribution using a probabilistic model but requires state discretization. [majumdar2014convex] uses the Liouville equation to maximize the backward reachable set for only polynomial system dynamics. In [abate2007probabilistic]the authors compute the stochastic reachability by discretizing stochastic hybrid systems to Markov Chains (MC), then perform probabilistic analysis on the discretized MC. The closedform expression of the probability requires integrals over the whole state space hence is computationheavy and cannot be used for online safety check.
The closest to ours is [matavalam2020data]where PerronFrobenius and Koopman operators are learned from samples of trajectories. Then, the learned operators can be used to transform the moments of the distribution over time. The distribution at time
is then recovered from the transformed moments. However, as they use moments up to a specific order to represent a distribution, even if the learned operators are perfect, the estimation error of the distribution might not be zero. Also, such a momentbased method is hard to scale to largedimensional systems. Recently, there is also a growing interest in studying the (worstcase) reachability of NN [katz2017reluplex, katz2019marabou, xiang2018output, yang2020reachability, vincent2020reachable] or systems with NN controllers [ivanov2019verisig, tran2020nnv, fan2020reachnn, hu2020reach, everett2021efficient]. In this paper, we use [vincent2020reachable] to parse our learned NN as a set of linear mappings between polyhedrons for online reachability computation, but this step can be replaced with other NN reachability tools.To measure the probability distribution in the reachable sets, the most naïve approach is to use histograms or kernel density estimation
[chen2020optimal], similar to the MonteCarlo method used in dispersion analysis [spencer1996mars]. But this could lead to poor accuracy and computational scalability [niederreiter1992random]. Another approach propagates the uncertainty by approximating the solution of the probability density function (PDF) transport equation [pantano2007least], which could still be timeconsuming due to the optimization process performed at each time step. Our approach finds the PDF transport equation by solving the Liouville PDE ^{2}^{2}2In the absence of process noise, this PDF transport equation reduces to stochastic Liouville equation [ehrendorfer1994liouville] using NN. Similar ideas have been explored in [nakamura2019density]. However, the density NN in [nakamura2019density] was learned solely for a fixed initial states distribution and therefore, cannot be used for online prediction. Instead, we jointly learn the reachable states and density changes, and can perform online reachability density computation for any initial states distribution.The idea of using deep learning to solve PDEs can be traced back to the 1990s
[lee1990neural, lagaris1998artificial, uchiyama1993solving], where the solutions of the PDE on a priori fixed mesh are approximated by NN. Recently, there is a growing interest in the related subfields including: meshfree methods in strong form [raissi2019physics, sirignano2018dgm, berg2018unified] and weak form [weinan2018deep, he2018relu], solving highdimension PDEs via BSDE method [han2020algorithms], solving parametric PDE [khoo2021solving, kutyniok2021theoretical] and stochastic differential equations [zhang2019quantifying, yang2018physics], learning differential operators[li2020fourier, long2018pde, lu2019deeponet], and developing more advanced toolboxes [koryagin2019pydens, lu2021deepxde, chen2020neurodiffeq]. Our idea for solving Liouville PDE along trajectories is similar to [han2020algorithms] but without the stochastic term.2 Preliminaries
We denote by , , , the sets of real numbers, nonnegative real numbers,
dimensional real vectors, and
real matrices. We consider autonomous dynamical systems of the form , where for all , is the state and is a compact set that represents the state space. We assume that is locally Lipschitz continuous. The solution of the above differential equation exists and is unique for a given initial condition . We define the flow map such that (also written as for brevity) is the state at time starting from at time . Note that system parameters can be easily incorporated as additional state variables with time derivative to be .We analyze the evolution of the dynamical system by equipping it with a density function , which measures how states distribute in the state space at a specific time instant. A larger density means the state is more likely to reside around at time , and vice versa. The density function is completely determined by the underlying dynamics (i.e., function ) and the initial density map . Specifically, given a , the density function solves the following boundary value problem of the Liouville PDE [ehrendorfer1994liouville]:
(1) 
where is the divergence for the vector field , and takes the th coordinate of a vector.^{3}^{3}3A general form of Liouville PDE can have a nonzero term on the righthand side indicating how many (new) states appear or exit from the system during the run time [chen2020optimal]. In all the systems we discuss here, there is no state entering (other than the initial states) or leaving the system, so the righthand side of Eq. (1) is zero. The total density of the systems we consider is invariant over time. Intuitively, as shown in [ehrendorfer1994liouville], Liouville PDE is analogous to the mass conservation in fluid mechanics, where the change of density at one point is balanced by the total flux traversing the surface of a small volume surrounding that point. It is hard to solve the Liouville PDE for a closed from of the density function . However, it is relatively easy to evaluate the density along a trajectory of the system. To do that, we first convert the PDE into an ODE as follows. Considering a trajectory , the density along this trajectory is an univariate function of , i.e., . If we consider the augmented system with states , from Eq. (1) we can easily get the dynamics of the augmented system [chen2020optimal]:
(2) 
Therefore, to compute the density at an arbitrary point at time , one can simply proceed as follows: First, find the initial state using the inverse dynamics . Then, solve the Eq. (2) with initial condition . The solution at time just gives the desired density value. However, such a procedure only gives the density at a single point, therefore, cannot be used in reachability analysis. Instead, we need to compute the solution of Eq. (2
) for a set of initial conditions and use that to compute the reachable sets. To achieve this, we use an NN with ReLU (Rectified Linear Unit) activation functions
[nair2010rectified] to jointly approximate the flow map and the density concentration function, as will be shown in the next section.3 Density learning and online reachability density computation
Let us take a closer look at Eq. (2). For an initial condition , the closed form of the solution is Interestingly, the solution of is linear in the initial condition . The gained part denoted by is a function of the initial state and time , but is independent of . This mapping is completely determined by the underlying dynamics and we call it the density concentration function. With in hand, the density at an arbitrary time and state can be quickly computed from any . However, is obviously hard to compute. Therefore, we use NN to approximate the density concentration function. In addition to , we also use NN to learn the flow map , which will be a necessity for computing the reachable set distribution shown in Sec. 3.2.
3.1 System dynamics and density learning framework
Let the parameterized versions of the flow map and the density concentration function be and respectively, where and are parameters. To train the neural network, we construct a dataset by randomly sampling trajectories of the system: in time steps (with time interval ): where . Then, the goal of the learning is to find parameters and satisfying
(3) 
where the first constraint is for the flow map estimation, and the second constraint enforces the Liouville equation for all the data points.
As for the implementation, we model and jointly as a fullyconnected neural network with ReLU activations. To ensure numerical stability as is an exponential function, we add a nonlinear transform from the NN output to the density concentration function:
(4) 
where is to choose the th dimensions from the output of the NN, and is the intermediate density estimation from the NN. In this way, we guarantee that the density concentration function at is always 1. We optimize our NN via back propagation
with the loss function:
(5) 
where the first term denotes the state estimator square error [chai2014root], the second term indicates how far (in the sense of L2norm) the solution deviates from the Liouville Equation, and balances these two loss terms. We approximate the time derivative of the density concentration function by . Our method can also work for blackbox systems if we approximate
numerically. With some tools from statistical learning theory, we can show that with a large enough number
of samples, the learned flow map and the density concentration function can be arbitrarily accurate. A formal proof is shown in Appendix A.3.2 System reachable set distribution computation via NN Reachability Analaysis
In the last section, we have learned an NN that can estimate the density at a single point given the initial density. In this section, we will boost this singlepoint estimation to setbased estimation by analyzing the reachability of the learned NN. To compute the set of all reachable states and the corresponding density from a set of initial conditions, we use the Reachable Polyhedron Marching (RPM) method [vincent2020reachable] to further process the learned NN for and . RPM is a polyhedronbased approach for exact reachability analysis for ReLU NNs. It partitions the input space into polyhedral cells so that in each cell the ReLU activation map does not change and the NN becomes a fixed affine mapping. With the cells and the corresponding affine mapping on each cell, the exact reachable set for a set of input can be quickly evaluated. Backward reachability analysis can also be performed by computing the intersection of the preimage of a query output set with the input polyhedron cells.
System forward reachable set with density. Recall that the input of the NN in Sec. 3.1 consists of the initial state and time . For the simplicity of comparison with other methods, we only estimate the reachable set and the density at given fixed time instances and fix the last element of the input of the NN. Thus for each time step t, the input polyhedral cells generated from the RPM will be a set of linear inequality constraints on , and those input cells together with the set of the affine mappings and output polyhedral cells can be represented as: , where in each input cell , the NN becomes an affine mapping , and thus the image of the input cell is also a polyhedron .
Recall that the first dimension of our NN output estimates the density concentration function , and the rest dimensions estimate the state , thus the output cell can be written as . By projecting it to the state space, we get the reachable set of the system, i.e., . Then, in each cell , we evaluate the lower and upper bounds of , and denote them by and . The density bound for cell is then computed as
(6) 
where is the center of . Finally the system forward reachable set is a union of projected output polyhedral cells: where each cell is associated with a density bound .
System reachable set probability computation. Given an initial state distribution, we want to figure out the probability distribution of the system forward reachable sets, as well as the probability for the states land into a query set (e.g., the query set could be the unsafe region).
For an arbitrary initial probability density function (whose support is bounded), we can apply RPM to partition its support into cells^{4}^{4}4We can always further divide those input cells to make the bound tighter/ more precise, while still guaranteeing the Neural Network on each cell can be seen as an affine transformation. as in the last section. Finally, we obtain the reachable sets with bounded state densities and the probability bound in each cell is:
(7) 
where computes the volume for a polyhedron. By computing for all input cells , we can derive the system forward reachable set and the corresponding probability bound as . The backward reachable set probability can be computed in a similar fashion, by checking the intersection between the query output region and the output cells derived by RPM, computing each intersection’s probability range by its volume and density bound and finally aggregating the probability of all intersections. Detailed computation is shown in Appendix B. ^{5}^{5}5The RPM method cannot handle systems with higher than 4 dimensional state space in our experiments. But the technique discussed in Sec. 3.2 can work with any setbased NN reachability tools with slight modification based on the set presentation of the tool. Developing a better NN reachability tool that can scale to higher dimensional systems is another topic and is out of the scope of our paper.
4 Experimental evaluation in simulation
Here we show the benefits of learning density concentration function from Liouville PDE in state density and reachable set distribution estimation. More experiments are provided in Appendix CH.
4.1 Implementation details
Datasets: We investigate 10 benchmark dynamical systems as reported in Table 1. These benchmark systems range from low dimension academic models (23 dimensions) to complex and even blackbox systems (1316 dimensions) controlled by handcrafted or NN controllers. All controllers (except for the ground robot navigation example) are from the original references. More details (the model description and initial distributions) will be provided in Appendix C.
Training: For each system, we generate 10k trajectories through simulation with varied trajectory lengths from 10 to 100 timesteps, depending on different configurations of the simulation environment. We use 80% of the samples to train the NN as described in Sec. 3.1
and use the rest for validation. For the trajectory data, we collect the system states and compute for the system divergence term. For blackbox systems, we use the gradient perturbation method to approximate the derivatives. We use feedforward NN which has 3 hidden layers with 64 hidden units in each layer. We use PyTorch
[paszke2019pytorch] to train the NN and the training takes 12 hours on an RTX2080 Ti GPU.4.2 Density estimation verification
We first test the density estimation accuracy of our learned NN. We compare our approach with other baselines including kernel density estimation (KDE), Sigmoidal Gaussian Process Density (SGPD) [donner2018efficient] and the histogram approach. For each simulation scenario, we first solve Eq. (2) to generate 20k 100k trajectories of (state, density) pairs, and treat this density value as the ground truth. For the KDE method, we choose an Epanechnikov kernel. We then measure the KL divergence between the density estimate of each method and the ground truth. As shown in Fig. 2, our approach has consistently outperformed KDE, SGPD and histogram approaches, with the largest reduction of 99.69% in KL divergence when compared with the histogram approach for the KraichnanOrszag system, while our method doesn’t use any ODE generated density values during training. Also in highdimension systems (dimension ), the histogram approach fails to predict the density due to the curse of the dimensionality, whereas our approach can always predict the density, with a 30.13% to 99.87% decrease in KL divergence comparing to KDE. More plots will be given in Appendix D.
4.3 Reachable set distribution analysis


Forward reachable set distribution analysis.
Being confident that our approach is able to provide an accurate state density estimation, we extend our NN to do distribution analysis, which is a valuable technique in safetyrelated applications like autonomous driving. Here we use an existing reachability tool RPM [vincent2020reachable] to compute the forward reachable sets with probability bounds. Details about how to derive the density and probability bound for the reachable sets are presented in Sec. 3.2 and in Appendix B. Also, RPM was only able to parse the NN for Van der Pol, Double integrator, ground robot navigation, and the FACTEST car model. It fails in handling other highdimension complex systems due to numerical issues when partitioning for the input set. Thus, we only report the results on those 4 models in this section. The main purpose is to show that for some systems the density tends to concentrate on certain states, where a small portion of the reachable sets contains the majority of states that are more likely to be reached. Therefore, our method can better quantify risks than worstcase reachability, by providing a flexible threshold for the probability of reachability.
We start with the Van der Pol oscillator. The initial states are uniformly sampled from a square region: . As illustrated in Fig. 1, the system states will gradually converge to a limit cycle. Using worstcase reachability analysis for this system will result in a very conservative overapproximation, and this overapproximation will propagate over time and lead to increasing conservativeness of the reachable set estimation. As shown in Fig. 3(a), for the worstcase methods like DryVR(red) [fan2020fast, du2020online] and GSG(Green) [everett2020robustness], the volume of their estimated reachable set relative to the volume of the convex hull of the system states keeps increasing over time, from 1.9670X to 5.3027X for the DryVR [fan2020fast] approach, and from 1.7636X to 2.8086X for GSG [everett2020robustness]. However, our method can give the probability bound for every reachable set in the state space as shown in the heatmap in Fig. 3(b), clearly identifying the region around the limit cycle in high density (bright color), and the rest space in low density (dark color).
We can also compute the relative volume of the reachable sets (comparing to the convex hull) preserving different levels of reachable probability, whose evolution over time reflects the system’s tendency for concentration. As shown in Fig. 4, we use the above 4 systems and study the volume of the reachable set with probability threshold 0.50, 0.70, 0.80, 0.90, and 0.99. As expected, the relative volume will increase as the probability threshold increases. In all cases, there exist some time instances where a small volume of the reachable set actually preserves high probability, which shows the state concentration exists in many existing systems. While as shown in Fig. 4, the worstcase reachability tools can only generate one curve which presents the (relative) volume of the reachable set that covers all possible states. Not surprisingly, the worstcase reachability tools give very conservative results. We believe using our proposed method to do reachable set distribution analysis will benefit future study for systems with uncertainty, and for systems where the failure case is inevitable but happens with a low probability. More comparisons with stateoftheart reachability methods (Verisig [ivanov2019verisig], Sherlock [dutta2017output] and ReachNN [fan2020reachnn]) are shown in Appendix H.
Online safety verification under different initial state distributions.
Since our approach learns the density concentration function instead of absolute density value, it has the flexibility to estimate online reach sets distribution with any possible (bounded support) initial distributions. Consider the safety verification for the ground robot navigation problem (shown in Fig. 5(a)): The probability of colliding with a obstacle in the center of the map is determined by the initial state distribution ^{6}^{6}6How to compute the probability of a reachable set is discussed in Sec.3.2
, which is parametrized as a truncated Gaussian distribution
where andmeasure the expectation and uncertainty of the initial robot state. Our method can estimate the upper and lower bounds for the probability of colliding with the obstacle, and this safety evaluation process can run in faster than 50Hz with parallel computation and heuristic searching used (see details in
Appendix E). As shown in Fig. 5(b), when the initial state uncertainty decreases from to , the upper and lower bounds for the probability of collision decrease to close to zero (from to , where other worstcase reachability analysis methods can only report a collision is inevitable, without quantifying the corresponding risks. This also shows the advantage of our approach in adapting to different density conditions in computation without retraining or finetuning.Limitations and tradeoffs of performing reachable set distribution analysis.
Experimental results show that our method can compute much less conservative probabilistic reachable sets than most worstcase reachability methods. This less conservative result benefits from RPM which can provide exact NN reachability analysis by sacrificing scalability. Therefore RPM also constrains us from performing online reachability analysis for highdimensional systems or systems with large initial sets. Technically, we are solving a harder problem than worstcase reachability approximation, as we need not only the reachable set, but also the density over those reachable states. Like worstcase analysis, this is the reason why it can only scale to lowerdimensional systems and smaller initial sets when we want to perform accurate reachable computation. We believe that our approach can better quantify risks under different conditions, especially when unsafe is inevitable (similar to Fig. 5(a)). Our method can also give worstcase reachability by taking all output reachable cells produced by RPM regardless of their density. This worstcase reachability using RPM is less conservative than other NN reachability tools, at the cost of not scaling to highdimensional systems.
5 Conclusion and discussion
In this paper, we propose a Neural Network (NN)based probabilistic safety verification framework that can estimate state density, compute reachable sets and corresponding probability. Our Liouvillebased NN can accurately estimate the state density even for highdimension systems. Our probabilistic reachable set framework can handle nonlinear (and potentially blackbox) systems with varying initial state distributions and can be used for fast online safety verification. We recognize that the task of computing probabilistic reachable sets is very useful, and our method is more helpful than worstcase reachability particularly when the system states are more likely to concentrate. One limitation of our approach is that the NN reachability tool we used (RPM) cannot handle highdimension systems or cases where the initial set is very large, due to the numerical issues when partitioning for polyhedral cells. This limitation is due to the scalability and accuracy tradeoff of NN reachability, which is an independent problem from our paper. We plan to explore other NN reachability methods and more complicated hybrid systems in realworld applications. The NASA University Leadership initiative (grant #80NSSC20M0163) and Ford Motor Company provided funds to assist the authors with their research, but this article solely reflects the opinions and conclusions of its authors and not any NASA or Ford entity.
References
A Generalization error bound for the learning framework
With sufficient amount of data and a large enough neural network, we can approximate the state and density estimation at arbitrary small errors [leshno1993multilayer]. In the language of statistical learning theory, the neural network generating functions is called a hypothesis and denoted by . The set containing all the possible hypotheses is called the hypothesis class . For a hypothesis generating , we denote where is an error tolerance term which is further used to derive the probabilistic guarantee. Assume that the optimization problem in Eq.(3) is feasible, and and solve Eq. (3). Let be the hypothesis that generates . Furthermore, assume that , and denote the sample distribution (where the training sample trajectories are sampled from). Then according to Theorem 5 in [srebro2010smoothness], the following statement holds with probability at least over a training data set consisting of i.i.d. random trajectories:
(8) 
where is a universal constant, and is the Rademacher complexity for defined as:
(9) 
where
are i.i.d. random variables with
.Remarks: Here we reduce bounding the generalization error to bounding the Rademacher complexity , where can be further bounded as for Lipschitz parametric function classes (including neural networks) where denotes the number of learnable parameters [boffi2020learning][Theorem 4.2.]. In this way, we show that for a fixed error threshold , as the number of training samples increases, the probability that our learning framework fails to satisfy the Liouville equation or fails to estimate the system dynamics will gradually decrease to zero. We show an empirical result to support this in Figure 1. For the Van der Pol Oscillator benchmark example, we train the neural network with different numbers of training samples (from ) and report the testing error (mean square error for the state estimation and density concentration function comparing to the groundtruth) for a fixed testing set. As the number of training samples increases, the testing error gradually converges to zero.
Assume the functions on the right hand side of Eq. (2) are uniformly Lipschitz continuous in , then the function will have a unique solution according to PicardLindelöf theorem[coddington1955theory][Theorem I.3.1]. Then if our estimator satisfies the Liouville equation everywhere, we can recover the groundtruth density concentration function as well as the system dynamics.
B Implementation details for system reachable set probability computation using RPM
b.1 Online query set probability bound computation under different initial state distributions
The problem formulation is: given a query set with density concentration function constraints (the range that the density concentration function can change from the initial condition to the terminal condition; if this constraint is not specified, the default value is ), compute the probability that the system will reach this query set (with optional density constraints).
In our case, when using RPM to compute the reachable sets, we represent as a polyhedron, and since is a set of linear inequality constraints, the set is also a polyhedron. At each time step , from Sec. 3.2 we can represent the NN input cells, affine mapping and output cells at this time step as (here we omit the subscript for for the brevity in the notation) where each input cell is a polyhedron , with an affine mapping and the resulting output cell is also polyhedron . Then for each output cell , we check for the intersection between the query cell and the output cell . Next we can derive the intermediate density concentration function bound on
by solving the following Linear Programming problem (here taking
as an example; to solve we just need to change the “min” to “max” in the objective function in Eq. 10; and here denotes the first coordinate of , thus as we denote ):(10) 
After we derive the bound for on , the density bound for is computed as (similar to Eq.(6)):
(11) 
where is the center of . And the probability bound can be computed by and where the is the volume for the intersection. Finally the probability of the system reach this query set at time is bounded by . An illustrative figure is shown in Fig. 2
Remarks: This algorithm can be used for online safety verification under different initial state distributions by just representing the dangerous set in , and changing the function in (11) on the fly. Here we approximate the density distribution in using the density evaluated at which is the center of  the accuracy of this approximation will converge to 1 as the partition on gets finer.
b.2 Backward reachable set probability computation
The problem formulation is: given a query set with density concentration function constraints (the range that the density concentration function can change from the initial condition to the terminal condition; if this constraint is not specified, the default value is ), compute for all possible initial conditions as well as probabilities that lead the system to reach the query set (with optional density constraints).
Similar to Sec. B.1, we can denote this query set as . At each time step , the NN input cells, affine mapping and output cells are where each input cell is a polyhedron , with an affine mapping and the resulting output cell is also polyhedron . Then for each output cell , we check for the intersection between the query cell and the output cell . Using the affine mapping with invertible ^{7}^{7}7In practice, is in high probability to be invertible. This is because the set of all noninvertible random matrices forms a hypersurface with Lebesque measure zero. When is singular, we can use elimination method like FourierMotzkin elimination as in [vincent2020reachable] to derive the set representation in the input side., we can derive the preimage of this intersection to be . Thus the reachable set can be computed using projection: and the corresponding probability is where is the initial state distribution function and is the center of . By performing this for all output cells and for all time steps , we derive the backward reachable set .
b.3 Speed up the probability computation by using hyperrectangle heuristic
The computation in both Sec. B.1 and Sec. B.2 requires checking the intersection between polyhedral and , where one approach is to check whether a feasible solution exists for the linear programming problem : . Solving this for requires time when the interior method is used. To speed up the intersection checking process, we introduce a hyperrectangle heuristic: at the preprocessing stage, we overapproximate each polyhedron by its outer hyperrectangle (derived by computing the range for the vertices of in each dimension). When checking for the polyhedron intersection between and , we first check whether their corresponding hyperrectangles and will intersect. If and do not intersect, then it is guaranteed that the polyhedra and won’t intersect. Otherwise, we further check the intersection of and by using the interior method. Checking hyperrectangles’ intersection can be implemented in , hence greatly accelerates the computation process. A detailed computation time comparison will be presented in Sec. E.
C Simulation environments
In this section, we present the implementation details for all 10 simulation environments used in our main paper, sorted in the same order as shown in Table. 2.
c.1 Van der Pol Oscillator
Consider the Van der Pol Oscillator problem: where the position variable is a function of and the scalar parameter indicates the strength of the system damping effect. By doing a transformation: , the original problem can be shaped to the following 2d system dynamics:
(12) 
where the divergence term used in (2) can be computed as: . In the simulation, we set , the initial state distribution as an uniform distribution and the time step duration . We run each simulation for 50 time steps to collect the trajectories.
c.2 Double Integrator with an NN controller
We consider a discrete double integrator system introduced in [hu2020reach]:
(13) 
where denotes the 2d state variable, and is the output of a neural network controller which is trained to mimic the behavior of an MPC controller [hu2020reach, everett2021efficient]. We convert the system to the continuous system with state and time step duration as :
(14) 
and here the divergence term used in (2) can be computed as: , where the is the gradient of the neural network controller output with respect to the input (and similar for and ) and can be calculated using automatic differentiation engine in PyTorch [paszke2019pytorch]. We set the initial state distribution as an uniform distribution . Similar to [hu2020reach, everett2021efficient], we run each simulation for 10 time steps to collect the trajectories.
c.3 KraichnanOrszag system
The system dynamics of the KraichnanOrszag problem [orszag1967dynamical, nakamura2019density] is defined as:
(15) 
and here an interesting fact is that the divergence term used in (2) is just: , which means the density along each trajectory won’t change over time, and only depends on the initial state distribution. Similar to [nakamura2019density], we set the initial state distribution as an Gaussian distribution with:
(16) 
where we further truncate the initial state within the range . We set the time step duration and run each simulation for 80 time steps to collect the trajectories.
c.4 Inverted pendulum
The inverted pendulum problem [chang2020neural] is defined as , where denotes the pendulum’s relative angle to the the upright position, are predefined parameters and denotes the output of an LQR controller [chang2020neural] where and are scalarvalued coefficients. To test for the system performance under different coefficient settings for the LQR controller, we include , into the system state variable and study the following system dynamics:
(17) 
where . Now the divergence term used in (2) can be computed as: . Based on [chang2020neural], we set , the time step duration . We set the initial state distribution as a uniform distribution and run each simulation for 50 time steps to collect the trajectories.
c.5 Ground robot navigation with an NN controller
We design a ground robot navigation experiment (as shown in Fig. 3), where the objective is to reach the green region while avoiding to enter the red region . The robot is following an Dubins car model:
(18) 
where represent robot’s x and y position, heading angle and velocity respectively. We use an NN controller to output control signals . The NN controller is a feedforward NN with 2 hidden layers and 32 hidden units in each layer. We use ReLU for the intermediate activation functions and use Tanh as the activation function for the last layer to make sure the control output is always bounded. During training, we use this NN controller to collect trajectory data and do backpropagation with the loss function: where . Here the divergence term used in (2) can be computed as: . We set the initial state distribution as an uniform distribution . We run each simulation for 50 time steps with time duration to collect the trajectories.
c.6 FACTEST car tracking system
Consider a rearwheel kinematic car in 2D scenarios where the dynamics is:
(19) 
and the corresponding errors are measured by:
(20) 
with being some predefined tracking points (in this experiment, we assume the tracking points are not changing over time). With the following tracking controller defined in ( and are referenced angular velocity and velocity respectively, are the parameters controlling how fast the system will converge to the reference point) [fan2020fast]:
(21) 
and with an uncertainty error in the dynamics of and (denoted as ), the error dynamics become:
(22) 
The uncertain parameter . We will show that althought now the reachable set will be much larger than the case when , the probability that the system does not converge to the origin (zeroerror) is very low.
Here the divergence term used in (2) can be computed as: . In our experiment, we set . We set the initial state distribution as an uniform distribution . We run each simulation for 50 time steps with time duration to collect the trajectories.
c.7 6D Quadrotor with an NN controller
Consider a 6D quadrotor [everett2021efficient]:
(23) 
where the state vector contains 3D positions and velocities , is the gravity (set to ), and the control is from the output of an NN controller taking the state vector as the input [everett2021efficient]. Here the divergence term used in (2) can be computed as: . Similar to [everett2021efficient], we set the initial state distribution as an uniform distribution . We run each simulation for 12 time steps with time duration to collect the trajectories.
c.8 Adaptive cruise control system
Consider a learningbased adaptive cruise control (ACC) problem with plant dynamics [tran2020nnv]:
(24) 
here denotes the relative distance from the leading vehicle to the ego vehicle, and denote the velocity of leading and ego vehicles and and denote the corresponding acceleration rates of the two vehicles ( models the change in the leading vehicle’s acceleration rate, similar to the MATLAB implementation in [tran2020nnv]). And the controller is taking the relative distance, velocity, and ego vehicle’s velocity as input and outputs the change in the ego vehicle’s acceleration rate. We model the velocity perception uncertainty as and pass it through the neural network. Here the divergence term used in (2) can be computed as: . We set the initial state distribution as an uniform distribution
Comments
There are no comments yet.