Elliptical Slice Sampling for Probabilistic Verification of Stochastic Systems with Signal Temporal Logic Specifications

by Guy Scher, et al.
Cornell University

Autonomous robots typically incorporate complex sensors in their decision-making and control loops. These sensors, such as cameras and Lidars, have imperfections in their sensing and are influenced by environmental conditions. In this paper, we present a method for probabilistic verification of linearizable systems with Gaussian and Gaussian mixture noise models (e.g. from perception modules, machine learning components). We compute the probabilities of task satisfaction under Signal Temporal Logic (STL) specifications, using its robustness semantics, with a Markov Chain Monte-Carlo slice sampler. As opposed to other techniques, our method avoids over-approximations and double-counting of failure events. Central to our approach is a method for efficient and rejection-free sampling of signals from a Gaussian distribution that satisfy or violate a given STL formula. We show illustrative examples from applications in robot motion planning.






1. Introduction

To deploy autonomous robots, such as self-driving cars or assistive robots, we seek formal guarantees that they can operate safely and reliably. Providing such guarantees is challenging due to the sheer amount of non-determinism in the world, including noisy sensors, uncontrolled environments (humans, other robots) and varying environmental conditions (such as lighting, occlusions, etc.). Modern systems also include machine learning components (dean2020robust) that can contribute to uncertainty, since they might be deployed in settings different from the ones they were trained on.

Sensors, from proprioceptive ones that sense the robot’s internal values, such as speed or joint angles, to exteroceptive ones that sense the environment, such as range finders and cameras, are usually modeled with errors coming from a Gaussian distribution or with bounded noise. The system designer needs to reason about the likelihood that the system will successfully perform the task, and re-design it if needed. The general approach is to find all the states (e.g. the robot’s positions) that the robot may reach under all circumstances, i.e. the “reachable set”, and reason about safety and task completion. Testing with hardware alone is limiting, impractical and intractable because of the variability of tests and environmental conditions. Finding rigorous formal mathematical guarantees is usually infeasible for complicated systems performing complex tasks. Verifying systems using simulations may be the only way, but simulations also suffer from long computation times, especially when searching for rare and hard-to-find events (chou2018using; o2018scalable; yu2018safe).

Several techniques exist in the literature for verifying systems with uncertainty. Imposing hard constraints on the state will always result in violation when dealing with unbounded non-determinism such as the Gaussian noise model. As such, it makes sense to describe the constraints with the probability of satisfying them - probabilistic state constraints. One common approach to verifying such robotic systems is with chance constraints (du2011probabilistic; blackmore2009convex). In these formulations, it is common to do risk allocation and use Boole’s inequality, which allocates the level of uncertainty for each constraint component (ono2008efficient), or to use ellipsoidal approximations. However, both are considered conservative (blackmore2009convex; kariotoglou2013approximate), as they over-approximate failure probabilities. A known issue with these approaches is “double counting”. A constraint violation of a trajectory at time $t$ might yield a violation at time $t+1$ as well, and they will be counted as two separate violations because of the way the constraints are constructed. In reality, we would like to consider that trajectory as just one failing trajectory.

Another common approach is to use Monte-Carlo methods to verify systems. These methods are very attractive because they can be applied to non-linear systems, intricate noise models and black-box simulators. However, they can be computationally inefficient, especially when trying to detect rare events (driveteslacanada:2021). The verification process needs to iterate through many simulations, or even guided simulations to find rare events (o2018scalable; verifai-cav19; schmerling2016evaluating), in order to produce an accurate estimate of the probability. Other, more generic, techniques also exist to improve the performance of a Monte-Carlo simulation.


An issue with Monte-Carlo simulations arises when there are many non-deterministic parameters (gessner2020integrals); in such cases, Monte-Carlo techniques may require a prohibitively large number of simulations to adequately represent the posterior distribution. In our case, a dynamic system with multiple noise sources and a long time horizon can quickly grow to a large parametric space. We, and the work in (gessner2020integrals) upon which we build, show that our method can yield accurate integration of Gaussians in high dimensions, independent of the probability mass of the posterior.

Optimization techniques have been used extensively in the literature to verify systems with uncertainty. There exist numerous verification algorithms that deal with machine learning components in the loop (e.g. (tran2020verification)). The authors in (xiang2018reachability) consider dynamic systems with a neural network component as the controller. In that setting, the inputs to the neural network are discretized and a linear program over-approximates the output. With that, they compute the over-approximation of the complete system’s reachable set. Optimization can even help detect or attenuate cyber-physical attacks, where an attacker can inject noise into a sensor to affect the system’s outcomes through the controls (murguia2017reachable). This assumes that bounded noises are selected carefully to achieve the attacker’s goal while avoiding detection, and thus would be over-conservative in a non-adversarial setting. In (dean2020robust), the authors use robust control theory to provide guarantees for a system where the perception errors can be bounded using some assumptions on the data used during the training process versus the data that is collected in real-time.

Another line of work can be categorized as geometric algorithms. Set propagation techniques have been applied to reachability analysis. Except for a limited number of systems, these techniques always deal with under- or over-approximations, because finding the exact reachable sets is undecidable (althoff2021set). These techniques provide efficient computation frameworks; however, they work only with unimodal disturbances, such as a single Gaussian or a bounded disturbance. For systems with a large number of states, one must employ other methods, such as decomposition of the system dynamics, for these techniques to remain tractable. The authors in (althoff2016combining) combined zonotopes and support functions to create an efficient framework for calculating the reachable sets of linear and switched dynamics systems; that framework, however, considers only bounded disturbances.

In this work, we focus on the verification of properties that can be expressed using Signal Temporal Logic (STL) (donze2010robust) formulae for linear or linearizable, possibly time-varying, robotic systems with Gaussian error models. We show how our verification technique performs with a Gaussian mixture noise model (Section 4.6), where the weights or probabilities of each Gaussian could be static, come from a choice model like a Markov chain, or come from a black-box choice model. We provide a verification method for generic STL formulae. We also describe a special case of reach-avoid (fan2018controller) type specifications, for which we propose an alternate solution that, in some cases, is more computationally efficient. We leverage and extend the framework in (gessner2020integrals) to compute the probability that the robot satisfies or violates its task specification.

Our main contribution is a computation framework for verifying and computing the probability that a high-dimensional system satisfies (or violates) complex STL specifications within a finite horizon, using the STL quantitative semantics. The technique is especially useful (accurate and tractable) when dealing with low-probability events, and displays the following properties: 1. The computation framework does not suffer from the combinatorial nature of representing the signals that satisfy an STL specification. 2. Failure modes are neither double-counted nor over-approximated; the computational framework is solved efficiently and can be parallelized. 3. Sampling is done from the posterior distribution in a rejection-free manner, i.e., it samples efficiently from the target distribution. 4. We can efficiently sample new trajectories from the failing or succeeding trajectory sets for analysis purposes, control synthesis, etc. 5. The algorithm is parameter-free: no fine-tuning of hyper-parameters is required. 6. It can verify systems with noise models more intricate than Gaussian errors, thus capturing realistic perception models.

2. Preliminaries

In this section, we provide the necessary background on elliptical slice sampling and STL specifications.

2.1. Elliptical Slice Sampling (ESS) and the Holmes-Diaconis-Ross (HDR) algorithm

An adaptive elliptical slicing method is used in (gessner2020integrals) to sample from a linearly constrained domain under a Gaussian distribution. We summarize the main idea here for clarity and completeness. In this paper we extend (gessner2020integrals) to compute the probability that the robot trajectories, represented as a multivariate Gaussian, satisfy or violate a specification.

Elliptical slice sampling (ESS) (murray2010elliptical) is a Markov Chain Monte Carlo (MCMC) technique for sampling from a posterior when the prior is a multivariate Gaussian $\mathcal{N}(0, \Sigma)$. In our case, the posterior is a Gaussian restricted to a linearly constrained domain (a truncated Gaussian). Given a single sample $x$ inside the constrained domain, and a new auxiliary point $\nu$ sampled from the same Gaussian, the approach constructs the ellipse $x(\theta) = x \cos\theta + \nu \sin\theta$, parameterized by the scalar $\theta \in [0, 2\pi)$. Using a closed-form solution for the intersections between this auxiliary ellipse and the hyperplanes that confine the linear domain, we can sample $\theta$ from a uniform distribution over the ellipse arc lengths that lie within the domain, and thus obtain a new sample $x(\theta)$. A point on the ellipse is in the domain when the affine functions of all constraints exceed zero. This process is depicted in Fig. 1(a), where the new sample is distributed according to the constrained Gaussian distribution (for proof, see (murray2010elliptical; gessner2020integrals)).
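The ESS update described above can be written compactly. The following is a minimal NumPy sketch, assuming a zero-mean Gaussian prior and a list of half-space constraints; the `(a, b)` constraint-list interface and the function name are our own simplification, not the paper's implementation:

```python
import numpy as np

def ess_step(x, constraints, cov, rng):
    """One elliptical slice sampling step for N(0, cov) truncated to the
    domain {z : a^T z + b >= 0 for all (a, b) in constraints}.
    `x` must already lie inside the domain (hypothetical interface)."""
    nu = rng.multivariate_normal(np.zeros(len(x)), cov)  # auxiliary point
    ellipse = lambda t: x * np.cos(t) + nu * np.sin(t)
    # Closed-form crossings: a^T ellipse(t) + b = r*cos(t - phi) + b = 0
    angles = []
    for a, b in constraints:
        p, q = a @ x, a @ nu
        r = np.hypot(p, q)
        if r > 0 and r >= abs(b):
            phi = np.arctan2(q, p)
            d = np.arccos(np.clip(-b / r, -1.0, 1.0))
            angles += [(phi + d) % (2 * np.pi), (phi - d) % (2 * np.pi)]
    if not angles:  # no crossings: since x is inside, the whole ellipse is
        return ellipse(rng.uniform(0, 2 * np.pi))
    angles = sorted(angles)
    inside = lambda t: all(a @ ellipse(t) + b >= 0 for a, b in constraints)
    # Keep the arcs between consecutive crossings that lie inside the domain
    arcs = [(lo, hi)
            for lo, hi in zip(angles, angles[1:] + [angles[0] + 2 * np.pi])
            if inside((lo + hi) / 2)]
    lengths = np.array([hi - lo for lo, hi in arcs])
    lo, hi = arcs[rng.choice(len(arcs), p=lengths / lengths.sum())]
    return ellipse(rng.uniform(lo, hi))
```

Because every candidate angle is drawn from an arc that lies inside the domain, the step is rejection-free: each call returns a valid sample.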

The Holmes-Diaconis-Ross (HDR) algorithm (diaconis1995three) is one method from a family of algorithms called multi-level splitting, for estimating the probability of sampling from a constrained region under any distribution. Direct Monte-Carlo methods may be inefficient because most candidate samples may be rejected (low probability mass in the region, or a high-dimensional domain). With HDR, the probability $P(\mathcal{L})$ of sampling from the domain $\mathcal{L}$ is estimated using the product of conditional probabilities:

$P(\mathcal{L}) = \prod_{t=1}^{T} P(\mathcal{L}_t \mid \mathcal{L}_{t-1}),$

where $\mathcal{L}_1 \supseteq \mathcal{L}_2 \supseteq \dots \supseteq \mathcal{L}_T = \mathcal{L}$. Each domain $\mathcal{L}_t$ (also referred to as a nesting) is shifted (enlarged, see Fig. 1(b)) by a scalar such that each conditional probability is approximately $0.5$, and $\mathcal{L}_T$ is exactly the target domain $\mathcal{L}$. Fig. 1(b) depicts this process: the target domain (blue grid) is expanded until it contains enough samples, i.e., until the probability of sampling from the shifted region is about 0.5. Then, the algorithm iteratively shrinks the shifted region, keeping the proportion of samples within the new domain relative to the previous domain at about half. At each nesting, $n$ samples are drawn with the ESS algorithm, and the conditional probability is the ratio of the number of samples falling in the next domain (counted with the indicator function, which equals one if the argument is true and zero otherwise) to the total number of samples, $n$, drawn at that nesting.
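For intuition, here is a self-contained toy version of the HDR recursion. To stay short, it replaces ESS with a preconditioned Crank-Nicolson move (which leaves a standard normal invariant) as the within-nesting sampler — our own simplification, not the paper's method:

```python
import numpy as np

def hdr_estimate(score, dim, n=800, moves=30, rng=None):
    """Toy Holmes-Diaconis-Ross / multi-level splitting estimate of
    P(score(x) >= 0) under a standard normal prior. A preconditioned
    Crank-Nicolson move restricted to the nesting is a simple stand-in
    for the rejection-free ESS used in the paper."""
    rng = rng or np.random.default_rng()
    pts = rng.standard_normal((n, dim))
    sc = np.array([score(p) for p in pts])
    log_p, rho = 0.0, 0.8
    while True:
        gamma = min(0.0, np.median(sc))     # shift so ~half the samples pass
        log_p += np.log(np.mean(sc >= gamma))
        if gamma >= 0.0:                    # reached the target domain
            return float(np.exp(log_p))
        seeds = pts[sc >= gamma]            # samples inside the new nesting
        pts = seeds[rng.integers(len(seeds), size=n)].copy()
        for _ in range(moves):              # MCMC within {score >= gamma}
            prop = rho * pts + np.sqrt(1 - rho**2) * rng.standard_normal(pts.shape)
            ok = np.array([score(p) for p in prop]) >= gamma
            pts[ok] = prop[ok]
        sc = np.array([score(p) for p in pts])
```

For example, `hdr_estimate(lambda x: x[0] - 2.0, dim=1)` estimates the normal tail probability P(x ≥ 2) ≈ 0.0228 using only a handful of ~0.5-probability nestings.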

We note that closed-form solutions to the integral of a Gaussian under a linearly constrained domain do not exist in the general case, i.e., when the domain is not axis-aligned with the Gaussian. Numerical methods, such as quadrature algorithms, do not scale well with the dimensionality of the problem (orive2020cubature).

(a) Elliptical slice sampling
(b) HDR
Figure 1. (a) Sampling a new point from the constrained domain (blue grid) given an initial point $x$, an auxiliary point $\nu$ and the angle $\theta$. The active intersection is the bold blue section of the ellipse, within which all points lie in the linearly constrained domain. (b) The original constrained domain (blue grid) and the shifted domain (yellow) after producing samples from it using the ESS procedure.

2.2. Signal Temporal Logic

Signal temporal logic (STL) (maler2004monitoring) enables specifying a broad range of temporal constraints over real-valued signals. Here we consider STL for discrete-time signals. Continuous-time logics and their properties can be found in, e.g., (fainekos2009robustness; donze2010robust).

Consider a discrete-time real-valued signal $x = x_0 x_1 x_2 \dots$, where $x_t \in \mathbb{R}^n$. A predicate over $x$ is denoted by $\pi = (f(x_t) \geq 0)$, where $f : \mathbb{R}^n \to \mathbb{R}$. A predicate is called linear if $f$ is an affine function of $x_t$. Given a set of predicates, STL formulae are defined recursively using the following grammar:

$\varphi := \pi \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \, U_{[a,b]} \, \varphi_2 \mid F_{[a,b]} \varphi \mid G_{[a,b]} \varphi$

where $\varphi, \varphi_1, \varphi_2$ are STL formulae, $\neg$ is the negation operator, $\wedge, \vee$ are conjunction and disjunction, respectively, and $U_{[a,b]}$, $F_{[a,b]}$, $G_{[a,b]}$ are bounded temporal operators, over the time interval $[a,b]$, that stand for “until”, “eventually”, and “always”, respectively.

Example 2.1 ().

Consider a scalar signal $x$. The specification

$\varphi$ encodes “for all times in the interval $[0,9]$, the value of $x$ stays above a given threshold, or, for some time in that interval, the value of $x$ stays below a second threshold for some number of consecutive time steps”.

Definition 2.2 ().

The STL score, or quantitative semantics (donze2013), is recursively defined as:

  • $\rho(x, \pi, t) = f(x_t)$ for a predicate $\pi = (f(x_t) \geq 0)$,

  • $\rho(x, \neg\varphi, t) = -\rho(x, \varphi, t)$,

  • $\rho(x, \varphi_1 \wedge \varphi_2, t) = \min(\rho(x, \varphi_1, t), \rho(x, \varphi_2, t))$,

  • $\rho(x, \varphi_1 \vee \varphi_2, t) = \max(\rho(x, \varphi_1, t), \rho(x, \varphi_2, t))$,

  • $\rho(x, F_{[a,b]}\varphi, t) = \max_{t' \in [t+a, t+b]} \rho(x, \varphi, t')$,

  • $\rho(x, G_{[a,b]}\varphi, t) = \min_{t' \in [t+a, t+b]} \rho(x, \varphi, t')$,

  • $\rho(x, \varphi_1 U_{[a,b]} \varphi_2, t) = \max_{t' \in [t+a, t+b]} \min\big(\rho(x, \varphi_2, t'), \min_{t'' \in [t, t']} \rho(x, \varphi_1, t'')\big)$.

The STL score provides a metric for distance to satisfaction for a signal and an STL formula. A positive STL score indicates satisfaction and a negative one stands for violation. To remove ambiguity, we consider an STL score of zero as satisfying. We define the STL score of a signal $x$ and specification $\varphi$ as $\rho(x, \varphi) := \rho(x, \varphi, 0)$.
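The recursion of Definition 2.2 is short enough to implement directly. Below is a minimal sketch for discrete signals; the nested-tuple formula syntax and the until-interval convention are our own illustrative choices (conventions for the inner interval of "until" vary across papers):

```python
def stl_score(phi, x, t=0):
    """STL robustness (Definition 2.2) on a discrete signal `x` (a list
    of states). Formulas are nested tuples -- a small hypothetical syntax:
    ("pred", f) | ("not", p) | ("and", p, q) | ("or", p, q)
    | ("G", a, b, p) | ("F", a, b, p) | ("U", a, b, p, q)."""
    op = phi[0]
    if op == "pred":
        return phi[1](x[t])              # satisfaction iff f(x_t) >= 0
    if op == "not":
        return -stl_score(phi[1], x, t)
    if op == "and":
        return min(stl_score(p, x, t) for p in phi[1:])
    if op == "or":
        return max(stl_score(p, x, t) for p in phi[1:])
    if op in ("G", "F"):
        a, b, p = phi[1:]
        vals = [stl_score(p, x, k) for k in range(t + a, t + b + 1)]
        return min(vals) if op == "G" else max(vals)
    if op == "U":                        # one common discrete convention
        a, b, p, q = phi[1:]
        return max(min([stl_score(q, x, k)] +
                       [stl_score(p, x, j) for j in range(t, k)])
                   for k in range(t + a, t + b + 1))
    raise ValueError(op)
```

For instance, on the signal `[0, 1, 2, 3, 4]`, the formula `("G", 0, 4, ("pred", lambda v: v + 1))` scores `1`: the worst-case margin of the predicate over the window.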

Example 2.3 ().

In Example 2.1, consider a concrete signal that evidently satisfies $\varphi$. Applying Definition 2.2, we obtain a positive score; we say the signal satisfies $\varphi$ and its STL score is 3.

Definition 2.4 ().

The $c$-level set of an STL formula $\varphi$ is defined as the set of signals whose STL score is at least $c$.

Definition 2.5 ().

The horizon of the STL formula $\varphi$, denoted by $\text{hrz}(\varphi)$, is the minimum length of truncated signal required to evaluate $\varphi$, and it is recursively given by:

  • $\text{hrz}(\pi) = 0$,

  • $\text{hrz}(\neg\varphi) = \text{hrz}(\varphi)$,

  • $\text{hrz}(\varphi_1 \wedge \varphi_2) = \text{hrz}(\varphi_1 \vee \varphi_2) = \max(\text{hrz}(\varphi_1), \text{hrz}(\varphi_2))$,

  • $\text{hrz}(F_{[a,b]}\varphi) = \text{hrz}(G_{[a,b]}\varphi) = b + \text{hrz}(\varphi)$,

  • $\text{hrz}(\varphi_1 U_{[a,b]} \varphi_2) = b + \max(\text{hrz}(\varphi_1), \text{hrz}(\varphi_2))$.
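The horizon recursion translates directly into code. A minimal sketch, using an illustrative nested-tuple formula syntax of our own (not the paper's):

```python
def horizon(phi):
    """Recursive STL horizon (Definition 2.5). Formulas are nested tuples:
    ("pred", f) | ("not", p) | ("and", p, q) | ("or", p, q)
    | ("G", a, b, p) | ("F", a, b, p) | ("U", a, b, p, q)."""
    op = phi[0]
    if op == "pred":
        return 0                         # a predicate needs only x_t
    if op == "not":
        return horizon(phi[1])
    if op in ("and", "or"):
        return max(horizon(p) for p in phi[1:])
    if op in ("G", "F"):                 # window end b plus the sub-horizon
        return phi[2] + horizon(phi[3])
    if op == "U":
        return phi[2] + max(horizon(phi[3]), horizon(phi[4]))
    raise ValueError(op)
```

For example, `horizon(("F", 2, 5, ("G", 0, 3, ("pred", None))))` is `8`: the outer window can end at step 5, and the inner "always" then looks 3 steps further.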

Example 2.6 ().

In Example 2.1, the horizon of the formula follows from its outermost time interval $[0,9]$ and its nested operator; the values of the signal beyond the horizon do not affect the score.

Given $\varphi$, we only need the truncated signal $x_0 \dots x_{\text{hrz}(\varphi)}$ to check whether it satisfies $\varphi$. Thus, we can stack the truncated signal into a single vector. With a slight abuse of notation, we extend the STL score and level-set definitions to the corresponding function and set over this stacked space.
It is straightforward to show that, given a formula $\varphi$ with linear predicates, we have the following properties:

  • The STL score function over the stacked signal is piecewise affine and Lipschitz continuous.

  • For a given level, the corresponding level set is a union of polyhedra.

3. Problem setup

3.1. System

We consider discrete-time linear(izable), possibly time-varying, systems (LTV) with dynamic and measurement equations of the form:

$x_{t+1} = A_t x_t + B_t u_t + w_t, \qquad y_t = C_t x_t + v_t,$

where $x_t$ is the state at time $t$, $u_t$ is the control input and $y_t$ is the measurement vector. The process noise $w_t$ and the measurement noise $v_t$ are described in more detail in Section 3.2. The matrices $A_t$, $B_t$ and $C_t$ capture the relationships between the states, inputs and measurements and are assumed known. The discrete system has a time step size of $dt$. The system can have a linear state observer and a closed-loop feedback controller for tracking a reference trajectory, or it can use the measurement directly for feedback.
3.2. Noise model

In this paper we focus on Gaussian errors, $w_t \sim \mathcal{N}(0, Q)$ and $v_t \sim \mathcal{N}(0, R)$. We assume that all the noises are independent and identically distributed (iid). Note that one can augment the system’s states if the noise is colored.

In Sec. 4.6 we discuss a more intricate noise model, where the noise is modelled as a Gaussian mixture, $w_t \sim \sum_i p_i \, \mathcal{N}(\mu_i, Q_i)$, where $p_i$ is the probability of choosing Gaussian distribution $i$ (and similarly for $v_t$). While a single Gaussian is a special case of the mixture, we separate the discussion because we can provide stricter guarantees for the single-Gaussian case.
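Sampling from such a mixture decomposes into sampling the component index and then the chosen Gaussian. A short sketch (function name and interface are illustrative):

```python
import numpy as np

def sample_mixture(weights, means, covs, n, rng):
    """Draw n iid samples from the Gaussian mixture sum_i w_i N(mu_i, Q_i)
    by first sampling the component index with probabilities w_i,
    then sampling from the selected Gaussian."""
    idx = rng.choice(len(weights), size=n, p=weights)
    return np.array([rng.multivariate_normal(means[k], covs[k]) for k in idx])
```

The same two-stage structure is what lets a static choice model, a Markov-chain choice model, or a black-box chooser plug into the first step without changing the second.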

3.3. Specification

We consider STL specifications where the underlying signal is the system trajectory $x = x_0 x_1 \dots$. We limit ourselves to linear predicates over the system’s state, of the form $a^T x_t + b \geq 0$. The assumption of linearity is essential, since later in the paper we use a closed-form solution for the intersections of an ellipse and a hyperplane (Section 4.4). While it is possible to consider specific forms of nonlinear predicates and still retain closed-form solutions, we leave that to future work.

Example 3.1 ().

A common STL formula is reach-avoid. Consider a continuous-time system with a time horizon $T$ and an STL specification of this structure:

$\varphi = \pi^{init} \wedge G_{[0,T]} \, \neg\pi^{unsafe} \wedge \bigwedge_i F_{[0,T]} \, \pi^{goal_i},$

where $\pi^{init}$ defines a polyhedron in the state-space whose hyperplanes are each represented by a linear predicate. Similar notation is used to define the sets of polyhedra for the unsafe sets (e.g. obstacles) and the goals. In words, the system satisfies the specification when it starts in the set defined by $\pi^{init}$, avoids all obstacles for the entire trajectory and reaches each goal at some time within the horizon. Given $T$ and $dt$, a trajectory of the system contains $T/dt$ discrete time steps. To be able to correctly verify the specification, we require the trajectory length to be at least $\text{hrz}(\varphi)$.
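As a concrete (Boolean-level) reading of the reach-avoid structure, the following sketch checks a discrete trajectory against one axis-aligned obstacle box and one goal box; the boxes and function name are illustrative assumptions, not the paper's polyhedral machinery:

```python
import numpy as np

def reach_avoid_ok(traj, obstacle, goal):
    """Boolean check of a reach-avoid structure on a discrete trajectory
    (a list of state vectors): avoid an axis-aligned obstacle box at
    every step and reach an axis-aligned goal box at some step.
    Boxes are (lo, hi) corner pairs -- hypothetical simplification."""
    in_box = lambda x, box: np.all(x >= box[0]) and np.all(x <= box[1])
    avoided = all(not in_box(x, obstacle) for x in traj)
    reached = any(in_box(x, goal) for x in traj)
    return avoided and reached
```

The quantitative STL score refines this Boolean check into a signed margin, which is what the sampler in Section 4 operates on.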

An advantage of our approach is the ability to efficiently address the combinatorial aspect of all possible trajectory classes that may satisfy or violate the specification without double counting them.

3.4. Problem formulation

Problem 1 ().

Given a linear system in the form of (6)-(9), a Gaussian (mixture) noise model and an STL formula $\varphi$ with linear predicates over the state, find the probability that $\varphi$ is satisfied.

4. Approach

We illustrate our approach through an example of a holonomic robot navigating in a workspace (Fig. 2).

Example 4.1 ().

A holonomic robot’s state is its planar position, with linear discrete-time dynamics driven by the control input and process noise. We use the discrete Linear Quadratic Regulator (LQR) algorithm (kwakernaak1972linear) with the desired reference to compute the optimal controller. We assume full-state measurement. The noise is normally distributed; the measurement noise is omitted for brevity. We consider an arbitrary STL specification for the rest of the section unless otherwise specified.

4.1. Integral over Trajectory Space

The first step is to derive the relationship between the process and measurement noises and the trajectories, by representing the trajectories as Gaussians in a higher-dimensional space. In our robot example, this means concatenating the states of every time step in the horizon into a single vector $X$. Without loss of generality, we consider only the measurement noise to simplify the following expressions. Based on Eq. (6)-(9), we can express the full trajectory in vector form with (14) by iteratively substituting the states and controls. The resulting expression is a sum of three terms, whose matrix coefficients transfer the initial state, the extended reference inputs and the stacked measurement noises, respectively, to the full trajectory. All components in (14) are deterministic except for the stochastic stacked noise vector. We can then extract the multivariate Gaussian in the trajectory space, which is the distribution over which we integrate:

$X \sim \mathcal{N}(\mu_X, \Sigma_X)$.

In a similar manner, it is possible to derive the Gaussian of a trajectory with both process and measurement noises (and, possibly, a stochastic initial state).
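To make the stacking concrete, the following simplified sketch propagates a time-invariant system with an open-loop input sequence and process noise only (a stand-in for the closed-loop derivation of (14); names and the open-loop assumption are ours):

```python
import numpy as np

def trajectory_gaussian(A, B, u_seq, x0, Q):
    """Mean and covariance of the stacked trajectory X = [x_1, ..., x_N]
    for x_{t+1} = A x_t + B u_t + w_t, w_t ~ N(0, Q), deterministic x0
    and a fixed input sequence -- a simplified stand-in for Eq. (14)."""
    nx, N = A.shape[0], len(u_seq)
    mean = np.zeros(N * nx)
    G = np.zeros((N * nx, N * nx))   # maps stacked noises [w_0..w_{N-1}] to X
    x = x0.copy()
    for t, u in enumerate(u_seq):
        x = A @ x + B @ u            # noise-free mean propagation
        mean[t * nx:(t + 1) * nx] = x
        # x_{t+1} accumulates A^{t-j} w_j for every earlier noise w_j
        for j in range(t + 1):
            G[t * nx:(t + 1) * nx, j * nx:(j + 1) * nx] = \
                np.linalg.matrix_power(A, t - j)
    Sigma = G @ np.kron(np.eye(N), Q) @ G.T
    return mean, Sigma
```

The off-diagonal blocks of `Sigma` are what couple the time steps; integrating this joint Gaussian over trajectory-level constraints is what avoids double counting.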

Other work, e.g. chance constraints, typically implemented and over-approximated with Boole’s inequality, deals with constraints at the state level. We work with the full trajectory. This difference is one of the reasons we do not double-count events. When computing the probability of failures, the trajectory Gaussian is integrated with respect to trajectory-level constraints.

The evaluation of the probability in Problem 1 is equivalent to computing the integral

$P(\varphi) = \int_{S_\varphi} p(X) \, dX,$

where $p(X)$ is the probability density function of the trajectories, in this case the Gaussian from (16), and $S_\varphi$ is the set of trajectories that satisfy the specification $\varphi$.

4.2. Monte-Carlo Sampling

We propose a guided Monte-Carlo approach for verifying a dynamical system with fixed controls (they can be time-varying but not state-dependent) and Gaussian noise sources over a fixed horizon with respect to STL specification. We represent the full trajectories as Gaussian, and use the HDR and ESS algorithms to integrate the probability density function under the domains that satisfy the STL formula.

The advantages of the approach are threefold: 1. Efficient (rejection-free and parameter-free) sampling of trajectories that satisfy an STL specification, and computation of the probability of satisfaction without over-approximations or double-counting (such as arise from applying Boole’s inequality to each separate state in the trajectory). 2. Efficiently finding low-probability events that would otherwise be intractable to find with naive Monte-Carlo simulations. 3. It enables longer horizons and more random variables without suffering from the dimension-explosion problem (the ill-sampling of the posterior distribution).

The first point is achieved by sampling, with ESS, trajectories that are within the set of trajectories that satisfy the specification. The second point is achieved using the HDR algorithm, as we can construct the required number of nestings to evaluate the probability. In fact, once all nestings are set up, the sampling time of the rejection-free ESS algorithm is not influenced by the probability mass. Regarding the third point, the variance of the error of the quantity we wish to estimate with a Monte-Carlo simulation decreases with the number of simulations. However, we cannot accurately estimate the quantity from the sampled simulations when we ill-sample the posterior distribution. We do not know the true variance a priori and, in fact, the variance itself may increase rapidly as the number of variables increases. Intuitively, there are more combinations of noise errors that may cause the robot to violate the specification, and it is harder to sample combinations that are “useful” for correctly estimating the probability. Our approach, on the other hand, samples rejection-free from the constrained posterior distribution to the requisite level of accuracy.

Figure 2. Example of a robot in 2-D. The initial, unsafe and goal sets are depicted in the figure. Successful and failing trajectories over a finite-step horizon are shown with an added intermediate point. We define failing trajectories that hit an obstacle but reach the goal on time as type $a$, and failing trajectories that do not reach the goal on time as type $b$. In Section 4.5 we make use of these failure types for reach-avoid specifications.


4.3. STL-Score-Guided Elliptical Slice Sampling

Here we describe how we draw sample points from the trajectory Gaussian that lie inside the set of trajectories satisfying the specification, i.e., trajectories with a non-negative STL score. As mentioned earlier, the naive way is to draw samples and reject those that fall outside this set. However, if the probability mass inside the set is too small, the procedure will be inefficient, as most of the samples will be rejected.

We use ESS as described in Section 2. The explicit representation of the integration domain in (17) as a union of polyhedra requires an enumeration of all the possible convex sets. The number of such sets can grow exponentially in the size of the formula (see, e.g., (sadraddini2016feasibility)). We avoid explicit enumeration of the polyhedra while computing the integral in (17). The key insight is that we only need the STL score function (nivckovic2020rtamt).

Theorem 4.2 ().

Given an STL formula $\varphi$ with a set of linear predicates $\{a_k^T x + b_k \geq 0\}$, where $K$ is the total number of predicates. Given an existing sample trajectory $X$ and a free sample trajectory $\nu$ (not necessarily satisfying), construct the ellipse $X(\theta) = X \cos\theta + \nu \sin\theta$. Then construct the sorted list of angles in $[0, 2\pi)$ at which some predicate’s affine function vanishes on the ellipse:

$\Theta = \{\theta \in [0, 2\pi) \mid \exists k : a_k^T X(\theta) + b_k = 0\}.$

Then, for any two consecutive elements of $\Theta$ (cyclic), one of the following statements holds: all points on the arc between them satisfy $\varphi$, or all of them violate it.

Proof. In order for the STL score to vanish at $X(\theta)$, the value inside the inequality of at least one of the predicates must equal zero — that predicate becomes the maximizer/minimizer in the STL score function. Note that the list accounts for both signs of each predicate, as negation might appear in the formula. Therefore, the set $\Theta$ contains all the roots of the score along the ellipse — but it can contain spurious elements. Since the score is Lipschitz continuous, it is sign-stable between two consecutive roots. ∎

Theorem 4.2 paves the way to computing the portions of the ellipse that fall into the satisfying set by only computing the roots of the robustness function on the ellipse. Furthermore, (18) provides all the candidates at the cost of solving intersections of the ellipse with hyperplanes, for which closed-form solutions exist (gessner2020integrals). Then, using Theorem 4.2, we can sample one point within each consecutive pair of angles to determine whether the corresponding arc lies in the satisfying set. See Fig. 3 for an illustration of the adjusted ESS procedure. Thus, the complexity of obtaining the intersections is linear in the number of predicates, and we avoid any exponential blow-up due to an explicit combinatorial representation of the satisfying set.
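The arc-classification idea of Theorem 4.2 can be sketched as follows; the `score` callback stands for any STL robustness evaluator, and the `(a, b)` predicate list is our own simplified interface:

```python
import numpy as np

def active_arcs(x, nu, predicates, score):
    """Sketch of Theorem 4.2: sign changes of the STL score along the
    ESS ellipse x*cos(t) + nu*sin(t) can only occur where a predicate
    hyperplane a^T z + b = 0 is crossed, so one score evaluation per arc
    between consecutive candidate angles suffices (hypothetical API)."""
    ellipse = lambda t: x * np.cos(t) + nu * np.sin(t)
    angles = []
    for a, b in predicates:
        p, q = a @ x, a @ nu          # a^T ellipse(t) = p cos t + q sin t
        r = np.hypot(p, q)
        if r > 0 and r >= abs(b):     # the hyperplane intersects the ellipse
            phi = np.arctan2(q, p)
            d = np.arccos(np.clip(-b / r, -1.0, 1.0))
            angles += [(phi + d) % (2 * np.pi), (phi - d) % (2 * np.pi)]
    angles = sorted(set(angles)) or [0.0]
    arcs = []
    for lo, hi in zip(angles, angles[1:] + [angles[0] + 2 * np.pi]):
        if score(ellipse((lo + hi) / 2)) >= 0:   # sign-stable on the arc
            arcs.append((lo, hi))
    return arcs
```

Sampling uniformly from the returned arcs then yields the rejection-free ESS step of Section 2, now guided by the STL score instead of a single convex domain.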

Figure 3. Hyperplanes, representing predicates in the STL formula, projected onto the auxiliary ellipse. Some may not intersect it, and some may intersect it without changing the sign of the score, in which case the arc remains in the domain. Dashed lines represent hyperplanes of predicates that are outside the time bounds of the specification.


4.4. Holmes-Diaconis-Ross for STL

Now that we have a method to draw samples from , we use it for our HDR-based Monte-Carlo method.

4.4.1. Nesting partitioning

To perform HDR where the probability density function is low, we need to account for the multi-level splitting described in the preliminaries. This means that when sampling from a nesting — a larger domain than the one we would like to evaluate — we shift the cutoff level of the STL score to a new value $\gamma$ (usually negative, allowing more trajectories that violate $\varphi$, chosen so that about half the trajectories have robustness greater than $\gamma$) instead of $0$. On the other hand, we also need to shift the linear predicates to obtain the new intersections on the ellipse. For a general specification, we do not know whether to shift a given predicate by a positive or a negative $\gamma$, due to the structure of the sub-formulas: a predicate that appears under a negation must be relaxed in the opposite direction from one that does not. Instead of analyzing each component of the specification, we shift each predicate by both $+\gamma$ and $-\gamma$, as shown in (18).

4.4.2. Error Analysis

Monte-Carlo methods by nature give different results every time they are executed, so it is necessary to have an estimate of the variance of the computed probability. For each HDR nesting, we draw $n$ samples and, as discussed in Section 2, we aim for the conditional probability to be about $0.5$. In principle, ESS is an MCMC method, so the samples are by definition dependent, and the central limit theorem (CLT) does not directly apply. To mitigate this limitation, we keep only every $m$-th sample from the ESS chain, making the dependency between the kept samples practically non-existent. This thinning is sometimes grouped with the “burn-in” phase, and its purpose here is to weaken the dependency between samples. In practice, our experience shows (see Fig. 6) that the dependency is weak, due to this process and to sampling the ellipse angle independently from a uniform distribution, and the following error analysis applies.

Using the CLT, the variance of the estimated conditional probability at nesting $t$ is $\sigma_t^2 = p_t(1 - p_t)/n$, where $p_t$ is the conditional probability and $n$ is the number of points sampled at that nesting. Therefore, for $T$ nestings, treating each estimate as an approximately independent Gaussian, a first-order approximation of the variance of the product of the conditionals is:

$\mathrm{Var}(P) \approx P^2 \sum_{t=1}^{T} \frac{1 - p_t}{p_t \, n}.$

Substituting our nominal parameters ($p_t \approx 0.5$), we obtain the approximation:

$\mathrm{Var}(P) \approx P^2 \, \frac{T}{n}.$
In practice, we compute the variance with the actual sampled values and not with (22). The point is that when the number of points in a nesting is large enough, the variance is proportional to $1/n$; the standard deviation scales as $P\sqrt{T/n}$, so increasing the number of points per nesting directly tightens the estimate.
4.4.3. Adaptive nesting samples

Using (21), we can compute the expected minimal number of samples for a desired variance. Fig. 4(a) shows how increasing the number of samples decreases the uncertainty. Increasing the number of nestings decreases the uncertainty as well. However, the number of nestings is not a design parameter but rather depends on the problem at hand; it is approximately $\log_2(1/P)$. We can automatically select the number of samples to use per nesting by utilizing (21) and Fig. 4(a). The benefit of doing so is shorter computation times; in problems with a high number of nestings (i.e. low probability), we can reach the desired confidence interval with fewer samples.
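The nominal-parameter relation above is a one-liner, shown here for concreteness (the delta-method form is our reading of the error analysis):

```python
import numpy as np

def hdr_rel_std(T, n, p=0.5):
    """First-order (delta-method) relative standard deviation of the HDR
    estimate P = prod_t p_t, with T nestings of n weakly dependent
    samples each: Var(P)/P^2 ~ sum_t (1 - p_t) / (p_t * n)."""
    return float(np.sqrt(T * (1.0 - p) / (p * n)))
```

For instance, 10 nestings of 1000 samples each give a relative standard deviation of about 10%, and quadrupling `n` halves it.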

Figure 4. (a) The effect of the number of nestings and the number of sampled points on the confidence interval of the HDR algorithm for a nominal probability. (b) Extending (gessner2020integrals) to a union of polytopes.

4.5. Special case: Reach-avoid specifications

In Eq. (11) we showed a common specification for robotic applications. In some cases, it may be more efficient to explicitly construct the union of polyhedra that represents the predicates, rather than use the STL score-based ESS and HDR (due to the number of hyperplanes and STL score computations discussed in Section 4.3). Consider finding the probability of violating the specification of Example 3.1. We define two possible ways of violating it: (i) type $a$, “hit an obstacle and reach the goals on time”; (ii) type $b$, “do not reach the goals on time”. The total violation probability is the sum of the probabilities of these two disjoint events.

We compute the integral of the Gaussian over the constrained domain by modifying the procedure in (gessner2020integrals); we consider a union of polytopes instead of a single one. A point is valid as long as there is at least one polytope in the union all of whose constraints it satisfies. There may be numerous intersections of the constructed ellipse with the faces of the different polytopes. We extend (gessner2020integrals) (Fig. 3(b)) to find the active segments of the ellipse with respect to the union of polytopes and sample points from the active domain.
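A minimal illustration of the union-membership logic might look as follows; `in_union` and `active_angles` are hypothetical names, and the coarse grid scan over angles merely stands in for the analytic intersection of the ellipse with the polytope faces computed in the paper:

```python
import numpy as np

def in_union(x, polytopes):
    """x is in the union if it satisfies all constraints A x <= b of at
    least one polytope; each polytope is given as a pair (A, b)."""
    return any(np.all(A @ x <= b) for A, b in polytopes)

def active_angles(nu0, nu1, polytopes, resolution=2000):
    """Coarse sketch: scan the ellipse x(theta) = nu0*cos(theta) + nu1*sin(theta)
    and return the angles that land inside the union. The actual algorithm
    finds the exact face intersections; this scan only illustrates the
    membership test for a union rather than a single polytope."""
    thetas = np.linspace(0.0, 2.0 * np.pi, resolution, endpoint=False)
    mask = np.array([in_union(nu0 * np.cos(t) + nu1 * np.sin(t), polytopes)
                     for t in thetas])
    return thetas[mask]
```

The only change relative to the single-polytope case is the disjunction over polytopes inside `in_union`; the elliptical slice sampler itself is unchanged.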

We consider discrete-time systems; however, there is a gap when verifying a system that is in fact continuous. There could be situations where all the discrete states in the horizon satisfy the specification, yet the system might collide with an obstacle in between the states. See Fig. 2 for an example near the obstacle.

There are several ways to address this. First, we can either increase the sampling rate or bloat the obstacles. The former increases the dimension of the problem, while the latter constrains it even more. In both cases, the computational load grows, either by increasing the dimension of the trajectory space or by reducing the probability mass under the domains (requiring more nestings).

Second, if we assume constant velocity between two consecutive states (valid over short time spans), we can introduce more constraints without increasing the problem dimensions. These intermediate points add robustness by covering more of the polytopes' area without as many computations as increasing the number of states. They can also be introduced only in parts of the trajectory that are susceptible to failure. Fig. 2 shows an example of a point added midway between and . To add constraints for an intermediate centroid point between two trajectory points:

This additive technique can also be applied to compute the if the STL library can compute the score of signals with dense time steps.
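Assuming the midpoint constraint takes the averaged form implied by constant velocity, a state-space half-space a^T x <= b applied to (x_t + x_{t+1})/2 lifts to a trajectory-space row with weight 1/2 on each of the two consecutive states. A sketch (the function name and the flat trajectory layout are assumptions):

```python
import numpy as np

def midpoint_rows(a, b, t, horizon, dim):
    """Sketch: lift a state-space half-space a^T x <= b, applied to the
    midpoint (x_t + x_{t+1})/2, into the trajectory space of `horizon`
    stacked states of dimension `dim`. Assumes constant velocity between
    steps, so the midpoint is the average of consecutive states."""
    row = np.zeros(horizon * dim)
    row[t * dim:(t + 1) * dim] = 0.5 * a
    row[(t + 1) * dim:(t + 2) * dim] = 0.5 * a
    return row, b
```

Each intermediate point thus adds hyperplanes to the trajectory-space polytope without adding random variables to the sampling problem.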

4.6. Gaussian Mixture models

Our approach may also be used to verify systems where the underlying noise model is better described by a Gaussian mixture. For example, a common model for range finders is the Beam model (thrun2002probabilistic) (Ch. 6). It incorporates several modes of sensing errors that depend on the physical interaction of the sensor with its environment and may be approximated by a Gaussian mixture. Another example is a camera that tracks cars but, due to occlusions or errors in its neural network, starts to track clutter or a different car in its field of view. The noise distribution at time might depend on the distribution at , making come from a Markov chain or a black-box choice model:


where is the number of distributions and can also vary between time steps. In this case, computing the tree of possible combinations of the Gaussian distributions and noises throughout the trajectory, together with their weights, is intractable. However, we suggest a procedure for computing the total probability. The first step samples just the mixing factors from their underlying distributions, for the entire horizon. Once the mixing factors are fixed, the problem reduces to Problem 1. We compute the probability and variance, and repeat this procedure for iterations. Then, we compute the unbiased mean estimate and the variance over the iterations. This method still relies on Monte-Carlo simulation for the probability and variance estimation; however, only the trajectory modes are sampled, reducing the problem’s input dimension considerably. A full Monte-Carlo simulation would sample both the modes and the actual noise values and is thus susceptible to the dimension explosion problem.
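The two-stage procedure can be sketched as an outer loop over sampled mode sequences; both callbacks below are hypothetical stand-ins for the mode sampler and for the single-Gaussian (Problem 1) solver described above:

```python
import numpy as np

def mixture_probability(sample_modes, prob_given_modes, iters=100, seed=0):
    """Sketch of the two-stage estimator: sample a mode sequence for the
    whole horizon, solve the resulting single-Gaussian problem, and
    average over iterations. `sample_modes` and `prob_given_modes` are
    hypothetical callbacks for the mode sampler and the HDR/ESS solver."""
    rng = np.random.default_rng(seed)
    estimates = np.array([prob_given_modes(sample_modes(rng))
                          for _ in range(iters)])
    # unbiased mean estimate and the variance of that mean
    return estimates.mean(), estimates.var(ddof=1) / iters
```

Only the discrete modes are drawn by Monte Carlo; the continuous noise values are handled analytically inside each fixed-mode subproblem, which is what keeps the sampled dimension small.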

5. Case studies

5.1. Robot navigation - reach-avoid

We demonstrate verification for Example 4.1. The noises are: , , , and . Fig. 4(a) presents the static obstacles, goal, and the reference trajectory. To increase the fidelity of the simulation, we add intermediate points as discussed in Section 4.5. In the first scenario, the horizon sec with sec. The STL specification:


where if the intersection of all predicates of over the state is greater than zero. In this case, is non-convex, so we use Delaunay triangulation (lee1980two) to decompose it into two convex polytopes . We have not considered any restrictions on the state other than the pose.
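For illustration, a Delaunay-based decomposition into convex H-polytopes might be sketched as below (assuming 2-D vertices; for a general non-convex region, triangles falling outside the region would additionally need to be filtered out):

```python
import numpy as np
from scipy.spatial import Delaunay

def decompose(vertices):
    """Sketch: triangulate a vertex set and return each simplex as a
    convex polytope in half-space form A x <= b, one face per edge.
    Assumes 2-D vertices (each simplex is a triangle)."""
    tri = Delaunay(vertices)
    polytopes = []
    for simplex in tri.simplices:
        pts = vertices[simplex]
        centroid = pts.mean(axis=0)
        A, b = [], []
        for i in range(3):
            p, q = pts[i], pts[(i + 1) % 3]
            normal = np.array([q[1] - p[1], p[0] - q[0]])
            if normal @ (centroid - p) > 0:
                normal = -normal  # orient so the centroid satisfies A x <= b
            A.append(normal)
            b.append(normal @ p)
        polytopes.append((np.array(A), np.array(b)))
    return polytopes
```

Each resulting (A, b) pair can then be used directly in the union-of-polytopes sampler of Section 4.5.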

5.1.1. Setup

We compute . We construct a disjunction between the trajectory-space -polytopes of failing trajectories of types and , as discussed in Section 4.5.

5.1.2. Results

Fig. 4(a) shows a sample of the trajectories of both failure modes. Computing the probability with our proposed algorithm yields and took sec. Verifying the same system with Monte-Carlo simulations yields and took sec. To estimate the probability of failure with Monte-Carlo, we use simulations, where . The experiments were run on an Intel(R) Core(TM) i7-6700 with samples in each nesting of the HDR algorithm (to obtain a comparable standard deviation). Monte-Carlo yields faster results because the probability of failure is relatively high.
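For reference, the Monte-Carlo estimate and the standard deviation used in this comparison follow the usual binomial formula; a minimal sketch:

```python
import math

def mc_failure_estimate(failures, n_sims):
    """Standard Monte-Carlo estimator: p_hat = k/N with standard
    deviation sqrt(p_hat * (1 - p_hat) / N). Used to pick N for a
    standard deviation comparable to the HDR estimate."""
    p = failures / n_sims
    return p, math.sqrt(p * (1 - p) / n_sims)
```

When the failure probability is small, the required N grows roughly like 1/p for a fixed relative standard deviation, which is why plain Monte Carlo becomes intractable in the low-probability scenarios below.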

Using a horizon of steps yields a simulation of random variables, which can be considered a relatively small parameter space. Fig. 4(b) depicts the second scenario with similar settings (the initial conditions were changed to induce failures, because the LQR controller performs differently with the new time step) for a horizon of sec and sec. This time, the problem dimension is . With our algorithm, and the computation took sec with . Monte-Carlo simulation takes sec and yields in simulations. Fig. 6 shows the distribution of running times of our algorithm and MC with different seeds; the results match for , and depends on .

(a) Scenario : sec.
(b) Scenario : sec.
Figure 5. X-Y projection of a robot maneuvering in a field with obstacles (red) and goal (green). In orange, the failing trajectories of type (). In gray, the failing trajectories of type (). In black, the reference trajectory.
Figure 6. Running our approach (HDR) and MC (mean of runs) 100 times each, and comparing the mean estimation and its standard deviation.

The left panel of Fig. 6 shows a histogram of 100 runs of Monte-Carlo and 100 runs of our method. On average, both techniques yield the same result in the robot navigation problem (for the 200-dimensional trajectories); the difference in the means is 0.1%. The right panel shows a histogram of the variances of each run for each method; the spread of our method in this case is slightly larger.

5.2. Car passing an intersection - reach-avoid

5.2.1. Setup

We consider a controlled car (Ego-vehicle, ) driving along the -axis and an uncontrolled car (Other vehicle, ) driving along the -axis, as shown in Fig. 6(a). Each car’s dynamics follow the holonomic robot in (12). Since is uncontrolled, we model its dynamics with a process noise . measures the distance to using a Lidar and is subject to errors (thrun2002probabilistic). The error modes are a Gaussian about the true value and a maximum-range error . The transitions between the Gaussians are expressed with a Markov chain , , , indicating that the probability of a bad measurement after a previous bad measurement is higher (occlusion, multipath). needs to cross the intersection safely and uses the control law: . The time horizon is sec with sec. The cars’ lengths are m and widths m. and are chosen such that when the distance between the cars , the control yields and stops until crosses the intersection.
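The Markov-chain error model can be sketched by sampling a mode sequence over the horizon; the transition probabilities below are placeholders, not the values used in the experiment:

```python
import numpy as np

def sample_mode_sequence(P, T, rng, init=0):
    """Sketch: draw a sequence of sensor-error modes from a Markov chain
    with row-stochastic transition matrix P. Mode 0 stands for the
    nominal Gaussian, mode 1 for the max-range error; P[1,1] > P[0,1]
    encodes that a bad measurement is more likely after a bad one
    (occlusion, multipath)."""
    modes = [init]
    for _ in range(T - 1):
        modes.append(rng.choice(len(P), p=P[modes[-1]]))
    return np.array(modes)
```

A sampled mode sequence fixes which Gaussian applies at each time step, which is exactly the input required by the mixture procedure of Section 4.6.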

We derive a new state variable with the initial conditions , as shown in Fig. 6(a). In this new state variable, it is easy to show that the unsafe set (the “obstacle”) is a square centered at the origin of whose sides all have length . The goal is for to cross to the other side, . The polytope sets are shown in Fig. 6(b). The STL specification:


Here, the measurement equation is non-linear. We find the trajectory’s Gaussian distribution by linearizing the distance measurement in (26), evaluated at the expected value of the state (27).
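The linearization of the range measurement about the expected state reduces to the standard Jacobian of the Euclidean norm; a minimal sketch, assuming the measurement is the distance ||z|| of the relative state:

```python
import numpy as np

def linearized_range(z, z0):
    """Sketch: first-order expansion of the range measurement h(z) = ||z||
    about the expected relative state z0, giving
    h(z) ~ ||z0|| + J (z - z0) with Jacobian J = z0 / ||z0||.
    Used to propagate the Gaussian through the non-linear Lidar model."""
    r0 = np.linalg.norm(z0)
    J = z0 / r0
    return r0 + J @ (z - z0), J
```

Since the linearized measurement is affine in the state, the trajectory distribution remains Gaussian and the HDR/ESS machinery applies unchanged.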

Figure 7. (a) Controlled car (yellow, left) entering an intersection with an uncontrolled car (black, bottom). (b) A satisfying trajectory (), and a violating trajectory () in the Z coordinate frame. (c) A single noise sequence example of failing trajectories that intersect with the unsafe set.

5.2.2. Results

Fig. 6(c) shows the failing trajectory samples of one iteration of sampled noise mixing factors (Sec. 4.6). The total probability of failure is with confidence level . We compare with Simple Random Sampling (SRS) with for the Monte-Carlo simulations of the full non-linear system. The SRS probability estimate took sec to run, while our method took sec (again, the times are due to the high probability of failure).

5.3. Data-based simulation - reach-avoid

In this example we show how the technique can be used in a scenario where the noises or system dynamics are not known. For this demonstration we run the Jackal (jackal) robot in the Gazebo simulator (gazebo:2020) with the Robot Operating System (ROS) (ROS). We use the built-in controllers and estimation algorithms provided with and for the robot, and send it a goal command. With probability , a maximum-range noise (thrun2002probabilistic) is injected into any of the Lidar’s ray measurements. Fig. 8 shows the environment the robot navigates through. Our purpose is to verify that the system can reach the goal safely. However, a single run takes approximately sec, making Monte-Carlo simulation intractable when the failure rate is low.

In our approach, we first run simulations and fit a multivariate Gaussian (e.g. with robustcov in Matlab) to the set of (ground-truth) trajectories. must be at least twice the number of variables (states × time steps). We now have and can directly compute the probability of colliding with a tree, missing the goal, or violating any other temporal constraint.
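A sketch of the fitting step, using a plain sample covariance in place of the robust fit (robustcov) mentioned above; the function name and the array layout are assumptions:

```python
import numpy as np

def fit_trajectory_gaussian(trajectories):
    """Sketch: flatten N simulated trajectories (N x T x d) into N
    vectors of length T*d and fit a multivariate Gaussian by the sample
    mean and covariance. The paper uses a robust estimator; plain
    np.cov is shown for simplicity. N should be at least twice T*d for
    the covariance to be well-conditioned."""
    X = trajectories.reshape(trajectories.shape[0], -1)
    return X.mean(axis=0), np.cov(X, rowvar=False)
```

The fitted mean and covariance then play the role of the linearized system's trajectory distribution, so verification proceeds exactly as in the model-based examples.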

In Fig. 9 we show the verification results for the system with

where is the region defined by the box . The time step in this scenario is sec, with a horizon of sec. We see that of the trajectories fail to reach the goal on time (or overshoot it). We stopped the computation of the probability of hitting a tree (“”) at nestings, which means that a crash is less likely than about .

Figure 8. Jackal navigating in the environment (left). Lidar measurements and robot’s current mapping (right).
Figure 9. Data-based verification - comes from simulations. of the trajectories are not at the goal at sec.

The figure shows a 2-D view of the workspace of the Gazebo example with the results. Only type b failures occur; they show how the goal set is not reached within the specified time frame.

5.4. Robot navigation - Full STL

In this example we consider the robot in Example 4.1 and a complex STL specification:


where . Since our technique computes the probability of satisfying the STL formula, to find the probability of failure we use in our computations. Following a single run of our method, we are able to find violating trajectories (Fig. 9(a)) even though . To find just one event with this probability, we would need to run approximately simulations with MC. We ran trials with our technique, and trials of MC with simulations each. The results are shown in Fig. 9(b). of the MC runs end with no failing examples, and about a third end with one failing example. The mean running time of MC is sec and of our method sec. The minimal probability computed by our method is .

Due to the use of the STL score for full STL, one cannot identify the specific cause of a failure. Furthermore, the sampled trajectories that fail the specification do not necessarily represent the proportions of the different failure causes. This is due to two reasons. First, we cannot guarantee how many trajectories are present in the final nesting, as explained in Section 2. Second, because this is an MCMC approach, the samples might be biased towards a certain region given an initial sample within that region. However, we show that if we sample new trajectories, we obtain the correct proportions on average, given enough samples. For example, in the previous scenario, the ratio between the probability mass of hitting the obstacle at and that of not making the second goal on time is approximately 4:1. We ran our approach times. After each iteration finished, we sampled five sets of samples that violate and computed how many of those hit the obstacle and how many violated the goal requirement. The results are shown in Fig. 11; although at specific instances we can even get more goal violations than obstacle violations, on average we sample the correct proportions. This means that with our method, we are able to “jump” from one active domain to another even if they are clearly distinct (different predicates and different time bounds). Of course, regions may be overshadowed by regions with considerably higher probability mass; if one wants to check those too, one might need to decompose the specification to capture them specifically.

Figure 10. (a) Failing trajectories of the STL formula . (b) Statistics for a 100 trials for HDR, and for 100 MC, each with simulations.

5.5. Adversarial Scenarios - Full STL

In (yang2021synthesis), the authors developed a synthesis-guided approach to find adversarial examples that falsify a dynamical system with respect to reach-avoid type specifications. An example from that paper (Example 2) finds a series of measurement noises that causes the system (29) and its regulator to enter the unsafe zone.


In (yang2021synthesis), the noises and are uniform and bounded. Here we approximate them with an appropriate Ga