## 1 Introduction

Real-time decision-making for robotic tasks is a challenging problem that usually requires predicting possible outcomes in an incremental framework, based on partial signals that become available over time. The accuracy of such incremental predictions determines how effectively undesired behaviors can be mitigated. This is crucial in fields such as autonomous driving, e.g., when a self-driving car does not have a clear view of some objects in the environment and has to take actions based on the performance of nearby cars, observed over time. For example, consider the urban-driving scenario in Fig. 1. It consists of an autonomous vehicle, referred to as ego, a pedestrian, and another car with a human driver. The vehicles are headed toward an intersection with no traffic light, and there is an unmarked crosswalk at the end of the ramp-shaped road, before the intersection. There are two possible types (labels) for the behavior of the other car: aggressive or safe. An aggressive driver keeps the same acceleration while moving in its lane, regardless of whether the pedestrian crosses the street. A safe driver brakes slightly upon reaching the pothole, and if the pedestrian crosses the street, the driver applies full braking to stop behind the intersection; otherwise, it keeps moving with the same acceleration.

Predicting the behavior label of the human driver is valuable, especially when ego does not have a clear line-of-sight to the pedestrian (because of the uphill shape of the road or the presence of the other vehicle). In this situation, if the human driver is predicted to be a safe driver, one possible strategy for ego is to follow its actions. In this paper, we focus on a time-incremental learning framework, where a dataset of the other car's behaviors (e.g., position and velocity) and their labels (safe or aggressive), recorded offline, is provided to ego. The goal is to develop a method allowing ego to predict whether the human driver is aggressive or safe, by observing its behavior on the fly.

For the same scenario, suppose the time horizon of the scenario is $T$. A dataset $\mathcal{D} = \{(s_i, l_i)\}_{i=1}^{N}$, consisting of signals $s_i$ with time length $T$ and their corresponding labels $l_i$, is provided to ego. The signals are the recorded behavior of the other car over time, and the labels represent the behavior type of the human driver. This is a two-class classification problem, and the goal is to develop a method allowing ego to classify the real-time behavior of the other car, represented by the prefix signal $s[0{:}t]$. In this paper, to provide interpretable specifications, we use Signal Temporal Logic (STL) [maler2004monitoring] to specify the classifiers.

We propose a novel framework to solve this classification problem, which consists of three main parts, each taking the given dataset as input (see Fig. 2 (a)). The first part, named "Signal Analysis", analyzes the signals and applies a heuristic method to find a finite number of timepoints along the horizon of the signals, referred to as decision times. The decision times are the timepoints that are potentially informative for separating the signals into two classes, and they are considered as candidate timepoints for generating classifiers. The next part, "Classifier Learning", is responsible for generating classifiers at each decision time, using the given dataset. Here we use decision trees [breiman1984classification], [ripley2007pattern] to learn the binary classifiers from data, and describe them by STL specifications. Each STL specification captures the temporal properties of the dataset over time. Lastly, by using Neural Networks (NNs) in the "Classifier Evaluation" part, we assign time-variant weights to the STL formulas, based on their classification performance for the prefix signals defined over the given dataset. The weighted conjunction of the STL formulas, interpreted as a weighted STL (wSTL) formula [mehdipour2020specifying], is the output of "Classifier Evaluation", and the predictor is constructed based on it. The effectiveness and prediction power of our framework are evaluated on the urban-driving scenario above and on a naval surveillance case study.

Note that a trivial solution to this problem is to apply an offline supervised learning method to the given dataset, learn an STL formula, and construct a monitor [maler2004monitoring] to predict the label of prefix signals. The main limitation of this approach is that the output of a monitor is inconclusive for prefix signals with time lengths shorter than the horizon of the learned STL formula, which makes it impractical for the time-incremental learning framework. Our approach addresses this limitation by providing predictions at all timepoints along the signal horizon.

### 1.1 Related Work

Understanding the performance and detecting the desired behaviors of robotic systems from their execution traces have attracted a lot of attention recently. Most approaches are based on Machine Learning (ML) from data. Existing ML techniques usually construct a high-dimensional surface in the feature space, but they do not provide any insight into the meaning of these surfaces, which is particularly important for decision-making and prediction. This limitation has recently been addressed by integrating formal methods and expressing classifiers as temporal logic formulas [clarke1986automatic] in [bartocci2014data], [mohammadinejad2020interpretable], [bombara2016decision], [xu2019information], [hoxha2018mining], [jha2019telex], [ketenci2019synthesis], [jin2015mining], [neider2018learning], [aasi2021classification]. Initial attempts at learning temporal logic properties from data focused on finding the optimal parameters for fixed formula structures [bakhirkin2018efficient], [bartocci2015system], [asarin2011parametric], [hoxha2018mining], [jin2015mining]. To learn both the formula structure and its parameters, supervised classification methods have been proposed, such as [kong2016temporal], based on a lattice search technique, and later [bombara2016decision], based on a decision tree algorithm. There are other approaches in the literature for learning temporal logic formulae, e.g., clustering [vazquez2017logical], [bombara2017signal], uncertainty-aware inference [baharisangari2021uncertainty], swarm STL inference [yan2019swarm], mining environment assumptions [mohammadinejad2020mining], and active learning [linard2020active].

In [aasi2021classification], we proposed a boosted learning method, called *Boosted Concise Decision Trees (BCDTs)*, for learning STL specifications, to improve on the classification performance and interpretability of existing works. The BCDT combines a set of shallow decision trees, i.e., *Concise Decision Trees (CDTs)*, which are empowered by a set of techniques to generate simpler formulae. The final output of the method is provided as a wSTL formula, which consists of the STL formulae from each CDT and their corresponding weights. Recent work [yan2021neural] proposed an NN-based method to learn the weights of a wSTL formula from a given dataset. The neurons of the network correspond to subformulas, and their outputs correspond to the quantitative satisfaction of the formula.

## 2 Preliminaries

Let $\mathbb{R}$, $\mathbb{R}_{\geq 0}$, $\mathbb{Z}_{\geq 0}$ represent the sets of real, non-negative real, and non-negative integer numbers, respectively. A discrete-time signal $s$ with time horizon $T \in \mathbb{Z}_{\geq 0}$ is a function $s : \{0, 1, \dots, T\} \to \mathbb{R}^n$ that maps each discrete time point $t$ to an $n$-dimensional vector $s(t)$ of real values. We denote the components of signal $s$ as $s^j$, $j = 1, \dots, n$, and the prefix of $s$ up to time point $t$ by $s[0{:}t]$. Let $l(s) \in \{+1, -1\}$ denote the label of a signal $s$, where $+1$ and $-1$ are the labels for the positive and negative classes, respectively. We consider a labeled dataset with $N$ data samples as $\mathcal{D} = \{(s_i, l_i)\}_{i=1}^{N}$, where $s_i$ is the $i$-th signal and $l_i$ is its corresponding label. A prefix dataset with horizon $t$ is the dataset consisting of the prefix signals with horizon $t$ and their labels, denoted by $\mathcal{D}[0{:}t]$. The cardinality of a set is denoted by $|\cdot|$, an empty set by $\emptyset$, and a vector of zeros by $\mathbf{0}$.

Signal Temporal Logic (STL): STL was introduced in [maler2004monitoring] to handle real-valued, dense-time signals. Informally, the STL specifications used in this paper are made of predicates defined over signal components in the form of $s^j \sim c$, where $c \in \mathbb{R}$ is a threshold and $\sim$ is a relational operator such as $<$ or $>$, which are connected using Boolean operators, such as $\neg$ (negation), $\wedge$ (conjunction), $\vee$ (disjunction), and temporal operators, such as $\mathbf{G}$ (always) and $\mathbf{F}$ (eventually). The semantics of STL are defined over signals. For example, formula $\mathbf{G}_{[2,5]}(x < 4)$ means that, for all times 2, 3, 4, 5, component $x$ of a signal is less than 4, while formula $\mathbf{F}_{[3,10]}(x > 4)$ expresses that at some time between 3 and 10, $x$ becomes larger than 4.

STL has both qualitative (Boolean) and quantitative semantics. We denote Boolean satisfaction of a formula $\phi$ at time $t$ by $(s, t) \models \phi$. For the quantitative semantics, the robustness degree [donze2010robust], [fainekos2009robustness], denoted by $\rho(\phi, s, t)$, captures the degree of satisfaction of a formula $\phi$ at time $t$ by a signal $s$. For simplicity of notation, we use $s \models \phi$ and $\rho(\phi, s)$ as short for $(s, 0) \models \phi$ and $\rho(\phi, s, 0)$, respectively. Boolean satisfaction corresponds to non-negative robustness ($\rho \geq 0$), while violation corresponds to negative robustness ($\rho < 0$). The minimum amount of time required to decide the satisfaction of an STL formula $\phi$ is called its horizon, denoted by $hrz(\phi)$. For example, the horizons of the two example formulas $\mathbf{G}_{[2,5]}(x < 4)$ and $\mathbf{F}_{[3,10]}(x > 4)$ given above are 5 and 10, respectively.
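The robustness of the two example formulas above can be sketched in a few lines (the helper names are ours, not standard notation; a full monitoring tool implements the general STL semantics, whereas these functions hard-code one predicate shape each):

```python
# Robustness sketches for the example formulas, on a signal component given as a
# list x of samples, so x[t] is the value at discrete time t.
def rho_pred(x, t, c):
    """Robustness of the predicate (x > c) at time t: x(t) - c."""
    return x[t] - c

def rho_always_lt(x, a, b, c):
    """Robustness of G_[a,b] (x < c): the worst-case margin c - x(t) over the window."""
    return min(c - x[t] for t in range(a, b + 1))

def rho_eventually_gt(x, a, b, c):
    """Robustness of F_[a,b] (x > c): the best-case margin x(t) - c over the window."""
    return max(x[t] - c for t in range(a, b + 1))
```

Non-negative robustness corresponds to satisfaction, so `rho_always_lt(x, 2, 5, 4) >= 0` is exactly the Boolean semantics of the first example formula.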

Parametric STL (PSTL): PSTL [asarin2011parametric] is an extension of STL in which the threshold $c$ in the predicates and the endpoints $a$ and $b$ of the time intervals $[a, b]$ in the temporal operators are parameters. A specific valuation of a PSTL formula $\psi$ under the parameter values $\theta$ is denoted by $\psi_\theta$, where $\theta \in \Theta$ and $\Theta$ is the set of all possible valuations of the parameters.

Weighted STL (wSTL): wSTL [mehdipour2020specifying] is another extension of STL with the same qualitative semantics as STL, but its robustness degree is modulated by weights associated with the Boolean and temporal operators. In this paper, we focus on a fragment of wSTL with weights on conjunctions only. For example, for the wSTL formula $\bigwedge_{i=1}^{K} w_i \phi_i$ with non-negative weights $w_i$, the robustness is computed from the weighted robustnesses of the conjuncts $\phi_i$, where the weights $w_i$ capture the importance of each formula in computing the robustness.

## 3 Problem Formulation and Approach

We first provide some definitions for the problem formulation.

Predictor: The predictor $P_\phi$ is a function that maps a prefix signal $s[0{:}t]$ to a label $\hat{l} \in \{+1, -1\}$, which represents the satisfaction prediction of $s$ at time $t$ with respect to the STL formula $\phi$: $\hat{l} = +1$ if $s[0{:}t]$ is predicted to satisfy $\phi$; otherwise, $\hat{l} = -1$.

Timepoint MisClassification Rate (TMCR): We define the TMCR at time step $t$, with respect to predictor $P_\phi$, as below, where $t \in \{0, 1, \dots, T\}$:

$$\text{TMCR}(t) = \frac{1}{N} \, \big| \{ i : P_\phi(s_i[0{:}t]) \neq l_i \} \big|$$

Incremental MisClassification Rate (IMCR): The IMCR is defined as the vector of TMCR values over the time horizon of the signals, denoted by:

$$\text{IMCR} = [\text{TMCR}(0), \text{TMCR}(1), \dots, \text{TMCR}(T)]$$
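Given any predictor, the two misclassification rates above can be computed directly. A minimal sketch, under our own assumptions: each signal is a list of per-timepoint samples, a prefix is its first $t+1$ entries, and `pred` maps a prefix to a label in $\{+1, -1\}$:

```python
def tmcr(pred, signals, labels, t):
    """TMCR(t): fraction of signals whose predicted label at time t is wrong."""
    wrong = sum(1 for s, l in zip(signals, labels) if pred(s[: t + 1]) != l)
    return wrong / len(signals)

def imcr(pred, signals, labels, horizon):
    """IMCR: the vector of TMCR values over the whole signal horizon."""
    return [tmcr(pred, signals, labels, t) for t in range(horizon + 1)]
```

Plotting the IMCR vector over $t$ gives exactly the curves compared in Fig. 3.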

###### Problem 1

Given a labeled dataset $\mathcal{D}$, find an STL formula $\phi$ and its corresponding predictor $P_\phi$, such that the IMCR is minimized.

Our approach to Pb. 1 is illustrated in Fig. 2 (a). Our framework consists of three main components: (i) "Signal Analysis", described in Sec. 4, applies a heuristic signal analysis method to the given dataset to find a finite number of potentially informative timepoints, referred to as decision times and denoted by the set $\mathcal{T}$; (ii) "Classifier Learning", described in Sec. 5, generates an STL formula for each decision time in $\mathcal{T}$ by using a decision tree method. The set of generated STL formulas is denoted by $\Phi$; (iii) "Classifier Evaluation", explained in Sec. 6, assigns a time-dependent weight distribution to the STL formulas in $\Phi$, which captures the prediction performance of each STL formula over time. The output of this part is the resulting wSTL formula, and a predictor is constructed based on it.

We refer to our approach as a framework because it is a general pipeline consisting of a class of algorithms, where each algorithm can be replaced by other methods from the literature; e.g., the heuristic method used in the "Signal Analysis" part can be replaced by other signal analysis techniques to find the decision times.

## 4 Signal Analysis

Given the dataset $\mathcal{D}$, the main role of the "Signal Analysis" part is to analyze the signals of the dataset over time and find a finite number of timepoints, referred to as decision times. The decision times, denoted by $\mathcal{T}$, are the timepoints that are potentially informative for classifying the prefix dataset into two classes, and they are considered as candidate timepoints for generating classifiers. Here we propose a heuristic method, based on a distance function among the signals, to find the decision times.

A commonly used distance function for multi-dimensional time-series data is the Euclidean distance. For two signals $s_i$ and $s_j$, the Euclidean distance [bombara2017signal] is defined as $d(s_i, s_j) = \sqrt{\sum_{t=0}^{T} \| s_i(t) - s_j(t) \|_2^2}$. Inspired by this, we define the timepoint Euclidean distance as $d_t(s_i, s_j) = \| s_i(t) - s_j(t) \|_2$. In the dataset $\mathcal{D}$, the sets of signals with positive and negative labels are indexed by $I_p$ and $I_n$, respectively, such that $I_p \cup I_n = \{1, \dots, N\}$. We define the positive-negative distance in the dataset as:

$$D(t) = \frac{1}{|I_p| \, |I_n|} \sum_{i \in I_p} \sum_{j \in I_n} d_t(s_i, s_j) \qquad (1)$$

We use (1) as a metric to evaluate the separation between the positive- and negative-labeled signals of the dataset over time. As a simple, easy-to-compute heuristic, we take the decision times to be the timepoints at which the first- or second-order discrete derivatives of the function in (1) are zero. Intuitively, these are the times at which the positive and negative signals are locally at the furthest or closest distance from each other (first-order derivatives), or the switching points in the evolution of the positive-negative distance over time (second-order derivatives). We also include the horizon of the signals as a decision time, to evaluate the whole traces of the signals. The set of decision times is denoted by $\mathcal{T}$, with $T \in \mathcal{T}$. In Fig. 2 (b), the evolution of the positive-negative distance is depicted over time for the naval surveillance case study, described in Sec. 7.2. Note that a trivial solution to Pb. 1 is to generate classifiers at every timepoint along the horizon of the signals, but this is highly inefficient and computationally expensive. We compare the efficiency and prediction accuracy of this trivial solution and our heuristic method in Sec. 7.
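The heuristic above can be sketched as follows. The implementation choices here are ours, not the paper's: signals are arrays with one column per timepoint, the positive-negative distance is averaged over signal pairs, and a "derivative zero" is detected as a sign change of the discrete difference, since exact zeros are rare on sampled data:

```python
import numpy as np

def pos_neg_distance(pos, neg):
    """D(t): mean Euclidean distance at each time t between positive and negative
    signals, where each signal is an (n x (T+1)) array."""
    T1 = pos[0].shape[1]
    D = np.zeros(T1)
    for t in range(T1):
        D[t] = np.mean([np.linalg.norm(p[:, t] - q[:, t]) for p in pos for q in neg])
    return D

def decision_times(D):
    """Timepoints where the first or second discrete derivative of D changes sign,
    plus the signal horizon itself."""
    times = {len(D) - 1}                        # always include the horizon
    for d in (np.diff(D), np.diff(D, n=2)):
        change = np.where(np.diff(np.sign(d)) != 0)[0]
        times.update(int(i) + 1 for i in change)
    return sorted(times)
```

On a distance curve with a single peak, this returns the peak (a first-derivative zero), the curvature switches around it, and the horizon.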

## 5 Classifier Learning

The next component, "Classifier Learning", takes as input the set of decision times $\mathcal{T}$ from the "Signal Analysis" part, in addition to the dataset $\mathcal{D}$. "Classifier Learning" is responsible for generating classifiers at each decision time $\tau \in \mathcal{T}$ on the prefix dataset $\mathcal{D}[0{:}\tau]$. To provide interpretable specifications for the classifiers, and inspired by [aasi2021classification, bombara2016decision], we use the decision tree method in Alg. 1 to construct the classifiers. For each decision time $\tau \in \mathcal{T}$, Alg. 1 is used to grow a decision tree, which is translated to a corresponding STL formula $\phi_\tau$. The structure of the method and its details are presented in [bombara2016decision], and here we explain it briefly.

The algorithm has three meta-parameters: 1) the PSTL primitives, where we use first-order primitives of the form $\mathbf{F}_{[a,b]}(s^j \sim c)$ or $\mathbf{G}_{[a,b]}(s^j \sim c)$; 2) the impurity measure, where we use the extended misclassification gain impurity measure from [bombara2016decision]; and 3) the stopping conditions, where we stop the growth of the trees when they reach a given depth. Alg. 1 is recursive and takes as input (1) the prefix dataset, (2) the path formula to reach the current node, and (3) the current depth level.

At the beginning of the algorithm, the stopping conditions are checked (line 4), and if they are satisfied, a single leaf, assigned the label that maximizes the classification quality under the impurity measure, is returned (lines 5-6). If the stopping conditions are not satisfied, an optimal STL formula among all possible valuations of the first-order primitives is found (line 7) and assigned to a new non-terminal node in the tree (line 8). Next, the prefix dataset is partitioned according to the optimal formula into the satisfying and violating prefix datasets (line 9), and the construction of the tree continues on the left and right subtrees of the current node (lines 10-11). Note that the main difference between Alg. 1 and the method in [bombara2016decision] is that the decision tree constructed by Alg. 1 is based on the prefix dataset for each decision time $\tau \in \mathcal{T}$. Each decision tree is translated to a corresponding STL formula, using the translation method in [bombara2016decision]. The output of "Classifier Learning" is the set of STL formulas $\Phi = \{\phi_\tau : \tau \in \mathcal{T}\}$.
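A much-simplified sketch of this tree construction is given below. Our simplifications, not the paper's method: primitives are restricted to $\mathbf{F}$/$\mathbf{G}$ over the full prefix window with thresholds drawn from the data (the paper optimizes full parameter valuations with PSO), and the impurity is a plain misclassification count rather than the extended misclassification gain; all names are ours:

```python
import itertools
import numpy as np

def rho(prim, s):
    """Robustness of a first-order primitive on a prefix signal s, an (n x (t+1)) array."""
    kind, j, c = prim                    # ('F' or 'G', component index, threshold)
    vals = s[j, :] - c
    return vals.max() if kind == 'F' else vals.min()

def best_primitive(data):
    """Exhaustively pick the primitive whose split misclassifies the fewest samples."""
    comps = range(data[0][0].shape[0])
    thresholds = sorted({float(v) for s, _ in data for v in s.flatten()})
    best, best_err = None, float('inf')
    for kind, j, c in itertools.product('FG', comps, thresholds):
        sat = [l for s, l in data if rho((kind, j, c), s) >= 0]
        vio = [l for s, l in data if rho((kind, j, c), s) < 0]
        err = min(sat.count(1), sat.count(-1)) + min(vio.count(1), vio.count(-1))
        if err < best_err:
            best, best_err = (kind, j, c), err
    return best

def build_tree(data, depth=0, max_depth=2):
    labels = [l for _, l in data]
    majority = 1 if sum(labels) >= 0 else -1
    if depth == max_depth or len(set(labels)) == 1:
        return majority                  # leaf node: stopping condition reached
    prim = best_primitive(data)
    sat = [(s, l) for s, l in data if rho(prim, s) >= 0]
    vio = [(s, l) for s, l in data if rho(prim, s) < 0]
    if not sat or not vio:               # degenerate split: stop here
        return majority
    return (prim, build_tree(sat, depth + 1, max_depth),
                  build_tree(vio, depth + 1, max_depth))

def classify(tree, s):
    """Follow satisfying splits left and violating splits right, down to a leaf label."""
    while isinstance(tree, tuple):
        prim, left, right = tree
        tree = left if rho(prim, s) >= 0 else right
    return tree
```

Reading the path from the root to each $+1$ leaf as a conjunction of primitives (negated on violating branches) gives the tree-to-STL translation described in [bombara2016decision].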

## 6 Classifier Evaluation

The final component of our framework, "Classifier Evaluation", takes as input the given dataset $\mathcal{D}$ and the set of generated STL formulas $\Phi$. "Classifier Evaluation" assigns a non-negative, time-variant weight distribution to the formulas, based on their classification performance over time. In Alg. 2, we present a method to find the weights of the formulas over time. With a slight abuse of notation, we denote the column vector of the weights at time $t$ by $w(t)$, which is a $|\Phi| \times 1$ array. In Alg. 2, we aim to find the $|\Phi| \times (T+1)$ dimensional matrix $W$ that collects the weight vectors over time.

First, we introduce some notation: at each time step $t$, the subset of formulas in $\Phi$ with horizon less than or equal to $t$ is denoted by $\Phi_t^{\leq}$, and the rest of the formulas, with larger horizon, by $\Phi_t^{>}$. Note that $\Phi_t^{\leq} \cup \Phi_t^{>} = \Phi$ and $\Phi_t^{\leq} \cap \Phi_t^{>} = \emptyset$. The main reason for this partitioning is that, at each time step $t$ and for a prefix signal $s[0{:}t]$, the formulas in $\Phi_t^{\leq}$ are able to predict a label for $s[0{:}t]$, based on their satisfaction or violation with respect to the prefix signal. However, the formulas in $\Phi_t^{>}$ may not be conclusive enough to predict a label, and they need more time instances of the prefix signal. Clearly, at $t = 0$ we have $\Phi_0^{\leq} = \emptyset$ and $\Phi_0^{>} = \Phi$, and at $t = T$ we have $\Phi_T^{\leq} = \Phi$ and $\Phi_T^{>} = \emptyset$.

Alg. 2 takes as input the dataset $\mathcal{D}$ and the set of STL formulas $\Phi$. The subsets $\Phi_t^{\leq}$ and $\Phi_t^{>}$ and the weight vector $w(t)$ are initialized by $\emptyset$, $\Phi$, and the zero vector $\mathbf{0}$, respectively (line 3). For each time step $t$ along the horizon of the signals (line 4), the subsets of formulas $\Phi_t^{\leq}$ and $\Phi_t^{>}$ are computed (line 5), by comparing the horizons of the formulas in $\Phi$ with the current time step $t$. If there is no update in the subset $\Phi_t^{\leq}$ compared to the previous time step (line 6), the weight vector from the previous time step is reused for the current time (line 7). If the set $\Phi_t^{\leq}$ is updated (line 8), first, we construct the prefix dataset $\mathcal{D}[0{:}t]$ (line 9). Then, the robustnesses of the prefix signals in $\mathcal{D}[0{:}t]$ with respect to the formulas in $\Phi_t^{\leq}$ are computed and stored in the robustness matrix $R$ (line 10). The dimensions of $R$ are $N \times |\Phi_t^{\leq}|$, where the $i$-th row contains the robustnesses of prefix signal $s_i[0{:}t]$ with respect to the formulas in $\Phi_t^{\leq}$. The robustness matrix $R$ and the labels of the signals are used to learn the weights of the formulas in $\Phi_t^{\leq}$ at time $t$ (line 11); the weights of the formulas in $\Phi_t^{>}$ are set to zero. The learning step constructs a Neural Network (NN) to learn the weights of the formulas in $\Phi_t^{\leq}$. Using NNs to learn the weights of wSTL formulas has been explored previously in [yan2021neural]. Inspired by [cuturi2017soft], the differentiable loss function of our designed NN measures the difference between the predicted label of each prefix signal and its actual label from the dataset $\mathcal{D}$. Finally, the weight vector is transformed to a column vector (line 12) and added to the weight matrix $W$.

The output of "Classifier Evaluation" is the weighted conjunction of the STL formulas in $\Phi$, interpreted as a wSTL formula. The final output of our framework is the corresponding predictor, which predicts the label of a prefix signal as:

$$\hat{l}(s[0{:}t]) = \begin{cases} +1 & \text{if } \sum_{\phi_k \in \Phi} w_k(t) \, \rho(\phi_k, s[0{:}t]) \geq 0, \\ -1 & \text{otherwise.} \end{cases} \qquad (2)$$

Note that our proposed predictor computes the robustness of the prefix signal as the weighted sum of the robustnesses of the STL formulas in $\Phi$, which is different from monitoring the wSTL formula and computing its robustness by the methods in [mehdipour2020specifying, yan2021neural]. The time-dependent nature of the weights adjusts the predictor based on the classification performance of the STL formulas and the time length of the prefix signals. In Sec. 7, we emphasize the importance of the weight distributions of the STL formulas by showing the performance of the predictor with and without them.
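Such a predictor can be sketched as follows, under our own assumptions: each formula is available as a robustness function over prefixes with a known horizon, and the weights are stored in a matrix with one column per timepoint. Formulas whose horizon exceeds the current time contribute nothing, since their weights are zero at that time:

```python
import numpy as np

def predict(prefix_t, rho_funcs, horizons, W, t):
    """Label of a prefix signal at time t: the sign of the weighted robustness sum.

    prefix_t: the prefix signal s[0:t]; rho_funcs[k]: robustness of formula k on a
    prefix; horizons[k]: horizon of formula k; W: (K x (T+1)) weight matrix."""
    total = sum(W[k, t] * rho_funcs[k](prefix_t)
                for k in range(len(rho_funcs)) if horizons[k] <= t)
    return 1 if total >= 0 else -1
```

As time advances, more formulas cross their horizon and receive non-zero weight, so the same call gives increasingly informed predictions on longer prefixes.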

Remark: Note that for a formula $\phi$ with $hrz(\phi) \leq t$, the robustness of a prefix signal $s[0{:}t]$ is constant for all $t \geq hrz(\phi)$. Therefore, whenever there is an update in $\Phi_t^{\leq}$ compared to the previous time step, the columns of $R$ corresponding to the previously included formulas remain unchanged, and the robustness computations only need to be done for the columns of the newly added formulas. This technique improves the efficiency and computational time of our method.
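The weight-learning step (line 11 of Alg. 2) can be sketched as follows. The paper trains a small NN with a differentiable label loss inspired by [cuturi2017soft]; as a stand-in, this sketch runs plain gradient descent on a logistic loss over the same weighted-sum predictor, keeping the weights non-negative by clipping (our simplification, not the paper's architecture):

```python
import numpy as np

def learn_weights(R, labels, steps=2000, lr=0.1):
    """R: (N x K) robustness matrix; labels in {+1, -1}. Returns weights w >= 0."""
    N, K = R.shape
    w = np.full(K, 1.0 / K)                  # start from a uniform distribution
    y = np.asarray(labels, dtype=float)
    for _ in range(steps):
        margin = y * (R @ w)                 # positive iff the sample is classified correctly
        # gradient of mean log(1 + exp(-margin)) with respect to w
        grad = -(y * (1.0 / (1.0 + np.exp(margin)))) @ R / N
        w = np.maximum(w - lr * grad, 0.0)   # project back onto w >= 0
    return w
```

Formulas whose robustness aligns with the labels are driven to large weights, while uninformative or misleading formulas are pushed to zero, which is the behavior the time-variant weight distribution relies on.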

## 7 Case Studies

We demonstrate the usefulness and classification performance of our approach in two case studies. The first is the urban-driving scenario from Fig. 1, implemented in the simulator CARLA [dosovitskiy2017carla]. The second is the naval surveillance scenario from [kong2016temporal]. We compare our framework with two baseline methods. In the first, referred to as all-times, instead of choosing a finite number of decision times, we generate classifiers at all timepoints along the horizon of the signals. We compare our framework with this baseline in terms of efficiency and classification performance, and show the significance of choosing a finite number of decision times. In the second baseline, referred to as uniform-weights, we consider a uniform distribution for the weights of the formulas at all decision times. The main purpose of comparing our framework with the uniform-weights baseline is to emphasize the importance of considering time-variant weight distributions in the predictor function.

We use Particle Swarm Optimization (PSO) [kennedy1995particle] to solve the optimization problems in Alg. 1. The parameters of the PSO and of the NN used in Sec. 6 are tuned empirically. In both scenarios, we evaluate our framework with a maximum depth of 2 for the decision trees and with 3-fold cross-validation. The execution times are measured based on the system's clock. All computations are done in Python 2, on an Ubuntu 18.04 system with an Intel Core i7 processor.

Remark: In Alg. 2, at the early time steps when none of the formulas in $\Phi$ has a horizon less than or equal to $t$, the set $\Phi_t^{\leq}$ is empty and the weights of all formulas in $\Phi$ are zero. At such time steps, for the evaluation of our framework, we assign a value of 0.5 to the TMCR. We make this convention because the predictor is inconclusive, and we force it to predict a label based on i.i.d. fair coin tosses.

### 7.1 Urban-driving scenario

Consider the urban-driving scenario from Sec. 1, depicted in Fig. 1. Ego and the other car are in different, adjacent lanes, moving in the same direction on an uphill road, applying constant throttles. The throttle of ego is smaller than that of the other car, and the positions of the cars are initialized such that the other car is always ahead of ego. There is a pothole in the middle of the uphill road. We implement this scenario in the simulator CARLA. In our implementation, the cars move uphill with no lateral movement between lanes, and the simulation ends whenever ego gets within a threshold distance of its goal point. We assume ego is able to estimate the relative position and velocity of the other car. The dataset of the scenario consists of 300 signals with 477 uniform time-samples per trace; 150 of the signals are for an aggressive driver and 150 for a safe driver. The signals are 4-dimensional, consisting of the relative position and velocity of the other car along two coordinate axes.

The "Signal Analysis" part of our framework finds 9 decision times for the dataset, and the output of the "Classifier Evaluation" part is the weighted conjunction of the 9 corresponding STL formulas. As an example, the STL formula learned in one of the folds for one of the decision times contains the subformula $\mathbf{F}_{[94,100]}(v^y \leq 1.12)$, which states that at some timepoint between 94 and 100, the relative velocity of the other car along the y-axis becomes less than or equal to 1.12. The IMCR comparison of our method with the baseline methods, under the same initialization of the parameters, is shown in Fig. 3 (a). Note that in this scenario, when the other car reaches the pothole, the behaviors of safe and aggressive drivers differ. In Fig. 3 (a), although the IMCR of our framework is close to that of the uniform-weights method before reaching the pothole, our approach shows better classification performance than the uniform-weights baseline afterward. The IMCR of the all-times baseline is generally better than that of our framework over the horizon of the signals, but its runtime is drastically larger. Moreover, the all-times baseline generates 477 formulas for this scenario and learns their corresponding weight distributions, which requires substantially more memory than our framework, which only has 9 formulas.

### 7.2 Naval scenario

The naval surveillance problem was proposed in [kong2016temporal], based on the scenarios from [kowalska2012maritime]. The goal of the scenario is to detect anomalous vessel behaviors from their trajectories. Normal trajectories belong to vessels that approach from the open sea and head directly toward the harbor. Anomalous trajectories belong to vessels that either veer toward the island and then head to the harbor, or approach other vessels in the passage between the peninsula and the island and then veer back to the open sea. The dataset of the scenario consists of 2000 signals, with 1000 normal and 1000 anomalous trajectories. The signals are represented as 2-dimensional trajectories with planar coordinates $(x, y)$, and they have 61 timepoints. The labels indicate the type of the vessel's behavior (normal or anomalous).

The "Signal Analysis" component of our framework finds 8 decision times (see Fig. 2 (b)), and the final wSTL formula is the weighted conjunction of the 8 corresponding STL formulas. The IMCR comparison of our framework with the two baseline methods, under the same parameter initialization, is shown in Fig. 3 (b). From Fig. 3 (b), it is clear that the IMCR of our framework is better than that of the uniform-weights method over the whole horizon of the signals. Although the all-times baseline obtains a better IMCR than our approach, its runtime and memory consumption are noticeably larger, as it generates 61 STL formulas and learns the corresponding weight distributions.

## 8 Conclusion

In this paper, we considered the problem of predicting the labels of prefix signals over time, given a dataset of labeled signals. Our proposed framework combines temporal logics and neural networks to construct a predictor for classifying the prefix signals. The effectiveness of our method was evaluated in an urban-driving and a naval surveillance scenario. In future work, we will explore advanced signal analysis techniques to find decision times. We will also explore other classification techniques as alternatives to the decision tree method, and evaluate their prediction performances.
