# Weighted Graph-Based Signal Temporal Logic Inference Using Neural Networks

Extracting spatial-temporal knowledge from data is useful in many applications. It is important that the obtained knowledge is human-interpretable and amenable to formal analysis. In this paper, we propose a method that trains neural networks to learn spatial-temporal properties in the form of weighted graph-based signal temporal logic (wGSTL) formulas. For learning wGSTL formulas, we introduce a flexible wGSTL formula structure in which the user's preference can be applied in the inferred wGSTL formulas. In the proposed framework, each neuron of the neural networks corresponds to a subformula in a flexible wGSTL formula structure. We initially train a neural network to learn the wGSTL operators and then train a second neural network to learn the parameters in a flexible wGSTL formula structure. We use a COVID-19 dataset and a rain prediction dataset to evaluate the performance of the proposed framework and algorithms. We compare the performance of the proposed framework with three baseline classification methods including K-nearest neighbors, decision trees, and artificial neural networks. The classification accuracy obtained by the proposed framework is comparable with the baseline classification methods.

## I Introduction

Learning spatial-temporal properties from data is useful in many applications, especially where the data is based on an underlying graph structure. It is preferable that the learned properties are interpretable for humans and amenable to formal analysis [Seshia2016]. Various logics have been introduced to express and analyze spatial-temporal properties [Xu2019][Haghighi2016][Djeumou2020], and temporal logics are one of the major groups. Temporal logics, which are categorized as formal languages [Hopcroft1979IntroductionTA], are both interpretable and amenable to formal analysis; thus, they are used to express the temporal and logical properties of systems in a human-interpretable way. In addition, graph-based logic (GL), which is used to express spatial properties, is understandable for humans and preserves the rigor of formal logics.

Besides interpretability and amenability to formal analysis, efficiency and expressiveness [Guhring2020] are also important when learning spatial-temporal properties from data [Liu2020]. One approach to increase the efficiency of the learning task is to integrate neural networks into the process [Seo2017][Wu2019][Ziat2017]. We can expand the capacity of the learning task to handle more complex spatial-temporal datasets by combining the distinct advantages of temporal logics, graph-based logics, and neural networks.

#### Contributions

In this paper, we propose a methodology that trains neural networks to learn spatial-temporal properties from data in the form of weighted graph-based signal temporal logic (w-GSTL) formulas. The contributions of this paper are as follows: (a) we introduce a flexible w-GSTL formula structure that allows the user’s preference to be applied in the inferred w-GSTL formula. In this structure, the w-GSTL operators are free to be inferred; (b) we propose a framework and algorithms to learn w-GSTL formulas from data using neural networks. In the proposed framework and algorithms, neurons of the neural networks represent the quantitative satisfaction of the w-GSTL operators and Boolean connectives. For a given flexible w-GSTL formula structure, we first construct and train a neural network to learn the w-GSTL operators through back-propagation; then, we construct and train another neural network to learn the parameters of the flexible w-GSTL formula structure through back-propagation. We evaluate the performance of the proposed framework and algorithms on two real-life examples: predicting COVID-19 lockdown measures in a geographical region in Italy, and predicting rainfall in a geographical region in Australia. The obtained results show that the proposed method achieves classification accuracy comparable with three baseline classification methods, namely K-nearest neighbors (KNN), decision trees (DT), and artificial neural networks (ANN), while the learned w-GSTL formulas improve interpretability.

### I-A Related Work

Recently, learning spatial-temporal properties from data has been employed in different applications such as traffic speed forecasting [Li2018] and swarm robotics [Djeumou2020]. Different methods have been adopted to carry out these learning tasks. Many researchers have developed different logics to learn spatial-temporal properties from data. For example, Xu et al. [Xu2019] introduce graph temporal logic (GTL). Many other researchers propose frameworks to conduct the learning tasks based on neural networks. For instance, Wu et al. [Wu2019] develop a method based on convolutional neural networks (CNNs) and name it Graph WaveNet. The approach proposed in this paper benefits from the advantages of both formal logics and neural networks: human-interpretability and efficiency.

Moreover, combining temporal logic and neural networks to carry out learning tasks has been gaining attention [Yan2021][Serafini2016][Riegel2020]. One way to realize this combination is through connecting the temporal operators and Boolean connectives to the activation functions in neural networks [Riegel2020]. Most of the standard algorithms used to conduct logic inference solve a non-convex optimization problem to find parameters in the formula, where the loss function of a neural network is not differentiable with respect to the parameters at every point. In [Yan2021], Yan et al. propose a loss function that addresses the differentiability issue.

## II Preliminaries

#### Graph

We denote a graph by $G = (V, E)$, where $V$ is a finite set of nodes and $E \subseteq V \times V$ is a finite set of edges. We also denote a set of (possibly infinite) node values by $\mathcal{X}$, where $\mathcal{X} \subseteq \mathbb{R}^d$.

#### Graph-based trajectory

We define a finite $d$-dimensional graph-based trajectory $g$ that assigns a node value $g(v, k)$ to each node $v$ at time-step $k \in \mathbb{T}$, where $\mathbb{T}$ is a finite discrete time domain. We also denote the value of the $j$-th dimension of the graph-based trajectory at time-step $k$ and node $v$ by $g^j(v, k)$. A time interval is denoted by $I = [k_1, k_2]$, and $k + I$ denotes the time interval $[k + k_1, k + k_2]$.

## III Weighted Graph-Based Signal Temporal Logic

In this section, we introduce weighted graph-based signal temporal logic (w-GSTL) as the weighted extension of graph-based logic (GL) which is modified from graph temporal logic in [Xu2019].

### III-A Graph-Based Logic

In this subsection, we define the syntax and Boolean semantics of graph-based logic (GL) formulas. For GL formulas, we define $\bigcirc_N v$ to denote the set of neighbors of a node $v$, where the subscript N stands for “neighbor”. The number of the neighbors of the node $v$ is denoted by $|\bigcirc_N v|$. We define the syntax of GL formulas as follows.

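$$
\phi_G := \top \mid \pi \mid \neg\phi_G \mid \phi_G^1 \wedge \phi_G^2 \mid \forall_G \bigcirc_N \phi_G \mid \exists_G \bigcirc_N \phi_G, \tag{1}
$$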

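where $\top$ stands for the Boolean constant True; $\pi$ is an atomic predicate in the form of an inequality $f(g(v, k)) > 0$, where $f$ maps the node value $g(v, k)$ of a graph-based trajectory $g$ at node $v$ and time-step $k$ to a real number; $\neg$ (negation) and $\wedge$ (conjunction) are standard Boolean connectives; $\forall_G \bigcirc_N$ is a GL operator called the graph-based universal quantifier, and $\forall_G \bigcirc_N \phi_G$ reads as “all the neighbors of the current node satisfy $\phi_G$”; $\exists_G \bigcirc_N$ is a GL operator called the graph-based existential quantifier, and $\exists_G \bigcirc_N \phi_G$ reads as “there exists at least one neighbor of the current node that satisfies $\phi_G$”. We define the Boolean semantics of GL formulas as follows.

$$
\begin{aligned}
(g, v, k) \models \pi \quad &\text{iff} \quad f(g(v, k)) > 0, \\
(g, v, k) \models \neg\phi_G \quad &\text{iff} \quad (g, v, k) \not\models \phi_G, \\
(g, v, k) \models \phi_G^1 \wedge \phi_G^2 \quad &\text{iff} \quad (g, v, k) \models \phi_G^1 \text{ and } (g, v, k) \models \phi_G^2, \\
(g, v, k) \models \exists_G \bigcirc_N \phi_G \quad &\text{iff} \quad \exists \hat{v} \in \bigcirc_N v \text{ s.t. } (g, \hat{v}, k) \models \phi_G, \\
(g, v, k) \models \forall_G \bigcirc_N \phi_G \quad &\text{iff} \quad \forall \hat{v} \in \bigcirc_N v,\ (g, \hat{v}, k) \models \phi_G.
\end{aligned}
$$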

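The quantitative satisfaction of a graph-based logic formula by a graph-based trajectory $g$ at node $v$ and at time-step $k$ is defined as follows.

$$
\begin{aligned}
r(g, v, \pi, k) &= f(g(v, k)), \\
r(g, v, \neg\phi_G, k) &= -r(g, v, \phi_G, k), \\
r(g, v, \phi_G^1 \wedge \phi_G^2, k) &= \min\big(r(g, v, \phi_G^1, k),\ r(g, v, \phi_G^2, k)\big), \\
r(g, v, \forall_G \bigcirc_N \phi_G, k) &= \min_{\hat{v} \in \bigcirc_N v} r(g, \hat{v}, \phi_G, k), \\
r(g, v, \exists_G \bigcirc_N \phi_G, k) &= \max_{\hat{v} \in \bigcirc_N v} r(g, \hat{v}, \phi_G, k).
\end{aligned}
$$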
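The min/max semantics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy graph `NEIGHBORS`, the one-dimensional trajectory `TRAJ`, and all function names are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical toy graph: node -> list of neighbor nodes.
NEIGHBORS: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
}

# Hypothetical one-dimensional trajectory: node -> value at each time-step.
TRAJ: Dict[str, List[float]] = {
    "a": [1.0, 3.0],
    "b": [2.5, 0.5],
    "c": [4.0, 3.0],
}

# A formula is represented by its robustness function (node, time-step) -> real.
Robustness = Callable[[str, int], float]


def predicate(threshold: float) -> Robustness:
    """Atomic predicate 'node value > threshold', i.e. f(x) = x - threshold."""
    return lambda v, k: TRAJ[v][k] - threshold


def neg(phi: Robustness) -> Robustness:
    """Negation flips the sign of the robustness."""
    return lambda v, k: -phi(v, k)


def conj(phi1: Robustness, phi2: Robustness) -> Robustness:
    """Conjunction takes the minimum of the two robustness values."""
    return lambda v, k: min(phi1(v, k), phi2(v, k))


def forall_neighbors(phi: Robustness) -> Robustness:
    """Universal graph quantifier: minimum over the neighbors of v."""
    return lambda v, k: min(phi(u, k) for u in NEIGHBORS[v])


def exists_neighbor(phi: Robustness) -> Robustness:
    """Existential graph quantifier: maximum over the neighbors of v."""
    return lambda v, k: max(phi(u, k) for u in NEIGHBORS[v])


pi = predicate(2.0)
print(forall_neighbors(pi)("a", 0))  # both neighbors exceed 2 at k = 0
print(forall_neighbors(pi)("a", 1))  # neighbor "b" drops below 2 at k = 1
print(exists_neighbor(pi)("a", 1))   # neighbor "c" still exceeds 2 at k = 1
```

A positive value indicates satisfaction and a negative value indicates violation, so the sign of each printed number agrees with the Boolean semantics.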

### III-B Graph-Based Signal Temporal Logic

Signal temporal logic (STL) is a temporal logic that deals with real-valued data over a real-time domain [Donze]. STL is used in learning temporal properties from data [Kong2017]. In this paper, we extend STL to graph-based signal temporal logic (GSTL) to express the temporal properties that are related to a given node or a set of nodes in a graph $G$. The syntax of GSTL formulas is defined recursively as follows.

$$
\phi := \phi_G \mid \neg\phi \mid \phi_1 \wedge \phi_2 \mid \mathbf{G}_I \phi \mid \mathbf{F}_I \phi,
$$

where $\phi_G$ is a GL formula, $\neg$ (negation) and $\wedge$ (conjunction) are standard Boolean connectives, $\mathbf{G}_I$ is the temporal operator “always”, and $\mathbf{F}_I$ is the temporal operator “eventually”. The Boolean semantics of GSTL is based on the Boolean semantics of STL [Donze] and is evaluated using graph-based trajectories. The Boolean semantics of $\phi_G$ is as described in Subsection III-A.
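As a rough illustration of how the two temporal operators can be evaluated quantitatively at a fixed node, the sketch below uses the standard STL robustness (minimum for “always”, maximum for “eventually”). It is assumed here that this carries over unchanged to GSTL; the function names and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def _window(rho: Sequence[float], k: int, interval: Tuple[int, int]) -> Sequence[float]:
    """Return the robustness values of the subformula over the interval k + I."""
    lo, hi = interval
    if k < 0 or lo < 0 or hi < lo or k + hi >= len(rho):
        raise ValueError("the interval k + I must lie inside the time horizon")
    return rho[k + lo : k + hi + 1]


def always(rho: Sequence[float], k: int, interval: Tuple[int, int]) -> float:
    """Robustness of 'always over I': the worst case over the interval."""
    return min(_window(rho, k, interval))


def eventually(rho: Sequence[float], k: int, interval: Tuple[int, int]) -> float:
    """Robustness of 'eventually over I': the best case over the interval."""
    return max(_window(rho, k, interval))


# Hypothetical node values at one node, and the predicate 'value > 2'.
values = [2.5, 3.0, 2.25, 1.0]
rho = [x - 2.0 for x in values]

print(always(rho, 0, (0, 2)))      # positive: the value stays above 2 on [0, 2]
print(always(rho, 0, (0, 3)))      # negative: the value falls below 2 at step 3
print(eventually(rho, 0, (0, 3)))  # positive: the value exceeds 2 at some step
```

Here `rho` holds the robustness of the subformula at every time-step of a single node, so the temporal operators reduce to a sliding minimum or maximum.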

###### Example 1.

In Figure 1, , and graph-based trajectory satisfies the GL formula only at node . For the time interval , if the node value of node stays greater than 2 in the time interval , then the GSTL formula is satisfied by graph-based trajectory at node .

### III-C Weighted Graph-Based Signal Temporal Logic

An extension of STL is weighted STL (wSTL), where we assign a weight to each subformula of a wSTL formula based on its importance [Mehdipour2021][Yan2021]. We refer to these weights as importance weights. In this paper, we extend wSTL to weighted GSTL (w-GSTL). In w-GSTL, in addition to defining importance weights for the subformulas, we define importance weights for both the temporal operators and the GL operators. In other words, we assign an importance weight to each time-step in the interval of a temporal operator, and we assign an importance weight to each neighbor node $\hat{v}$ of a node $v$. We define the syntax of w-GSTL formulas as follows.

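$$
\tilde{\phi} := \tilde{\phi}_G \mid \neg\tilde{\phi} \mid {}^{w_1}\tilde{\phi}_1 \wedge {}^{w_2}\tilde{\phi}_2 \mid \mathbf{G}_I^{\Omega}\tilde{\phi} \mid \mathbf{F}_I^{\Omega}\tilde{\phi},
$$

where $w_1$ and $w_2$ are positive importance weights on $\tilde{\phi}_1$ and $\tilde{\phi}_2$, respectively; $\Omega$ assigns a positive weight to each time-step of the interval $I$ in the temporal operators; $W$ assigns a positive weight to each neighbor $\hat{v} \in \bigcirc_N v$. In the syntax of w-GSTL formulas, $\tilde{\phi}_G$ is defined as follows.

$$
\tilde{\phi}_G := \top \mid \pi \mid \neg\tilde{\phi}_G \mid {}^{w_1}\tilde{\phi}_G^1 \wedge {}^{w_2}\tilde{\phi}_G^2 \mid \forall_G \bigcirc_N^{W} \tilde{\phi}_G \mid \exists_G \bigcirc_N^{W} \tilde{\phi}_G
$$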

## IV Weighted Graph-Based Signal Temporal Logic and Neural Networks

In this section, we formalize the problem of classifying graph-based trajectories by inferring w-GSTL formulas using neural networks. We denote a set of labeled graph-based trajectories by , and a time horizon by (where ). We assume that the set is composed of two subsets: positive subset which contains the graph-based trajectories representing the desired behavior, and negative subset which contains the graph-based trajectories representing the undesired behavior. For the cardinality of the defined sets, we have , , and . Inspired by [Kong2017], we define the following.

###### Definition 1.

We define a w-GSTL formula structure, denoted by , as a w-GSTL formula in which the importance weights of the subformulas and the variables of the atomic predicates, and the importance weights of the GL operators and the temporal operators are replaced by free parameters. In this structure, we assume that we always have at least one temporal operator and one GL operator.

###### Definition 2.

We define a flexible w-GSTL formula structure as a flexible extension of w-GSTL formula structure such that the types of the temporal operators and the types of the GL operators are to be inferred from data; but the types of the Boolean connectives in the structure are fixed. In this structure, we represent the set of GL operators as a set from which the proper operator is to be inferred from data. Similarly, we represent the set of temporal operators as a set .

###### Example 2.

In the flexible w-GSTL formula structure , the types of the temporal operators and the GL operators, the importance weights of the subformulas and the variables of the atomic predicates and , and the importance weights of the GL operators and the temporal operators are to be inferred from data, but the Boolean connective is fixed.

After determining the proper w-GSTL operators in a given , we obtain a w-GSTL formula that is consistent with .

###### Problem 1.

Given a set of labeled graph-based trajectories and a flexible w-GSTL formula structure , infer a w-GSTL formula (i.e., select the w-GSTL operators and compute the parameters of that structure) to classify such that the classification accuracy is maximized.

In order to solve Problem 1, we introduce w-GSTL neural networks (w-GSTL-NN), which combine the characteristics of w-GSTL and neural networks. In the first step, we construct and train a w-GSTL-NN to learn the proper w-GSTL operators in the flexible w-GSTL formula structure . In the second step, we construct and train another w-GSTL-NN to learn the parameters in . In w-GSTL-NN, we combine the activation functions in a neural network with the quantitative satisfaction of w-GSTL. We define the quantitative satisfaction of a w-GSTL formula as follows [Yan2021].

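$$
\begin{aligned}
r^w(g, v, \pi, k) &= f(g(v, k)), \\
r^w(g, v, \neg\tilde{\phi}, k) &= -r^w(g, v, \tilde{\phi}, k), \\
r^w(g, v, {}^{w_1}\tilde{\phi}_1 \wedge {}^{w_2}\tilde{\phi}_2, k) &= \otimes_{\wedge}\big([w_i, r^w(g, v, \tilde{\phi}_i, k)]_{i=1,2}\big), \\
r^w(g, v, {}^{w_1}\tilde{\phi}_1 \vee {}^{w_2}\tilde{\phi}_2, k) &= \otimes_{\vee}\big([w_i, r^w(g, v, \tilde{\phi}_i, k)]_{i=1,2}\big), \\
r^w(g, v, \mathbf{G}_I^{\Omega}\tilde{\phi}, k) &= \otimes_{\mathbf{G}}\big(\Omega, \{r^w(g, v, \tilde{\phi}, i)\}_{i \in k+I}\big), \\
r^w(g, v, \mathbf{F}_I^{\Omega}\tilde{\phi}, k) &= \otimes_{\mathbf{F}}\big(\Omega, \{r^w(g, v, \tilde{\phi}, i)\}_{i \in k+I}\big), \\
r^w(g, v, \forall_G \bigcirc_N^{W} \tilde{\phi}_G, k) &= \otimes_{\forall_G}\big(W, \{r^w(g, \hat{v}, \tilde{\phi}_G, k)\}_{\hat{v} \in \bigcirc_N v}\big), \\
r^w(g, v, \exists_G \bigcirc_N^{W} \tilde{\phi}_G, k) &= \otimes_{\exists_G}\big(W, \{r^w(g, \hat{v}, \tilde{\phi}_G, k)\}_{\hat{v} \in \bigcirc_N v}\big),
\end{aligned}
$$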

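where $\otimes_{\wedge}$, $\otimes_{\vee}$, $\otimes_{\mathbf{G}}$, $\otimes_{\mathbf{F}}$, $\otimes_{\forall_G}$, and $\otimes_{\exists_G}$ are activation functions corresponding to the $\wedge$, $\vee$, $\mathbf{G}$, $\mathbf{F}$, $\forall_G$, and $\exists_G$ operators, respectively. We denote an activation function with $\otimes_{*}$, where $* \in \{\wedge, \vee, \mathbf{G}, \mathbf{F}, \forall_G, \exists_G\}$.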

For defining the activation functions, we use the variable , where is a positive real number. We define the activation function corresponding to each operator in the set as follows.

 (2)

where is the normalized weight and ; in the activation functions of and Boolean connectives, ; In the activation functions of and operators, ; in and graph operators, ; in the activation function of operator, ; in the activation function of operator, ; in the activation function of operator, ; in the activation function of operator, ; in the activation function of operator, ; and in the activation function of operator, ; in the activation functions of and operators, ; in the activation functions of and operators, ; in the activation functions of and operators, [Yan2021].

## V Methodology

In this section, we introduce a framework and algorithms to solve Problem 1 for a given flexible w-GSTL formula structure with undetermined temporal operators and undetermined GL operators using w-GSTL-NN. For the training of w-GSTL-NN, we design a loss function to meet the following two requirements: 1) the loss should be small when the inferred formula is satisfied by the graph-based trajectories in the positive subset and violated by the graph-based trajectories in the negative subset; 2) the loss should be large when the inferred formula is not satisfied by the graph-based trajectories in the positive subset and not violated by the graph-based trajectories in the negative subset. We define the loss function as follows [Yan2021].

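$$
J(\tilde{\phi}) = \sum_{i=1}^{N} e^{-\eta\, \ell_i\, r^w(g_i, \tilde{\phi})}, \tag{3}
$$

where $g_i$ denotes the $i$-th graph-based trajectory in the set of labeled trajectories, $\ell_i \in \{-1, 1\}$ is its label, $N$ is the number of trajectories, and $\eta$ is a tuning parameter. $J$ is small for the cases where $\ell_i\, r^w(g_i, \tilde{\phi}) > 0$ and increases exponentially when $\ell_i\, r^w(g_i, \tilde{\phi}) < 0$.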
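The behavior of such an exponential loss can be checked numerically. The sketch below assumes that each trajectory carries a label of 1 (positive subset) or -1 (negative subset) that multiplies its quantitative satisfaction in the exponent, following the loss in [Yan2021]; the function name `exponential_loss` and the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def exponential_loss(
    robustness: Sequence[float],
    labels: Sequence[int],
    eta: float = 1.0,
) -> float:
    """Sum of exp(-eta * label * robustness) over all trajectories.

    Assumption: labels are +1 for desired and -1 for undesired behavior.
    """
    if len(robustness) != len(labels):
        raise ValueError("one label is required per trajectory")
    return sum(math.exp(-eta * l * r) for r, l in zip(robustness, labels))


labels = [1, -1]

# Formula satisfied by the positive trajectory and violated by the negative one.
well_classified = exponential_loss([2.0, -2.0], labels)

# Formula violated by the positive trajectory and satisfied by the negative one.
misclassified = exponential_loss([-2.0, 2.0], labels)

print(well_classified < misclassified)  # the loss penalizes misclassification
```

Because the exponential is smooth, the loss is differentiable in the robustness values, which is what allows it to be minimized through back-propagation.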

We compute a w-GSTL formula in two steps: 1) determining the proper temporal and GL operators in the given flexible w-GSTL formula structure; 2) learning the parameters of the flexible w-GSTL formula structure. For each step, we construct and train a separate w-GSTL-NN.

Algorithm 1 illustrates the two-step procedure of learning a w-GSTL formula from a given set of labeled graph-based trajectories and a given flexible w-GSTL formula structure.

#### Step 1

In step 1, we initialize two sets of coefficients: (a) corresponding to undetermined temporal operators; (b) corresponding to undetermined GL operators, where (Line 3 in Alg. 1). In this step, for the activation functions of and , we define the coefficients and , respectively. Similarly, for the activation functions of and , we define the coefficients , respectively. We construct and train a w-GSTL-NN (demonstrated in Alg. 2) to learn and (Line 4 in Alg. 1). For each undetermined temporal operator, the sign of the returned determines the proper selection from the set . In other words, if , then we have . This means the proper selection for the temporal operator corresponding to is . If , then is the proper selection from the set (Line 5 to 8 in Alg. 1). Similarly, for each undetermined GL operator, the sign of the returned determines the proper selection of the undetermined GL operator from the set (Lines 9 to 12 in Alg. 1).

#### Step 2

After determining the proper operators, we construct and train another w-GSTL-NN (demonstrated in Alg. 2) to learn parameters of the flexible w-GSTL formula structure including , , and all the importance weights in the flexible w-GSTL formula structure that we denote them by (Lines 13 to 16 in Alg. 2).

Algorithm 2 illustrates w-GSTL-NN that we use to learn w-GSTL formulas. In Algorithm 2, denotes a set of parameters that we calculate at each step of the two-step process of learning a w-GSTL formula. In step 1, includes and . In step 2, includes the parameters of the flexible w-GSTL formula structure (including , , and ). In Algorithm 2, we denote the forward-propagation operation by , where denotes the selected mini-batch from the set , is the th subformula in , is activation function corresponding to the w-GSTL operator or Boolean connective in , is to be calculated, and is the output of forward-propagation (Line 5 in Alg. 2). More clearly, after determining the w-GSTL operators, w-GSTL-NN calculates the quantitative satisfaction of the inferred with respect to through forward-propagation. Also, We denote the back-propagation operation by , where is to be updated (Line 8 in Alg. 2). The proposed algorithms are implemented in a Python toolbox.

## VI Case Studies

In this section, we assess the performance of w-GSTL-NN. We first use a meteorological dataset in Australia to predict rainfall. Then, we predict the severity of lockdown measures using COVID-19 data in Italy. The performance of w-GSTL-NN is compared with some other standard classification algorithms. The flexible w-GSTL formula structure that we use for these two case studies is .

### VI-A Case Study I: Rain Prediction

In this subsection, we use w-GSTL-NN to predict rainfall in regions of Australia. The dataset is acquired from the Australian Government Bureau of Meteorology [Young2017]. It is composed of weather-related data in 49 regions of Australia measured daily from March 1st, 2013 to June 25th, 2017, including minimum temperature ($g^1$), maximum temperature ($g^2$), amount of rainfall ($g^3$), evaporation ($g^4$), sunshine ($g^5$), wind gust speed ($g^6$), wind speed at 9am ($g^7$), wind speed at 3pm ($g^8$), humidity at 9am ($g^9$), humidity at 3pm ($g^{10}$), pressure at 9am ($g^{11}$), pressure at 3pm ($g^{12}$), cloud at 9am ($g^{13}$), cloud at 3pm ($g^{14}$), temperature at 9am ($g^{15}$), temperature at 3pm ($g^{16}$), and whether there was any rain on the given day (-1 or 1) ($g^{17}$). The criterion for the binary classification is the rain prediction for the following day (-1 for no rain, 1 for rain). We construct a graph structure of the Australian regions, considering any two regions within a 300 km radius of each other to be neighbors. We use zero imputation for missing values in the input data. The dataset is split in a proportion of 0.8:0.2 into a training dataset and a testing dataset, resulting in 1262 data points per region for training and 316 data points per region for testing.
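The neighbor construction can be sketched as follows. The text does not state which distance measure is used, so the great-circle (haversine) distance below is an assumption, and the coordinates are approximate values included only for illustration.

```python
import math
from itertools import combinations
from typing import Dict, Set, Tuple

EARTH_RADIUS_KM = 6371.0

# Approximate (latitude, longitude) pairs in degrees, for illustration only.
COORDS: Dict[str, Tuple[float, float]] = {
    "Albury": (-36.08, 146.92),
    "Wagga Wagga": (-35.12, 147.37),
    "Canberra": (-35.28, 149.13),
    "Perth": (-31.95, 115.86),
}


def haversine_km(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def build_neighbors(
    coords: Dict[str, Tuple[float, float]],
    radius_km: float = 300.0,
) -> Dict[str, Set[str]]:
    """Connect every pair of regions whose distance is at most radius_km."""
    neighbors: Dict[str, Set[str]] = {name: set() for name in coords}
    for u, v in combinations(coords, 2):
        if haversine_km(coords[u], coords[v]) <= radius_km:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


graph = build_neighbors(COORDS)
print(sorted(graph["Albury"]))  # regions within 300 km of Albury
print(sorted(graph["Perth"]))   # Perth has no neighbor in this small sample
```

The resulting dictionary plays the role of the neighbor sets used by the GL operators; an undirected graph is assumed, so every edge is stored in both directions.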

For demonstration, we consider Albury and its neighbors: Wagga Wagga, Canberra, Tuggeranong, Mount Ginini, Bendigo, Sale, Melbourne Airport, Melbourne, and Watsonia. We set consecutive days of data as one instance of dataset in our experiment. The dataset is passed through two separate neural networks. In the first step, we determine the temporal operators and the GL operators to be applied on and intervals, respectively. In the second step, we learn the parameters of the flexible w-GSTL formula structure . In the experiment, we set = 15, = [0, 6], = [7, 14], = 1, and = 1. The learned w-GSTL formula is as follows.

$$
\tilde{\phi} = {}^{w_1}\big(\exists_G \bigcirc_N^{W} (\mathbf{G}_{[0,6]}^{\Omega_1}\pi)\big) \vee \big(\neg\,{}^{w_2}(\exists_G \bigcirc_N^{W} (\mathbf{F}_{[7,14]}^{\Omega_2}\pi))\big), \tag{4}
$$

where the inferred predicate is as follows,

$$
\begin{aligned}
\pi := \big(&0.0026\,g^{1} - 0.0040\,g^{2} + 0.0035\,g^{3} + 0.0057\,g^{4} - 0.0298\,g^{5} + 0.0080\,g^{6} \\
&+ 0.0055\,g^{7} - 0.0015\,g^{8} + 0.0013\,g^{9} + 0.0102\,g^{10} - 0.0003\,g^{11} - 0.0006\,g^{12} \\
&+ 0.0226\,g^{13} + 0.0222\,g^{14} - 0.0007\,g^{15} - 0.0056\,g^{16} + 0.0309\,g^{17} \leq 0.6593\big),
\end{aligned} \tag{5}
$$

the temporal weights $\Omega_1$ are as follows (the temporal weights $\Omega_2$ are omitted due to space limitations),

$$
\Omega_1 = [0.1087, 0.2210, 0.0655, 0.1927, 0.0163, 0.1349, 0.2609], \tag{6}
$$

the normalized spatial weights for Wagga Wagga, Canberra, Tuggeranong, Mount Ginini, Bendigo, Sale, Melbourne Airport, Melbourne, and Watsonia, respectively, are as follows,

$$
W = [0.0443, 0.1439, 0.0319, 0.1930, 0.1299, 0.0000, 0.1984, 0.1719, 0.0867], \tag{7}
$$

and the normalized weights for the $\vee$ operator are $w_1 = 0.6891$ and $w_2 = 0.3109$.

We evaluate the performance of the proposed algorithm by comparing it against standard classification methods, namely K-nearest neighbors (KNN), decision tree (DT), and an artificial neural network (ANN), on the same dataset.

w-GSTL-NN produces a higher accuracy than both KNN and DT (Table I). Although the accuracy of ANN is higher than that of w-GSTL-NN, w-GSTL-NN produces more human-interpretable results than ANN through the use of temporal and spatial logic and its ability to learn the w-GSTL operators in the flexible w-GSTL formula structure. For instance, the coefficients associated with the predicates suggest that a larger amount of cloud at 9am ($g^{13}$), a larger amount of cloud at 3pm ($g^{14}$), and more rainfall during the interval in the input data correlate with a larger probability that there will be rainfall on the target date, while a larger amount of sunshine ($g^{5}$) correlates with a larger probability that there will not be rainfall on the target date. Furthermore, the spatial weights suggest that the data from Mount Ginini and Melbourne Airport have the most influence out of the neighbors when classifying rainfall in Albury.
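The observation about the most influential neighbors can be reproduced directly from the spatial weights reported in (7). The snippet below is a small sanity check; the variable names are hypothetical, while the weights and region names are taken from the text.

```python
# Neighbors of Albury and their normalized spatial weights, as reported in (7).
NEIGHBOR_NAMES = [
    "Wagga Wagga", "Canberra", "Tuggeranong", "Mount Ginini", "Bendigo",
    "Sale", "Melbourne Airport", "Melbourne", "Watsonia",
]
SPATIAL_WEIGHTS = [0.0443, 0.1439, 0.0319, 0.1930, 0.1299, 0.0000, 0.1984, 0.1719, 0.0867]

# Rank the neighbors from the largest to the smallest weight.
ranked = sorted(zip(NEIGHBOR_NAMES, SPATIAL_WEIGHTS), key=lambda nw: nw[1], reverse=True)

for name, weight in ranked:
    print(f"{name:18s} {weight:.4f}")

# Normalized weights should sum to one up to rounding of the reported digits.
print(abs(sum(SPATIAL_WEIGHTS) - 1.0) < 1e-3)
```

Melbourne Airport and Mount Ginini appear at the top of the ranking, while Sale receives a weight of zero and therefore does not contribute to the classification.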

### VI-B Case Study II: Classifying COVID-19 Lockdown Measures

In this subsection, we use simulated COVID-19 datasets of Italian regions from the DiBernardo Group Repository [Rossa2020]. The dataset is composed of one time series for each region in Italy. The inputs of each time series are the percentage of people infected ($g^1$), quarantined ($g^2$), deceased ($g^3$), and hospitalized due to COVID-19 ($g^4$). The data is simulated with a social distancing parameter of 0.3 for strict lockdown measures and 1 for no lockdown measures. Each input was recorded daily for 365 days. We turn this case study into a binary classification problem by labeling “strict lockdown measures” with -1 and “no lockdown measures” with 1. For this case study, the regions that we consider are Abruzzo and its neighboring regions: Lazio, Marche, and Molise.

We use the flexible w-GSTL formula structure and set = 30, = [0, 14], = [15, 29], = 1, and = 1. We divide the dataset into 472 sets of time instances for training dataset and 200 sets of time instances for testing dataset. The w-GSTL formula with learned operators is as follows.

$$
\tilde{\phi} = {}^{w_1}\big(\exists_G \bigcirc_N^{W} (\mathbf{G}_{[0,14]}^{\Omega_1}\pi)\big) \vee \big(\neg\,{}^{w_2}(\exists_G \bigcirc_N^{W} (\mathbf{F}_{[15,29]}^{\Omega_2}\pi))\big), \tag{8}
$$

where the inferred predicate is as follows,

$$
\pi := \big(-4.4617\,g^{1} - 4.3504\,g^{2} - 3.2291\,g^{3} - 4.4045\,g^{4} \leq -0.1421\big), \tag{9}
$$
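To make the learned predicate concrete, the sketch below evaluates it on a single node value. It assumes that a predicate of the form $a^\top x \leq c$ is scored by the margin $c - a^\top x$, so that a non-negative margin means the inequality holds; this scoring convention, the function name, and the sample inputs are assumptions for illustration, while the coefficients come from (9).

```python
from typing import Sequence

# Coefficients and bound of the learned predicate (9), in the order
# infected, quarantined, deceased, hospitalized.
COEFFS = (-4.4617, -4.3504, -3.2291, -4.4045)
BOUND = -0.1421


def predicate_margin(x: Sequence[float]) -> float:
    """Margin c - a.x of the predicate a.x <= c (assumed scoring convention)."""
    if len(x) != len(COEFFS):
        raise ValueError("expected one value per input dimension")
    return BOUND - sum(a * xi for a, xi in zip(COEFFS, x))


# Hypothetical node values, expressed as fractions of the population.
no_cases = [0.0, 0.0, 0.0, 0.0]
some_cases = [0.02, 0.01, 0.005, 0.01]

print(predicate_margin(no_cases) >= 0)    # inequality violated when all inputs are zero
print(predicate_margin(some_cases) >= 0)  # inequality holds for the larger inputs
```

Since all four coefficients are negative, larger input values push the left-hand side of (9) down and make the predicate easier to satisfy.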

the normalized temporal weights $\Omega_2$ are as follows,

$$
\Omega_2 = [0.0006, 0.0007, 0.0008, 0.0008, 0.0007, 0.0008, 0.0007, 0.0007, 0.0007, 0.0006, 0.0007, 0.0008, 0.0007, 0.1663, 0.8064], \tag{10}
$$

the normalized spatial weights for Lazio, Marche, and Molise, respectively, are , and the normalized weights for the $\vee$ operator are $w_1 = 0.9998$ and $w_2 = 0.0002$. Standard classification algorithms including KNN and DT are applied to this dataset to conduct the binary classification. The learned w-GSTL formula provides us with the spatial-temporal properties of the dataset, and determines whether there is a strict lockdown measure in the region or not. Furthermore, the accuracy of w-GSTL-NN matches that of K-nearest neighbors and decision tree for the COVID-19 dataset, all achieving an accuracy of 100%.

## VII Conclusion

In this paper, we proposed a framework that combines neural networks and w-GSTL for learning spatial-temporal properties from data. The proposed approach represents the learned knowledge in a human-readable form. As a future direction, we plan to extend this approach to scenarios where only positive data is available. We also aim to apply w-GSTL-NN in deep reinforcement learning (deep RL) settings that involve graph-structured problems, in order to improve the interpretability of deep RL.