# Census Signal Temporal Logic Inference for Multi-Agent Group Behavior Analysis

In this paper, we define a novel census signal temporal logic (CensusSTL) that focuses on the number of agents in different subsets of a group that complete a certain task specified by the signal temporal logic (STL). CensusSTL consists of an "inner logic" STL formula and an "outer logic" STL formula. We present a new inference algorithm to infer CensusSTL formulae from the trajectory data of a group of agents. We first identify the "inner logic" STL formula and then infer the subgroups based on whether the agents' behaviors satisfy the "inner logic" formula at each time point. We use two different approaches to infer the subgroups based on similarity and complementarity, respectively. The "outer logic" CensusSTL formula is inferred from the census trajectories of different subgroups. We apply the algorithm in analyzing data from a soccer match by inferring the CensusSTL formula for different subgroups of a soccer team.

## 1 Introduction

In some multi-agent systems, there are subgroups that perform different tasks, such as the defenders, midfielders and forwards in a soccer team [1]. Within each subgroup, the agents can be seen as interchangeable members in the sense that as long as there is a certain number of agents in the subgroup performing the task, it does not matter who these agents are. In the social network context, the behavior pattern of different groups of people and the temporal influence on them have been a research focus [2], [3], [4]. A recommender system can use this information to give better recommendations of the place and time for doing certain activities whether it is shopping or checking in at a hotel. In robotics, the Multi-Agent Robot Systems (MARS) [5], [6], [7], [8] are being studied for their co-operative behaviors such as the leader robot tracking a prescribed trajectory and the rest of the robots following the leader while forming a desired formation pattern [9]. In all of these applications, how to express and characterize the properties of the group behavior has always been a challenge.

### 1.1 Related Works

There is a rich literature on the formalization of multi-agent group behaviors. In [10], the authors propose an ontology-based behavior modeling and checking system to explicitly represent and verify complex group behavior interactions. Temporal logic is a formal approach that has been increasingly used in expressing more complicated and precise high-level control specifications [11], [12]. There have been many different temporal logic frameworks in multi-agent systems to guarantee safe and satisfactory performance from a high-level perspective, such as LTL [13], [14], CTL [15], ATL [Alur2002ATL], etc. The temporal logic formulae are predefined as a specification for the behaviors of the system [16], [17].

Recently, there has been a growing interest in devising algorithms to identify dense-time temporal logic formulae from system trajectories [18]. In [19], the authors present a method to synthesize magnitude and timing parameters in a quantitative temporal logic formula so that it fits observed data. In [20], the authors designed an inference algorithm that can automatically construct signal temporal logic formulae directly from data. The obtained signal temporal logic formulae can be used to classify different behaviors, predict future behaviors and detect anomalous behaviors [21].

In this paper, we define a novel census signal temporal logic (CensusSTL) that focuses on the number of agents and the structure of the group that complete a certain task specified by the signal temporal logic. The word “census” means “the procedure of systematically acquiring and recording information about the members of a given population” [22]. In the group behavior analysis, we need to generate knowledge about the behaviors of the members or agents of different subgroups, and census signal temporal logic provides a formal structure for generating such knowledge. The census signal temporal logic formula is essentially a signal temporal logic formula (“outer logic”) with the variable in the predicate being the number of agents whose behaviors satisfy another signal temporal logic formula (“inner logic”). For example, the census signal temporal logic formula can express specifications such as “From 10am to 2pm, at least 3 policemen should be present at the lobby for at least 20 minutes in every hour”, where the “inner logic” formula is the task “be present at the lobby for at least 20 minutes in every hour” and the “outer logic” formula is “from 10am to 2pm, at least 3 policemen should perform the task”.

CensusSTL is different from the other Temporal Logic frameworks for multi-agent systems as it does not focus on individual agents or the interaction between different agents, but on the number of agents in different subgroups that complete a certain task. Therefore, it is more useful in applications where only the number of agents or the proportion of agents in different subgroups of a population is of interest while different agents in a subgroup can be seen as interchangeable.

We present a new inference algorithm that can infer the CensusSTL formula directly from individual agent trajectories. Our inference method for the “inner logic” formula and the “outer logic” formula is similar to [19], as we also choose the template of the formula first and then search for the parameters. However, we formulate the problem as a group behavior analysis problem, so the objective of our approach is not only to find the parameters that fit a certain temporal logic formula, but also to infer the subgroups and the temporal relationships among the different subgroups.

### 1.3 Organization

This paper is structured as follows. Section II introduces the framework of census signal temporal logic. Section III shows the algorithm to infer the census temporal logic formula from data. Section IV applies the algorithm to the analysis of a soccer match as a case study. Finally, some conclusions are presented in Section V.

## 2 Census Signal Temporal Logic

### 2.1 “Inner Logic” Signal Temporal Logic

In this paper, we find subgroups of a population that act collaboratively for a task. We need to find both the task and the subgroups from the time-stamped trajectories (for the definition of time-stamped trajectories, see the beginning of Section II-B) of different agents. The task can be formulated as an “inner logic” STL formula. Assume there is a group of agents and each agent has an observation space . For example, the group can be a set of points moving in a 2D plane, and the observation space can be their 2D positions. Each element of the observation space is described by a set of variables that can be written as a vector . The domain of is denoted by . The domain {true, false} is the Boolean domain and the time set is (note that we allow negative time to add more flexibility to the temporal operators). With a slight abuse of notation, we define the observation trajectory (or signal or behavior) describing the observation of each agent as a function from to . Therefore, refers to both the name of the -th observation variable and its valuation in . A finite set is a set of atomic predicates, each mapping to . The “inner logic” is signal temporal logic [23] and the syntax of the “inner logic” STL proposition can be defined recursively as follows:

$$\phi := \top \mid \mu \mid \neg\phi \mid \phi_1 \wedge \phi_2 \mid \phi_1 \vee \phi_2 \mid \phi_1 U_I \phi_2$$

where $\top$ stands for the Boolean constant true, $\mu$ is an atomic predicate in the form of an inequality $f(x(t))>0$ where $f$ is some real-valued function, $\neg$ (negation), $\wedge$ (conjunction) and $\vee$ (disjunction) are standard Boolean connectives, $U$ is a temporal operator representing “until”, and $I$ is a time interval such as the half-open interval $[a,b)$ used in the semantics below. In general, a predicate can be an atomic predicate or atomic predicates connected with standard Boolean connectives. We can also derive two useful temporal operators from “until” ($U$), which are “eventually” ($\Diamond$) and “always” ($\Box$).

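We use $x(t)$ to represent the observation trajectory at time $t$; the Boolean semantics of the “inner logic” are then defined recursively as follows:

$$\begin{aligned}
(x,t) \models \mu \quad &\text{iff} \quad f(x(t)) > 0\\
(x,t) \models \neg\phi \quad &\text{iff} \quad (x,t) \not\models \phi\\
(x,t) \models \phi_1 \wedge \phi_2 \quad &\text{iff} \quad (x,t) \models \phi_1 \text{ and } (x,t) \models \phi_2\\
(x,t) \models \phi_1 \vee \phi_2 \quad &\text{iff} \quad (x,t) \models \phi_1 \text{ or } (x,t) \models \phi_2\\
(x,t) \models \phi_1 U_{[a,b)} \phi_2 \quad &\text{iff} \quad \exists t' \in [t+a,t+b) \text{ s.t. } (x,t') \models \phi_2,\ (x,t'') \models \phi_1\ \forall t'' \in [t+a,t')
\end{aligned}$$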

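The robustness degree of an observation trajectory $x$ with respect to an “inner logic” formula $\phi$ at time $t$ is given as $r(x,\phi,t)$, where $r$ can be calculated recursively via the quantitative semantics [23]:

$$\begin{aligned}
r(x,\mu,t) &= f(x(t)),\\
r(x,\neg\phi,t) &= -r(x,\phi,t),\\
r(x,\phi_1 \wedge \phi_2,t) &= \min\big(r(x,\phi_1,t),\, r(x,\phi_2,t)\big),\\
r(x,\phi_1 \vee \phi_2,t) &= \max\big(r(x,\phi_1,t),\, r(x,\phi_2,t)\big),\\
r(x,\Box_{[\tau_1,\tau_2)}\phi,t) &= \min_{t+\tau_1 \le t' < t+\tau_2} r(x,\phi,t'),\\
r(x,\phi_1 U_{[a,b)} \phi_2,t) &= \sup_{t+a \le t' < t+b} \min\Big(r(x,\phi_2,t'),\, \inf_{t+a \le t'' < t'} r(x,\phi_1,t'')\Big).
\end{aligned}$$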
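The quantitative semantics can be illustrated with a minimal sketch for uniformly sampled, scalar signals. The function names and the integer-index time model are assumptions made for illustration and are not taken from the paper; temporal operators are evaluated only at the sample indices where the whole window $[t+a, t+b)$ lies inside the data, and `a < b` is assumed.

```python
from typing import Callable, Dict, List, Sequence


def rob_predicate(x: Sequence[float], f: Callable[[float], float]) -> List[float]:
    """r(x, mu, t) = f(x(t)) for every sample index t."""
    return [f(v) for v in x]


def rob_and(r1: Sequence[float], r2: Sequence[float]) -> List[float]:
    """Conjunction: pointwise minimum of two robustness signals."""
    return [min(u, v) for u, v in zip(r1, r2)]


def rob_always(r: Sequence[float], a: int, b: int) -> Dict[int, float]:
    """Always over [t+a, t+b): minimum over the window, where it fits the data."""
    n = len(r)
    return {t: min(r[t + a:t + b]) for t in range(n) if t + a >= 0 and t + b <= n}


def rob_eventually(r: Sequence[float], a: int, b: int) -> Dict[int, float]:
    """Eventually over [t+a, t+b): maximum over the window, where it fits the data."""
    n = len(r)
    return {t: max(r[t + a:t + b]) for t in range(n) if t + a >= 0 and t + b <= n}


# Hypothetical signal and predicate "x > 0.5", i.e. f(x) = x - 0.5.
x = [1.0, 2.0, 3.0, 0.0, -1.0]
r = rob_predicate(x, lambda v: v - 0.5)
always_r = rob_always(r, 0, 2)
eventually_r = rob_eventually(r, 0, 2)
```

The temporal operators return a dictionary keyed by sample index because, on a finite trajectory, the formula is only defined on a subset of the time points.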

### 2.2 Signal Temporal Logic Applied to Data

We make two deviations to STL when applying an “inner logic” formula to data:
1) As the observation trajectory is usually of finite length, and also considering that there may be negative time in the temporal operator of the “inner logic” formula , the satisfaction of the “inner logic” formula may not be well-defined at every time point of the observation trajectory (for example for the formula and , if the observation trajectory is defined on the time domain of , then can only be evaluated on the time domain of [0, 190] and can only be evaluated on the time domain of [10, 200]). Assume that the time domain of the observation trajectory is , with a slight abuse of notation, we define time-stamped trajectory of finite length as a function from to .

The time domain of the “inner logic” formula with respect to is defined recursively as follows:

For example, for “inner logic” formula , if the observation trajectory is defined on , then =[0-0, 200-10] ∩ [0-20-20, 200-40-60] = [0, 100].

2) The observation data are usually discrete, so the time domain of the observation trajectory is a set of discrete time points. In this case, the interval in the form or actually means the time points in that belong to . For example, is interpreted as .

Consider the example shown in Fig. 1 where there are two predicates $p_{\mathrm{region1}}$ and $p_{\mathrm{region2}}$ corresponding to region 1 and region 2 (for the representation of predicates, see Eq. (4) in Section III) and 8 different agents (people) who are moving furniture from Region 1 to Region 2. The people need to move back and forth frequently between Region 1 and Region 2. One STL formula that can characterize this movement pattern is

$$\phi_t = \Box_{[0,\tau_1)}\, p_{\mathrm{region1}} \wedge \Diamond_{[\tau_2,\tau_3)} \big(p_{\mathrm{region2}} \wedge \Diamond_{[\tau_4,\tau_5)}\, p_{\mathrm{region1}}\big) \tag{1}$$

which reads “the person is in region 1 for $\tau_1$ time units and arrives in region 2 sometime between $\tau_2$ and $\tau_3$ time units, then sometime between $\tau_4$ and $\tau_5$ time units later the person comes back to region 1”. The temporal parameters satisfy , , .

We specify that the “inner logic” formula while the temporal operator is to make true at every time point during the execution of the task. Without this temporal operator, is only true at the beginning of the task.

### 2.3 “Outer logic” Census Signal Temporal Logic

Based on the “inner logic”, we can define the “outer logic” census signal temporal logic (CensusSTL). The observation element of the “outer logic” is the number of agents that satisfy the “inner logic” formula, which can be described by non-negative integers that belong to the domain . A census trajectory describes the number of agents in the group whose behaviors satisfy the “inner logic” formula over time.

It should be noted that the time domain of the census trajectories is the same as the time domain of the “inner logic” formula with respect to observation trajectory if the observation trajectory is of finite length, so is a mapping from to . As there may be different subgroups in group , we have the following definition.

###### Definition 1.

We define as the number of agents in the subgroup whose behaviors satisfy the “inner logic” formula at time , or in other words, the number of agents whose behavior (observation trajectory) has positive robustness degree with respect to at time .

With that notation, the atomic predicate of the “outer logic” CensusSTL can be defined as follows,

$$\mu_n := n(\phi, S_i) > c \;\mid\; -n(\phi, S_i) > -c \tag{2}$$

where $c$ is a non-negative integer.
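As a minimal sketch of Definition 1 and the atomic predicate in Eq. (2), the census trajectory of a subgroup can be computed by counting, at each time point, the agents whose robustness degree is positive. The data layout (a dictionary from agent name to a list of robustness values on a shared time grid) and all names are assumptions for illustration.

```python
from typing import Dict, List, Set


def census(robustness: Dict[str, List[float]], subgroup: Set[str]) -> List[int]:
    """Number of agents in `subgroup` with positive robustness at each time point."""
    agents = [a for a in robustness if a in subgroup]
    if not agents:
        return []
    length = min(len(robustness[a]) for a in agents)
    return [sum(1 for a in agents if robustness[a][t] > 0) for t in range(length)]


def more_than(census_traj: List[int], c: int) -> List[bool]:
    """Atomic predicate n(phi, S_i) > c evaluated at each time point."""
    return [n > c for n in census_traj]


# Hypothetical robustness values of three agents over three time points.
rob = {"a1": [0.5, -1.0, 2.0], "a2": [1.0, 1.0, -0.1], "a3": [-1.0, -1.0, -1.0]}
n_traj = census(rob, {"a1", "a2"})
pred = more_than(n_traj, 1)
```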

In the furniture moving example, assume that there are two subgroups of people who are moving the furniture, and . The atomic predicate of the “outer logic” can express properties such as “the number of people in the subgroup who are moving the furniture is less than 2”, or “the number of people in the subgroup who are moving the furniture is more than 3”.

The syntax of the “outer logic” CensusSTL proposition can be defined recursively as follows:

$$\gamma := \top \mid \mu_n \mid \neg\gamma \mid \gamma_1 \wedge \gamma_2 \mid \gamma_1 \vee \gamma_2 \mid \gamma_1 U_I \gamma_2 \tag{3}$$

As the “outer logic” CensusSTL is also STL, the semantics of STL also apply to the “outer logic” CensusSTL. The robustness degree of a census trajectory with respect to a CensusSTL formula at time is denoted as , where can be calculated recursively in the same way as is calculated.

## 3 Census Signal Temporal Logic Inference

In this section, we seek to infer the CensusSTL formula describing the behaviors of a group of agents from the collection of the individual agent observation trajectories in a training data set and then test the validity of the inferred CensusSTL formula in a separate validation data set. We choose to represent the predicates as polyhedral sets as they are more general than rectangular sets and computationally easier to handle than other more complex sets (ellipsoidal sets, non-convex sets, etc.). So each predicate in the “inner logic” formula is represented in the following form:

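$$p := \left(\bigwedge_{k=1}^{m} a_k^T x > b_k\right),\quad a_k \in \mathbb{R}^n,\ b_k \in \mathbb{R}, \tag{4}$$

where the vector $a_k$ and the number $b_k$ denote the parameters that define the predicate, and $m$ is the number of atomic predicates in the predicate.

According to the quantitative semantics of STL, the robustness of each predicate can be expressed as the minimum of the robustness of each atomic predicate:

$$r(x,p,t) = \min_{1 \le k \le m} \left(a_k^T x - b_k\right),\quad a_k \in \mathbb{R}^n,\ b_k \in \mathbb{R}. \tag{5}$$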
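Eq. (5) can be evaluated directly for a single observation. The sketch below uses plain Python lists; the function name and the unit-square example are hypothetical.

```python
from typing import List, Sequence


def predicate_robustness(A: Sequence[Sequence[float]], b: Sequence[float],
                         x: Sequence[float]) -> float:
    """Robustness of the polyhedral predicate: min over k of (a_k^T x - b_k)."""
    return min(sum(a_i * x_i for a_i, x_i in zip(a_k, x)) - b_k
               for a_k, b_k in zip(A, b))


# Hypothetical region: open unit square 0 < x1 < 1, 0 < x2 < 1,
# written as four half-planes a_k^T x > b_k.
A: List[List[float]] = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b: List[float] = [0, -1, 0, -1]

inside = predicate_robustness(A, b, [0.25, 0.5])    # positive: point is in the region
outside = predicate_robustness(A, b, [1.5, 0.5])    # negative: point is outside
```

A positive value means the point satisfies every half-plane constraint; with unit-norm rows $a_k$ the value is the distance to the nearest face.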

In this paper, we infer the “inner logic” STL formula in the form of where is the formula that describes the task with all the temporal parameters of chosen in and is the necessary length associated with formula as defined below:

$$\|\phi_t\| := \min\{\,T \mid \text{if } T_o=[0,T],\ D(\phi_t,x) \neq \emptyset\,\}$$

Take STL formula for example, the necessary length ， and is true at every time point during the execution of the task. We consider 4 templates of temporal logic formula corresponding to 4 common tasks:

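$$\phi_t = \Box_{[0,\tau_1)}\phi_{t1} \wedge \Diamond_{[\tau_{21},\tau_{22})}\Box_{[0,\tau_{23})}\phi_{t2} \wedge \cdots \wedge \Diamond_{[\tau_{z1},\tau_{z2})}\Box_{[0,\tau_{z3})}\phi_{tz} \tag{6}$$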

where are subtasks that can be predicates or STL formulae as and the temporal parameters satisfy , . For any term , if , then this term shrinks to ; if , then this term shrinks to . The sequential task is a series of subtasks that are performed in a sequential order.

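$$\phi_t = \Box_{[0,\tau_1)}\left(\phi_{t1} \vee \phi_{t2} \vee \cdots \vee \phi_{tz}\right) \tag{7}$$

This concurrent task means “during the next $\tau_1$ time units, the agent performs at least one of the subtasks $\phi_{t1}, \dots, \phi_{tz}$”.

$$\phi_t = \Box_{[0,\tau_1)}\Diamond_{[0,\tau_2)}\phi_{t1} \tag{8}$$

This persistent task means “during the next $\tau_1$ time units, $\phi_{t1}$ is performed at least once in every $\tau_2$ time units”.

$$\phi_t = \Box_{[0,\tau_1)}\left(\phi_{t1} \Rightarrow \phi_{t2}\right) \tag{9}$$

where $\phi_{t1}$ is the cause formula and $\phi_{t2}$ is the effect formula. This causal task means “during the next $\tau_1$ time units, whenever the subtask $\phi_{t1}$ is performed, the agent will perform subtask $\phi_{t2}$”.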

In all of these task templates, we set an upper limit to the necessary length associated with formula as we only consider tasks that are finished within a certain time. For example, if it generally takes no more than 10 time units to move the furniture from Region 1 to Region 2, then we set .

In the following, we introduce the specific steps to infer the CensusSTL formula from data. Note that our procedure cannot produce a formula that does not conform with the predetermined templates. Our aim is to find the CensusSTL formula that best fits (according to some measure of fitness) a given finite set of observation trajectories. Generally, we are given a training data set of different observation trajectories for each agent, where the time domains of the observation trajectories are not necessarily the same.

### 3.2 “Inner Logic” STL Formula Inference

In this section, we discuss the three requirements the “inner logic” formula needs to meet and then formulate the optimization problem for the “inner logic” formula inference.

#### 3.2.1 Consistency

We heuristically postulate that if the number of agents whose behaviors satisfy the “inner logic” formula changes drastically over time, then the formula cannot reflect a task that a group of agents is performing consistently.

###### Definition 2.

We define $v_q(\phi,S)$ as the temporal variation of the number of agents in the set $S$ whose $q$-th observation trajectories satisfy the “inner logic” formula $\phi$, which can be described as follows:

$$v_q(\phi,S) = \frac{1}{l_{\phi,q}-1}\sum_{j=1}^{l_{\phi,q}-1}\left|n_q(\phi,S,j+1) - n_q(\phi,S,j)\right| \tag{10}$$

where $n_q(\phi,S,j)$ is the number of agents in the set $S$ whose $q$-th observation trajectories satisfy the STL formula $\phi$ at the $j$-th time point, and $l_{\phi,q}$ is the number of time points in the time domain of the $q$-th census trajectory.
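The temporal variation in Eq. (10) is the mean absolute change between consecutive census counts. A minimal sketch for one census trajectory, with a hypothetical function name:

```python
from typing import Sequence


def temporal_variation(census_traj: Sequence[int]) -> float:
    """Mean absolute difference between consecutive census counts, as in Eq. (10)."""
    l = len(census_traj)
    if l < 2:
        raise ValueError("at least two time points are needed")
    total = sum(abs(census_traj[j + 1] - census_traj[j]) for j in range(l - 1))
    return total / (l - 1)


steady = temporal_variation([3, 3, 3, 3])      # a consistently performed task
erratic = temporal_variation([0, 4, 0, 4])     # a count that jumps at every step
```

A formula whose census count stays flat gets a variation of zero, which is what the consistency objective rewards.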

#### 3.2.2 Frequency

We postulate that if the number of time points at which the behavior of any agent satisfies the “inner logic” formula is small, then the formula cannot reflect a task that is performed frequently.

###### Definition 3.

We define $m(\phi,k)$ as the total number of time points at which the “inner logic” formula $\phi$ is true for agent $k$ in the training data set (the time points in different observation trajectories are counted separately).

#### 3.2.3 Specificity

Sometimes a consistent and frequent task can be overly general or meaningless. For example, the proposition “the agent is always in the entire space” is always true but does not contain any useful information. To make the task more specific and meaningful, we incorporate some a priori knowledge about the system. The other purpose of incorporating a priori knowledge is to tailor the task to the user's preferences. For example, if the user is particularly interested in the behavior in a certain region, then this region can be specified as an a priori predicate. Suppose that we are given a priori predicates , we make the obtained predicates as similar as possible to the a priori predicates . The Hausdorff distance is an important tool to measure the similarity between two sets of points [24]. It is defined as the largest distance from any point in one of the sets to the closest point in the other set. Suppose that the set of states that satisfy the predicate is . Then the Hausdorff distance is expressed as follows

$$d_H\big(O(X_i),O(p_i)\big) = \max\Big\{\sup_{x \in O(X_i)}\ \inf_{y \in O(p_i)} d(x,y),\ \sup_{y \in O(p_i)}\ \inf_{x \in O(X_i)} d(x,y)\Big\} \tag{11}$$

The expression when both and are convex polyhedra can be evaluated as follows:

Step 1: Calculate all vertices of the polyhedron . Denote them as .

Step 2: Calculate the distance from to for each . This is a convex quadratic optimization problem.

Step 3: Find the maximum of the distances calculated in Step 2.
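The definition in Eq. (11) can be illustrated on finite point sets, where the suprema and infima become maxima and minima. This is a simplification and not the three-step procedure above: the distance from a vertex to a polyhedron in Step 2 requires a convex quadratic program, which is not shown here. The Euclidean metric and the function names are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def directed_hausdorff(P: Sequence[Point], Q: Sequence[Point]) -> float:
    """Largest distance from a point of P to its closest point in Q."""
    return max(min(math.dist(p, q) for q in Q) for p in P)


def hausdorff(P: Sequence[Point], Q: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(P, Q), directed_hausdorff(Q, P))


# Hypothetical point sets on a line.
P = [(0.0, 0.0), (1.0, 0.0)]
Q = [(0.0, 0.0), (3.0, 0.0)]
d_pq = directed_hausdorff(P, Q)
d_qp = directed_hausdorff(Q, P)
d_h = hausdorff(P, Q)
```

The two directed distances differ in general, which is why Eq. (11) takes the maximum of both.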

We denote all parameters that define the “inner logic” STL formula as . Take the case of for example. As and are essentially the same, we constraint to be . One simple way to remove this constraint is to represent using trigonometric parameters . Then can be represented as by utilizing the fact that For the formula above, are the elements of . The lower bound and upper bound of the angles are set to be .

To summarize the three requirements, the inference of the “inner logic” formula where conforms to 1 of the 4 task templates is a constrained multi-objective problem, i.e.
Objectives:
min (consistency)
max (frequency)
min (specificity)
Subject to:

where is the optimization variable, is the upper limit of the necessary length associated with formula .

We use Particle Swarm Optimization [25] to optimize $\alpha$ (including both the spatial parameters and the temporal parameters) of each possible “inner logic” formula. In each iteration, the parameters are updated as a swarm of particles that move in the parameter space to find the global minimum (in this paper, we use 200 particles for each iteration). The formula with the smallest value of the cost function is then selected. The cost function is as follows:

$$J_{stl}(\phi,\alpha) = \sum_{q=1}^{z} v_q(\phi(\alpha),S) - \lambda_1 \sum_{k=1}^{n} m(\phi(\alpha),k) + \lambda_2 \sum_{i=1}^{n_p} d_H\big(O(X_i),O(p_i(\alpha))\big) \tag{12}$$

where $\lambda_1$ and $\lambda_2$ are weighting factors that adjust the priorities of the different optimization goals (for the tuning of $\lambda_1$ and $\lambda_2$, see the example in Section IV).
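For readers unfamiliar with particle swarm optimization, the following is a generic, minimal sketch of the technique and not the implementation used in the paper. The inertia and acceleration coefficients, the default swarm size, the box bounds, and the toy cost function (standing in for $J_{stl}$) are all assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso(cost: Callable[[List[float]], float], bounds: Sequence[Tuple[float, float]],
        n_particles: int = 30, n_iter: int = 100,
        w: float = 0.7, c1: float = 1.5, c2: float = 1.5, seed: int = 0):
    """Minimize `cost` over a box; returns (best position, best cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]   # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Inertia plus attraction to the personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


def toy_cost(alpha: List[float]) -> float:
    """Hypothetical stand-in for J_stl with its minimum at (1, -2)."""
    return (alpha[0] - 1.0) ** 2 + (alpha[1] + 2.0) ** 2


box = [(-5.0, 5.0), (-5.0, 5.0)]
best_alpha, best_value = pso(toy_cost, box)
```

In the inference problem, each particle would encode one candidate parameter vector $\alpha$, and evaluating the cost would require computing the census trajectories for that candidate.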

### 3.3 Group Partition

As there may be subgroups in the group, we proceed to infer the subgroups based on the identified formula where minimizes ,

###### Definition 4.

The signature $s_q(\phi,k,j)$ is defined as the satisfaction signature of agent $k$ with respect to the “inner logic” formula $\phi$ at time point $j$ in the $q$-th observation trajectory. If agent $k$ satisfies $\phi$ at time point $j$ in the $q$-th observation trajectory, then the signature is set to 1 at that time point; otherwise, it is set to 0.

We need to cluster the agents of the group into subgroups based on the satisfaction signature trajectories of different agents. For a given set of elements, the number of all possible partitions of the set into a fixed number of non-empty subsets is the Stirling number of the second kind [26]. The search over all possible partitions of a set is an NP-complete problem, and the calculation soon becomes intractable as the number of elements in the set increases. In order to reduce the calculation, we further look into two kinds of relationships: complementarity and similarity.
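To give a sense of how quickly the number of candidate partitions grows, the Stirling numbers of the second kind can be computed with the standard recurrence $S(n,k) = k\,S(n-1,k) + S(n-1,k-1)$. The symbols $n$ (number of elements) and $k$ (number of subsets) are used here for illustration only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n elements into k non-empty subsets."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


two_groups = stirling2(8, 2)     # 127 ways to split 8 agents into 2 subgroups
three_groups = stirling2(8, 3)   # 966 ways to split 8 agents into 3 subgroups
```

Even for the 8 agents of the furniture example, exhaustive search over three subgroups already has to score 966 partitions.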

We come back to the furniture moving scenario and assume that there are two subgroups of people who are moving the furniture, and .
Case 1:
The two subgroups take turns to move the furniture from Region 1 to Region 2. For example, if the people in subgroup move the furniture for one hour, then the people in subgroup will move the furniture for the next hour. Therefore, people in the same subgroup behave similarly.
Case 2:
People from both subgroups move the furniture from Region 1 to Region 2. For example, if there are always one person from subgroup and two people from subgroup who move the furniture for one hour, then there will always be two people from subgroup and one person from subgroup who move the furniture for the next hour. In this case, people in the same subgroup behave complementarily in the sense that a certain number of people in the subgroup should perform the task.

Overall, both complementary and similar relationships can lead to interesting group behaviors, but with their different nature they should be dealt with differently.

#### 3.3.1 Group Partition Based on Similarity

A lot of clustering methods are based on similarity. For example, k-means clustering is frequently used in partitioning observations into clusters, where each observation belongs to the cluster with the nearest mean. However, its performance can be distorted when clustering high-dimensional data [27]. As we cluster different agents based on their satisfaction signatures at different time points (which is high-dimensional when dealing with lengthy time-series data), we need to use some other method. One way is to represent the agents in the group as vertices of a weighted hypergraph (a hypergraph is an extension of a graph in the sense that each hyperedge can connect more than two vertices) and represent the relationships among different agents as hyperedges. The clustering problem is then transformed into a hypergraph-partitioning problem, for which a number of graph-partitioning software packages can be utilized. For example, hMETIS is a software package that can partition large hypergraphs in a fast and efficient way [28]. hMETIS can partition the vertices of a hypergraph such that the number of hyperedges connecting vertices in different parts is minimized (minimal cut). The complexity of hMETIS for a k-way partitioning is where V is the number of vertices and E is the number of edges [29]. In this paper, we modify the method in [29], which uses frequent item sets found by the association rule algorithm as hyperedges. The Apriori algorithm [30] is often used in finding association rules in data mining. It proceeds by identifying the frequent individual items (a frequent individual item is an item that appears sufficiently often through time) and extending them to larger and larger frequent item sets (a frequent item set is an item set whose items simultaneously appear sufficiently often through time). In this work, we consider the different agents as items, and an item “appears” whenever the satisfaction signature of the agent is 1.

###### Definition 5.

We define the relative support of agent with respect to the “inner logic” formula as the proportion of satisfaction signatures of the agent with respect to that are not zero, as shown below:

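$$supp(k,\phi(\alpha^*)) \triangleq \frac{1}{\sum_{q=1}^{z} l_{\phi(\alpha^*),q}} \sum_{q=1}^{z}\ \sum_{j=1}^{l_{\phi(\alpha^*),q}} s_q(\phi(\alpha^*),k,j) \tag{13}$$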

We give a simple example of 8 agents and 1 observation trajectory of 8 time points (representing 8 consecutive hours) for each agent in the furniture moving scenario with the signature listed in Table 1. We first put all agents that are identified as frequent individual items in , as shown below:

$$S_f \triangleq \{\,k \in S \mid supp(k,\phi(\alpha^*)) > minsup\,\} \tag{14}$$

where $minsup$ is a small positive number used as a threshold for defining frequent item sets. It can be seen from Table 1 that agent 7 and agent 8 do not perform the “inner logic” task as frequently as the other agents, so we set to exclude them from the partitioning process (in similarity relationships all the agents in each subgroup are expected to perform the task frequently). In this case, .
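Eqs. (13) and (14) amount to computing, for each agent, the fraction of time points with signature 1 and keeping the agents above the threshold. The sketch below uses made-up signatures (not the values of Table 1) and hypothetical names; each agent maps to a list of signature trajectories, one per observation trajectory.

```python
from typing import Dict, List, Set


def relative_support(signatures: List[List[int]]) -> float:
    """Fraction of time points, over all trajectories, where the signature is 1."""
    total_points = sum(len(traj) for traj in signatures)
    return sum(sum(traj) for traj in signatures) / total_points


def frequent_agents(sig_by_agent: Dict[str, List[List[int]]], minsup: float) -> Set[str]:
    """Agents whose relative support exceeds the threshold, as in Eq. (14)."""
    return {k for k, sigs in sig_by_agent.items() if relative_support(sigs) > minsup}


# Made-up signatures: one trajectory of four time points per agent.
sigs = {"a": [[1, 1, 0, 1]], "b": [[0, 0, 0, 1]], "c": [[1, 0, 1, 0]]}
supp_a = relative_support(sigs["a"])
kept = frequent_agents(sigs, minsup=0.3)
```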

###### Definition 6.

The signature is defined as the satisfaction signature of the set with respect to the “inner logic” formula at time in the -th observation trajectory. If all the agents in the set satisfy at time in the -th observation trajectory, then the signature is set to 1 at that time point; otherwise, it is set to 0.

###### Definition 7.

We denote the relative support of a set with respect to the “inner logic” formula as the proportion of satisfaction signatures of the set with respect to that are not zero, as shown below:

 (15)

If the relative support of a set satisfies , then we assign a hyperedge connecting the vertices (agents) of , and the weight of hyperedge is defined as relative support :

$$Weight(e) = supp(e,\phi(\alpha^*)) \tag{16}$$

The fitness function that measures the quality of a partition (subgroup) is defined as follows:

$$fitness(S_d) = \frac{\sum_{e \subset S_d} Weight(e)}{\sum_{|e \cap S_d| > 0} Weight(e)} \tag{17}$$

The fitness function measures the ratio of weights of hyperedges that are within the partition and weights of hyperedges involving any vertex of this partition. High fitness value suggests that vertices within the partition are more connected to each other than to other vertices.
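The fitness ratio can be sketched as follows, assuming hyperedges are stored as a dictionary from a frozen set of vertices to a weight. The representation, the names, and the example weights are hypothetical.

```python
from typing import Dict, FrozenSet, Set


def fitness(subgroup: Set[int], hyperedges: Dict[FrozenSet[int], float]) -> float:
    """Weight of hyperedges inside the subgroup over weight of hyperedges touching it."""
    inside = sum(w for e, w in hyperedges.items() if e <= subgroup)
    touching = sum(w for e, w in hyperedges.items() if e & subgroup)
    return inside / touching if touching else 0.0


# Hypothetical weighted hypergraph on four agents.
edges = {frozenset({1, 2}): 0.5, frozenset({2, 3}): 0.25, frozenset({3, 4}): 0.25}
fit_123 = fitness({1, 2, 3}, edges)   # two edges inside, one crossing edge
fit_34 = fitness({3, 4}, edges)       # one edge inside, one crossing edge
```

A value of 1 means no hyperedge crosses the subgroup boundary, which matches the best case described in the example that follows.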

We find the largest number of subgroups partitioned using hMETIS while the fitness function of each subgroup stays above a given threshold value. In the example, the smallest number of subgroups is 2, and the best partition given by hMETIS is: and . The fitness values of the two subgroups are both 1, which is the highest possible value of the fitness function. If we increase the number of subgroups to 3, then the best partition is , and . The fitness values of the three subgroups are 0, 0.25 and 1. So it is clear that the best number of subgroups is 2.

#### 3.3.2 Group Partition Based on Complementarity

In a complementarity relationship, the number of agents in a subgroup that perform the task is expected to be as constant as possible. For example, the proposition “at least 40 and at most 50 agents from a subgroup of 100 agents should perform the task” is deemed more precise than “at least 10 and at most 90 agents from a subgroup of 100 agents should perform the task”. One good measure of how far a set of numbers is spread out is the variance. If a subgroup of agents act complementarily, then the variance of the number of agents in subgroup that perform the task at different time points should be small.

We still transform the clustering problem to a hypergraph-partitioning problem. The partitioning procedure and the definition of fitness functions are the same as the similarity relationship approach. The only differences are that we assign every possible set of vertices as a hyperedge and the weight of hyperedge is defined as follows:

$$Weight(e) = \frac{1}{Var(e)+\epsilon} \tag{18}$$

where $\epsilon$ is a small positive number used to avoid a singularity in the case of $Var(e)=0$, and the variance is defined as

 (19)

where is a binary variable that describes whether vertex (agent) belongs to hyperedge , i.e. =1 if vertex (agent) belongs to hyperedge and =0 otherwise.
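The weighting in Eq. (18) can be sketched as follows. Since the exact variance expression is not reproduced here, the sketch assumes $Var(e)$ is the population variance, over time, of the number of agents in hyperedge $e$ whose signature is 1; this reading, the value of `eps`, and the signatures are assumptions.

```python
from statistics import pvariance
from typing import Dict, Iterable, List


def complementarity_weight(sig_by_agent: Dict[str, List[int]],
                           edge: Iterable[str], eps: float = 1e-3) -> float:
    """1 / (Var(e) + eps), with Var(e) the variance over time of the census of e."""
    members = list(edge)
    length = len(sig_by_agent[members[0]])
    counts = [sum(sig_by_agent[k][t] for k in members) for t in range(length)]
    return 1.0 / (pvariance(counts) + eps)


# Made-up signatures: "a" and "b" alternate, so exactly one of them is active at a time.
sigs = {"a": [1, 0, 1, 0], "b": [0, 1, 0, 1], "c": [1, 1, 0, 0]}
w_ab = complementarity_weight(sigs, ("a", "b"))   # constant census, large weight
w_ac = complementarity_weight(sigs, ("a", "c"))   # varying census, small weight
```

Agents whose combined count stays constant receive a heavy hyperedge, so the partitioner tends to keep them in the same subgroup.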

In the furniture moving scenario, we give another example of 8 agents and 1 observation trajectory of 8 time points (representing 8 consecutive hours) for each agent with the signature listed in Table 2. We start from the smallest number of subgroups and the best partition given by hMETIS is: and . The fitness values of the two subgroups are 0.7354 and 0.7364. If we increase the number of subgroups to 3, then the best partition is , and . The fitness values of the three subgroups are 0, 0.7354 and 0.5789. So the best number of subgroups is 2. It can be seen from the table that there is always 1 agent from and 2 agents from satisfying at any time point.

### 3.4 “Outer logic” CensusSTL Formula Inference

After partitioning the group into several subgroups, we can proceed to generate the “outer logic” CensusSTL formula from the census trajectories.

We denote all parameters that define the “outer logic” formula as . The inference of the “outer logic” CensusSTL formula is also a constrained multi-objective optimization problem for finding the best parameters that describe the formula , and we use Particle Swarm Optimization to find . In the inference of the “Outer logic” formula, we specify the “outer logic” formula to be in the form of , with being the cause and being the effect formula. In this form, we can capture causal relationships that are maintained consistently during a time period. All the temporal parameters of and chosen in ( is the length of the time domain of the formula with respect to the census trajectories).

We consider 8 templates of temporal logic formula :

#### 3.4.1 Instantaneous Cause Durational Effect

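$$\gamma = \Box_{[0,T_c-\tau_2)}\big(\gamma_{cs} \Rightarrow \Box_{[\tau_1,\tau_2)}\gamma_{es}\big) \tag{20}$$

where $T_c$ is the length of the census trajectories, and $\gamma_{cs}$ and $\gamma_{es}$ are the cause and effect CensusSTL formulae without temporal operators. This causal relationship means “during the next $T_c-\tau_2$ time units, whenever $\gamma_{cs}$ is true, $\gamma_{es}$ will always be true from the next $\tau_1$ to $\tau_2$ time units”.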

#### 3.4.2 Instantaneous Cause Eventual Effect

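$$\gamma = \Box_{[0,T_c-\tau_2)}\big(\gamma_{cs} \Rightarrow \Diamond_{[\tau_1,\tau_2)}\gamma_{es}\big), \tag{21}$$

which means “during the next $T_c-\tau_2$ time units, whenever $\gamma_{cs}$ is true, $\gamma_{es}$ will be true at least once from the next $\tau_1$ to $\tau_2$ time units”.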

#### 3.4.3 Instantaneous Cause Eventual Durational Effect

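$$\gamma = \square_{[0,\,T_c-\tau_2-\tau_3)}\big(\gamma_{cs} \Rightarrow \Diamond_{[\tau_1,\tau_2)}\square_{[0,\tau_3)}\,\gamma_{es}\big) \tag{22}$$

which means “during the next $T_c-\tau_2-\tau_3$ time units, whenever $\gamma_{cs}$ is true, $\gamma_{es}$ will become true at least once from the next $\tau_1$ to $\tau_2$ time units and then remain true for $\tau_3$ time units”.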

#### 3.4.4 Instantaneous Cause Persistent Effect

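$$\gamma = \square_{[0,\,T_c-\tau_2-\tau_3)}\big(\gamma_{cs} \Rightarrow \square_{[\tau_1,\tau_2)}\Diamond_{[0,\tau_3)}\,\gamma_{es}\big) \tag{23}$$

which means “during the next $T_c-\tau_2-\tau_3$ time units, whenever $\gamma_{cs}$ is true, then from the next $\tau_1$ to $\tau_2$ time units $\gamma_{es}$ will be true at least once every $\tau_3$ time units”.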

#### 3.4.5 Durational Cause Durational Effect

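$$\gamma = \square_{[0,\,T_c-\tau_3)}\big(\square_{[0,\tau_1)}\,\gamma_{cs} \Rightarrow \square_{[\tau_2,\tau_3)}\,\gamma_{es}\big) \tag{24}$$

which means “during the next $T_c-\tau_3$ time units, whenever $\gamma_{cs}$ is true for $\tau_1$ time units, $\gamma_{es}$ will always be true from the next $\tau_2$ to $\tau_3$ time units”.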

#### 3.4.6 Durational Cause Eventual Effect

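$$\gamma = \square_{[0,\,T_c-\tau_3)}\big(\square_{[0,\tau_1)}\,\gamma_{cs} \Rightarrow \Diamond_{[\tau_2,\tau_3)}\,\gamma_{es}\big) \tag{25}$$

which means “during the next $T_c-\tau_3$ time units, whenever $\gamma_{cs}$ is true for $\tau_1$ time units, $\gamma_{es}$ will be true at least once from the next $\tau_2$ to $\tau_3$ time units”.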

#### 3.4.7 Durational Cause Eventual Durational Effect

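$$\gamma = \square_{[0,\,T_c-\tau_3-\tau_4)}\big(\square_{[0,\tau_1)}\,\gamma_{cs} \Rightarrow \Diamond_{[\tau_2,\tau_3)}\square_{[0,\tau_4)}\,\gamma_{es}\big) \tag{26}$$

which means “during the next $T_c-\tau_3-\tau_4$ time units, whenever $\gamma_{cs}$ is true for $\tau_1$ time units, $\gamma_{es}$ will become true at least once from the next $\tau_2$ to $\tau_3$ time units and then remain true for $\tau_4$ time units”.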

#### 3.4.8 Durational Cause Persistent Effect

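$$\gamma = \square_{[0,\,T_c-\tau_3-\tau_4)}\big(\square_{[0,\tau_1)}\,\gamma_{cs} \Rightarrow \square_{[\tau_2,\tau_3)}\Diamond_{[0,\tau_4)}\,\gamma_{es}\big) \tag{27}$$

which means “during the next $T_c-\tau_3-\tau_4$ time units, whenever $\gamma_{cs}$ is true for $\tau_1$ time units, then from the next $\tau_2$ to $\tau_3$ time units $\gamma_{es}$ will be true at least once every $\tau_4$ time units”.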
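The eight templates differ only in whether the cause is instantaneous or durational and in which of four effect patterns is used, so they can be checked with one small routine. The sketch below assumes discrete integer time points, half-open intervals $[a, b)$ read as the indices $a, \dots, b-1$, and cause and effect predicates already evaluated to Boolean sequences; the function names and the `kind` strings are hypothetical.

```python
from typing import Sequence


def effect_holds(effect: Sequence[bool], t: int, kind: str,
                 lo: int, hi: int, inner: int = 0) -> bool:
    """Evaluate the effect side of a template at time point t."""
    window = range(lo, hi)
    if kind == "durational":            # always on [lo, hi)
        return all(effect[t + k] for k in window)
    if kind == "eventual":              # eventually on [lo, hi)
        return any(effect[t + k] for k in window)
    if kind == "eventual_durational":   # eventually, then always for `inner`
        return any(all(effect[t + k + j] for j in range(inner)) for k in window)
    if kind == "persistent":            # always (eventually within `inner`)
        return all(any(effect[t + k + j] for j in range(inner)) for k in window)
    raise ValueError(f"unknown effect kind: {kind}")


def template_holds(cause: Sequence[bool], effect: Sequence[bool], kind: str,
                   lo: int, hi: int, inner: int = 0, cause_len: int = 1) -> bool:
    """Check always_[0, Tc - hi - inner) (cause => effect) on one trajectory.

    cause_len = 1 gives an instantaneous cause; cause_len = tau_1 gives a
    durational cause. Assumes cause_len <= hi so that no index runs past
    the end of the trajectory.
    """
    horizon = len(cause) - hi - inner
    for t in range(horizon):
        cause_true = all(cause[t + k] for k in range(cause_len))
        if cause_true and not effect_holds(effect, t, kind, lo, hi, inner):
            return False
    return True


# Hypothetical Boolean signals over 8 time points.
cause = [True, False, False, True, False, False, False, False]
effect = [False, True, True, False, True, True, False, False]
weaker = [False, True, True, False, True, False, False, False]

durational_ok = template_holds(cause, effect, "durational", lo=1, hi=3)
durational_bad = template_holds(cause, weaker, "durational", lo=1, hi=3)
eventual_ok = template_holds(cause, weaker, "eventual", lo=1, hi=3)
```

With `lo=1, hi=3` the first two calls instantiate template (20) with $\tau_1 = 1$, $\tau_2 = 3$, and the third instantiates template (21); the `weaker` signal violates the durational effect after the cause at time point 3 but still satisfies the eventual one.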

###### Definition 8.

We define $m(\gamma)$ as the total number of time points at which the formula $\gamma$ is true in the training data set (the time points in different census trajectories are counted separately).

###### Definition 9.

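We define $p(\gamma_c \Rightarrow \gamma_e)$ as the accuracy rate of the CensusSTL formula in the training data set, and its value is calculated as below:

$$p(\gamma_c \Rightarrow \gamma_e) = \frac{m(\gamma_c \wedge \gamma_e)}{m(\gamma_c)} \tag{28}$$

$p(\gamma_c \Rightarrow \gamma_e)$ is generally a number between 0 and 1, but in the case of $m(\gamma_c) = 0$ the ratio involves a division by zero and is undefined. To avoid this, we specify $p(\gamma_c \Rightarrow \gamma_e)$ to be $-1$ when $m(\gamma_c) = 0$ in the calculations of the objective functions introduced in the following.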
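Definitions 8 and 9 can be sketched in a few lines. The snippet assumes that the cause and effect formulae have already been evaluated to Boolean values at every time point of every census trajectory; the function names and the example data are hypothetical.

```python
from typing import List, Sequence


def count_true(sat: Sequence[Sequence[bool]]) -> int:
    """m(.): number of time points, summed over all trajectories, where a formula holds."""
    return sum(sum(1 for v in traj if v) for traj in sat)


def accuracy_rate(cause_sat: Sequence[Sequence[bool]],
                  effect_sat: Sequence[Sequence[bool]]) -> float:
    """p(cause => effect) = m(cause and effect) / m(cause), or -1 if m(cause) = 0."""
    m_cause = count_true(cause_sat)
    if m_cause == 0:
        return -1.0  # convention from the text: avoid dividing by zero
    both: List[List[bool]] = [[c and e for c, e in zip(ct, et)]
                              for ct, et in zip(cause_sat, effect_sat)]
    return count_true(both) / m_cause


# Two hypothetical trajectories with three time points each.
cause_sat = [[True, True, False], [False, True, False]]
effect_sat = [[True, False, False], [True, True, True]]

p = accuracy_rate(cause_sat, effect_sat)          # 2 of 3 cause points also satisfy the effect
p_never = accuracy_rate([[False, False]], [[True, True]])
```

The same routine applies unchanged to a validation data set, since only the input trajectories differ.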

The optimization has three objectives in general: the first objective is to maximize $100\,p(\gamma_c \Rightarrow \gamma_e)$, which is the percent value of the accuracy rate of the formula in the training data set; the second objective is to maximize $m(\gamma_c)$ so as to maximize the frequency of the formula in the training data set; the last objective is to make the formula more precise.

Specifically, for similarity based partitioning, we make the number of agents in a subgroup that perform the task as large as possible (ideally the same as the number of agents in the subgroup), so the optimization is formulated as follows:

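$$
\begin{aligned}
\min \quad & -100\,p(\gamma_c(\beta) \Rightarrow \gamma_e(\beta)) - \lambda'_1\, m(\gamma_c(\beta)) - \lambda'_2\,(c_{i1} + c_{j2}) \\
\text{subject to} \quad & \gamma \in \Phi_\gamma, \\
& \gamma_{cs} = n(\phi, S_i) > c_{i1} \quad (i = 1, 2, \dots, n_s), \\
& \gamma_{es} = n(\phi, S_j) > c_{j2} \quad (j = 1, 2, \dots, n_s),
\end{aligned}
$$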

where $\beta$ (which includes the temporal parameters) is the optimization variable, $n_s$ is the number of subgroups partitioned based on similarity, $\lambda'_1$ and $\lambda'_2$ are the weighting factors (for their tuning, see the example in Section 4), and $\Phi_\gamma$ is the set of the eight templates of $\gamma$. For any of the 8 templates, there are $n_s^2$ different CensusSTL formulae, as both $\gamma_{cs}$ (which contains $S_i$) and $\gamma_{es}$ (which contains $S_j$) can be about any of the subgroups.
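To make the scalarized objective concrete, the sketch below evaluates the cost for one candidate formula from its accuracy rate, its cause count and its two census thresholds. The function name and the numeric values, including the weights, are hypothetical and are not the tuned values used in the experiments.

```python
def similarity_cost(p: float, m_cause: int, c1: int, c2: int,
                    lam1: float, lam2: float) -> float:
    """Cost to minimize: -100 p - lam1 * m(cause) - lam2 * (c1 + c2).

    p       accuracy rate of the candidate formula (-1 if the cause never holds)
    m_cause number of time points at which the cause formula holds
    c1, c2  census thresholds in the cause and effect predicates
    """
    return -100.0 * p - lam1 * m_cause - lam2 * (c1 + c2)


# A candidate that is accurate and frequently triggered gets a low cost.
cost_good = similarity_cost(p=0.75, m_cause=10, c1=2, c2=2, lam1=0.5, lam2=1.0)

# A candidate whose cause never holds uses the p = -1 convention,
# so the first term becomes a +100 penalty.
cost_never = similarity_cost(p=-1.0, m_cause=0, c1=2, c2=2, lam1=0.5, lam2=1.0)
```

Under these assumed weights the $p = -1$ convention pushes vacuously true formulae to the bottom of the ranking, which is the purpose stated for it in the text.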

In the furniture moving scenario from Tab. 1, one of the best formulae obtained is as follows:

$$\square_{[0,4)}\big(\square_{[0,2)}\,(n(\phi, S_{s1}) > 2) \Rightarrow \square_{[2,4)}\,(n(\phi, S_{s2}) > 2)\big) \tag{29}$$

which means “for the next 4 hours, whenever the 3 agents from $S_{s1}$ move the furniture from Region 1 to Region 2 for 2 hours, the 3 agents from $S_{s2}$ will be moving the furniture from Region 1 to Region 2 for the next 2 hours”.
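As a concrete check of formula (29), the sketch below evaluates it on two census trajectories sampled once per hour. The trajectories are hypothetical and are not the data of Tab. 1; intervals $[a, b)$ are read as the integer hours $a, \dots, b-1$.

```python
from typing import Sequence


def formula_29_holds(n1: Sequence[int], n2: Sequence[int]) -> bool:
    """always_[0,4)( always_[0,2)(n1 > 2)  =>  always_[2,4)(n2 > 2) ).

    n1 and n2 are the hourly census trajectories of the two subgroups
    and must contain at least 7 samples.
    """
    return all(
        not all(n1[t + k] > 2 for k in range(0, 2))   # cause fails, or ...
        or all(n2[t + k] > 2 for k in range(2, 4))    # ... effect holds
        for t in range(0, 4)
    )


# Hypothetical shifts: subgroup 1 works hours 0-1 and 4-5, subgroup 2 takes over.
n1 = [3, 3, 0, 0, 3, 3, 0, 0]
n2 = [0, 0, 3, 3, 0, 0, 3, 3]
holds = formula_29_holds(n1, n2)

# If only 2 agents of subgroup 2 are working at hour 3, the formula is violated.
n2_short = [0, 0, 3, 2, 0, 0, 3, 3]
violated = not formula_29_holds(n1, n2_short)
```

Since each subgroup in this reading has 3 agents, the predicate `> 2` is equivalent to requiring that all agents of the subgroup perform the task.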

For complementarity based partitioning, we make the number of agents in a subgroup that perform the task as constant as possible, and the optimization is formulated as follows:

where $\beta$ (which includes the temporal parameters) is the optimization variable; the number of subgroups is now the number of subgroups partitioned based on complementarity, and the weighting factors play the same role as in the similarity based case (for their tuning, see the example in Section 4).

In the furniture moving scenario from Tab. 2, one of the best formulae obtained is as follows:

$$\square_{[0,6)}\Big(n(\phi, S_{c1}) > 0 \wedge n(\phi, S_{c1}) < 2 \wedge n(\phi, S_{c2}) > 1 \wedge n(\phi, S_{c2}) < 3 \Rightarrow \square_{[0,2)}\big(n(\phi, S_{c1}) > 0 \wedge n(\phi, S_{c1}) < 2 \wedge n(\phi, S_{c2}) > 1 \wedge n(\phi, S_{c2}) < 3\big)\Big) \tag{30}$$

which means “for the next 6 hours, whenever there is 1 agent from $S_{c1}$ and there are 2 agents from $S_{c2}$ moving the furniture from Region 1 to Region 2, then for the next 2 hours there will still be 1 agent from $S_{c1}$ and 2 agents from $S_{c2}$ moving the furniture from Region 1 to Region 2”.

### 3.5 CensusSTL Formula Validation

The obtained CensusSTL formula is validated in a separate validation data set. The accuracy rate of the formula in the validation data set is as follows:

$$p_v(\gamma_c \Rightarrow \gamma_e) = \frac{m_v(\gamma_c \wedge \gamma_e)}{m_v(\gamma_c)} \tag{31}$$

where $m_v(\gamma)$ is the total number of time points at which the CensusSTL formula $\gamma$ is true in the validation data set.

## 4 Implementation

In order to test the effectiveness of the algorithm, we consider a dataset from a soccer match played on November 7th, 2013 between Tromsø IL (Norway) and Anzhi Makhachkala (Russia) at Alfheim stadium in Tromsø, Norway. Tromsø IL will be referred to as the home team and Anzhi Makhachkala as the visiting team. The players of the home team are equipped with body-sensors during the whole game. The body-sensor data and video camera data of the players of the home team are provided in [31]. The x-axis points southwards, parallel with the long side of the field, while the y-axis points eastwards, parallel with the short side of the field, as shown in Fig. 2. The soccer pitch is $105 \times 68$ m and hence the values for x and y are in the range of $0 \le x \le 105$ and $0 \le y \le 68$ if the players are in the field.

We focus on the situation in which a player of the home team is attacking in the visiting team’s half of the field and then suddenly runs back to the home half. This usually happens because the ball is intercepted by the visiting team, which launches a counterattack, and the players of the home team run back to defend. For example, at 17 minutes 36 seconds (as shown in Fig. 3(a)) many players are in the visiting team’s half of the field, and at 17 minutes 47 seconds most players have run back to the home half (as shown in Fig. 3(b)). We call this a runback situation, and we want to derive a CensusSTL formula for the behaviors of different subgroups of the home team. As the runback task is a sequential task, we select the following STL formula to be the template for the “inner logic” formula:

 ϕ=◊[−τ3−τ4,0]ϕt