Inferring Coordination Strategies from Time Series of Movement Data

11/04/2019 ∙ by Chainarong Amornbunchornvej, et al. ∙ 0

How do groups of individuals achieve consensus in movement decisions? Do individuals follow their friends, the one predetermined leader, or whomever just happens to be nearby? To address these questions computationally, we formalize the Coordination Strategy Inference Problem. In this setting, a group of multiple individuals moves in a coordinated manner towards a target path. Each individual uses a specific strategy to follow others (e.g. nearest neighbors, pre-defined leaders, preferred friends). Given a set of time series that includes coordinated movement and a set of candidate strategies as inputs, we provide the first methodology (to the best of our knowledge) to infer the set of strategies that each individual uses to achieve movement coordination at the group level. We evaluate and demonstrate the performance of the proposed framework by predicting the direction of movement of an individual in a group on both simulated datasets and two real-world datasets: a school of fish and a troop of baboons. Moreover, since there is no prior methodology for inferring individual-level strategies, we compare our framework with the state-of-the-art approach for the task of classifying group-level coordination models. The results show that our approach is highly accurate in inferring the correct strategy in simulated datasets, even in complicated mixed-strategy settings, which no existing method can infer. In the task of classifying group-level coordination models, our framework performs better than the state-of-the-art approach on all datasets. Animal data experiments show that fish, as expected, follow their neighbors, while baboons have a preference to follow specific individuals. Our methodology generalizes to arbitrary time series data of real numbers, beyond movement data.




1. Introduction

Figure 1. An example of GPS-collar trajectories of Olive baboons living in Mpala Research Centre, Kenya (Crofoot et al., 2015; Strandburg-Peshkin et al., 2015). In this event, the troop is forming coordinated movement.

Coordination is a form of group behavior aimed to make the group achieve a collective goal (Malone and Crowston, 1994). During the decision-making process, a collective goal is to reach a group’s consensus, which is defined as the state when all individuals share a common agreement (Cao et al., 2013). One of the mechanisms by which a group can achieve a collective goal is leadership, which is a process of pattern initiation by specific individuals, leaders, then followed by the rest (Amornbunchornvej et al., 2018). In behavioral studies, coordination problems, such as group decision making, coordinated movement, group hunting, social conflicts, and territorial defense, can be solved by leadership (Dyer et al., 2009; Krause et al., 2000). Typically, leaders might not be explicit or global to a group, yet the group can still create coordinated movement via a local strategy (e.g. individuals follow their neighbors) (Dyer et al., 2009). Moreover, many groups of individuals in Nature have neither leaders nor central authority, but these groups are capable of forming coordination patterns (Valentini, 2019; Ray et al., 2019; Hrncir et al., 2019), such as honey bees (Hrncir et al., 2019), slime molds (Ray et al., 2019), etc.

Figure 2. An overview of the proposed framework. Given a set of time series as inputs, the framework 1) detects coordination intervals, 2) infers the strategy from a set of candidates that best fits the training data, and 3) reports the optimal strategy for each individual from validation data.

The field of cooperative control of multi-agent systems focuses on how to design a local strategy for each agent so that the group can achieve collective goals (Lewis et al., 2013; Cao et al., 2013; Valentini, 2019). Many systems have been designed with inspiration from natural collective behaviors such as a flock of birds, a school of fish, etc. (Valentini, 2019). Recently, patterns of opinion formation that emerge from the dynamic behaviors of social networks have been studied from the viewpoint of multi-agent systems (Proskurnikov and Tempo, 2018; Anderson and Ye, 2019).

Agents can communicate only with their neighbors via a communication network, which is defined by any neighborhood concept in some space (Lewis et al., 2013). There is a large body of work in multi-agent systems that proposes local synchronization strategies (Lewis et al., 2013; Cao et al., 2013; Etesami, 2019). In behavioral studies, the work in (Dyer et al., 2009; Strandburg-Peshkin et al., 2013) modeled the coordination process via a concept of information spreading. A small number of informed agents can spread information through a large number of uninformed agents, which results in the group’s consensus and coordinated movement. The work by Chazelle (Chazelle, 2011) introduced a model, namely a reversible agreement system, that guarantees convergence of the group state, with or without leaders. In more complicated settings, the works in (Amornbunchornvej and Berger-Wolf, 2018; Etesami, 2019) analyzed multi-agent network systems that can form coordination even when the networks of agent interactions change over time. In online social networks, there is also a “Diffusion Model” (Kempe et al., 2003; Goyal et al., 2010; He and Kempe, 2016) that models an information spreading process among individuals that results in the entire network reaching a common state.

However, in this paper, we focus on the inverse question of inferring the local strategies that individuals in a collective use to achieve a state of coordination. There are only a few studies that address this question. The work by Farine et al. (Farine et al., 2016) found that wild baboons can achieve the state of coordinated movement within a group by following their neighbors or long-term associates, depending on the time scale of the coordination process.

There are several studies that look at the collective behavior of fish. For example, the work in (Gautrais et al., 2012) modeled and inferred the rules of movement coordination of fish, which are affected by the group size; Herbert-Read et al. (Herbert-Read et al., 2011) reported that the rules of movement coordination of fish mainly depend on attraction forces of the group; and Katz et al. (Katz et al., 2011) showed that fish tend to imitate the direction of neighbors ahead. The work in (Mann et al., 2013; Langrock et al., 2014) proposed model selection methods to infer the animal-behavior model, but they cannot be used to find models that guarantee coordination.

1.1. The current state of the art approach

The work in (Amornbunchornvej et al., 2018) provided a framework, FLICA, for leadership inference and model classification in time series data. FLICA considers the shape of time series to infer the pairwise relationship of who follows whom (instead of considering only directions or positions of individuals). Hence, FLICA subsumes all previous methods (Amornbunchornvej et al., 2018), including FLOCK pattern leadership (Andersson et al., 2008), time-lag following leadership (Kjargaard et al., 2013), etc. FLICA can infer a possible underlying group model that generated coordination via a classification method. However, FLICA cannot be used to infer the individual-level strategies that combine to produce coordinated movement at the group level. In fact, each individual within a group can use a different strategy to achieve collective coordination (Proposition 3.4). Hence, in this paper, we develop an approach to fill this methodological gap. Note that we use the words ‘model’, ‘mechanism’, and ‘strategy’ interchangeably.

1.2. Our Contributions

In order to fill the gap in the literature, we formalize the Coordination Strategy Inference Problem, analyze theoretical properties of a strategy that guarantees coordination, propose hierarchical and non-hierarchical strategies that guarantee coordination, and propose a computational framework to infer individual-level coordination strategies from time-series data. Given a set of candidate strategies and time series of coordinated movement, our framework is capable of:

  • Inferring the latent strategies: inferring the best-fitting set of mixed or pure strategies for agents, that is, the set that yields the lowest loss value for the task of predicting the direction of movement; and

  • Movement prediction: predicting the direction of the next move of each agent when the optimal strategy is unknown, using the set of the inferred latent strategies.

We evaluate and demonstrate the performance of our framework on simulated datasets as well as real-world datasets of animal movement. On simulated data, the task is to infer the correct latent coordination model that was used to generate the simulated time series of coordinated movement. We use the baboon dataset to predict the next movement to find which strategies each baboon likely used to coordinate its movement. Lastly, in fish datasets, we show how to apply the framework to do the model selection to address a hypothesis about the original model that the fish use to achieve coordinated movement.

Coordination Strategy Inference Problem: To reach a group consensus, individuals have to coordinate with others. There are many strategies each individual can use to achieve coordination at the group level. Given time series of individual activities and a set of candidate strategies, the goal is to find the set of original strategies individuals used that lead to the group consensus.

2. Preliminaries and Definitions

We use the following notation throughout the paper:

  • is a set of agents.

  • is a set of informed agents.

  • is a state value of agent at time , where .

  • is a set of individual states at time .

  • is a state time series of agent where is a length of time series.

  • is a target path where is a target state at time .

  • is a set of strategy functions that agents use to update their current state where .

  • is a set of state time series generated by agents using some set of strategy functions .

  • is a noise-tolerance threshold.

Given a set of agents with a set of their initial states , these agents generate a set of state time series , where is the state time series of agent . For each time step , each agent updates its state via a strategy function : . However, an informed agent always has its state the same as a target path : .

2.1. Initiator of coordination

We use the definitions of coordination, following relation, and coordination initiator from (Amornbunchornvej et al., 2018). Let be a set of time series. Let denote the time series equal to that starts at time , that is , and be any similarity function over time series. We then define the similarity function of a following relation between two time series (similarity with a time shift):


We can also define the minimum time delay of a following relation (Eq. 2). If multiple time delays yield the same similarity value (similar patterns repeated many times), then we choose the minimum of these time delays to represent the time delay between two time series that share similar patterns. For example, if a pattern is repeated periodically, Eq. 2 ensures that the first iteration is chosen.

Definition 2.1 (-Following relation).

Let and be time series. If and the time delay , then is followed by , denoted by . In the case that , then is strictly followed by , denoted by .

That is, follows if is sufficiently similar to , with a time delay.
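The following relation can be illustrated with a minimal Python sketch. It is not the FLICA implementation: the similarity function here is a Pearson correlation clipped to [0, 1], chosen only for illustration (the definition allows any similarity function over time series), and the names `similarity`, `best_shift`, `following_relation`, `threshold`, and `max_delay` are hypothetical.

```python
from math import sqrt

def similarity(a, b):
    """Illustrative similarity (an assumption): Pearson correlation clipped to [0, 1]."""
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return max(0.0, cov / sqrt(var_a * var_b))

def best_shift(leader, follower, max_delay):
    """Return (best similarity, smallest delay that attains it).

    The follower is compared from time step `delay` onward against the leader.
    """
    best_sim, best_delay = -1.0, 0
    for delay in range(max_delay + 1):
        sim = similarity(leader, follower[delay:])
        # Only a strict improvement replaces the current best,
        # so ties are resolved in favor of the minimum delay.
        if sim > best_sim + 1e-12:
            best_sim, best_delay = sim, delay
    return best_sim, best_delay

def following_relation(leader, follower, threshold, max_delay):
    """Classify the relation as 'strict', 'non-strict', or 'none'."""
    sim, delay = best_shift(leader, follower, max_delay)
    if sim < threshold:
        return "none"
    return "strict" if delay > 0 else "non-strict"
```

For instance, a follower that reproduces the leader's pattern three time steps later would be classified as strictly following, provided the chosen threshold is below the similarity obtained at that delay.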

Definition 2.2 (Coordination interval).

Let be a set of time series. For any interval if , either or , then is a coordination interval.

That is, a coordination interval is the time when everybody either follows or is followed by somebody.

Definition 2.3 (Initiator).

Let be a set of time series and be a coordination interval of . For any , if , , then is an initiator of coordination interval .

The initiator is the one who is followed by everybody during coordination.

Definition 2.4 (Coordination event).

Let be a set of time series. If there exists any coordination interval in , then is a coordination event.

Definition 2.5 (Coordination strategy).

Let be a set of strategy functions that the agents use to generate a set of state time series . Each agent uses a function to update its state for each time step. is a set of coordination strategies of if is a coordination event.

Note that if all agents follow the target path , then an informed agent is an initiator of coordination.

2.2. Problem formalization

Suppose there is a set of state time series that was generated by an unknown set of latent coordination strategies w.r.t. some unknown . The only available inputs are and the entire set . The goal is to find . The real identity of the target path is unknown, but it is known that . Before formalizing the problem, we define the risk function to measure the fitness of any that might be in , for any agent :



is a loss function and returns a predicted state . Now, we are ready to formalize the Coordination Strategy Inference Problem. In the next section, we introduce a concept of convergence in multi-agent systems and the relationship between convergence and coordination strategy.
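As a rough illustration of how such a risk function can rank candidate strategies, the sketch below assumes the risk is the mean loss of one-step-ahead predictions over the observed time series. The names `empirical_risk`, `squared_error`, `keep_own`, and `copy_first`, as well as the two toy candidate strategies, are hypothetical and serve only as an example.

```python
def squared_error(predicted, actual):
    """Squared-error loss between a predicted and an observed scalar state."""
    return (predicted - actual) ** 2

def empirical_risk(strategy, agent, states, loss):
    """Mean one-step-ahead loss of `strategy` for one agent.

    states[t][i] is the state of agent i at time t;
    strategy(agent, previous_states) returns the predicted next state.
    """
    if len(states) < 2:
        raise ValueError("need at least two time steps")
    losses = [
        loss(strategy(agent, states[t - 1]), states[t][agent])
        for t in range(1, len(states))
    ]
    return sum(losses) / len(losses)

# Two toy candidate strategies (assumptions, for illustration only).
def keep_own(agent, previous_states):
    """Predict that the agent keeps its own previous state."""
    return previous_states[agent]

def copy_first(agent, previous_states):
    """Predict that the agent copies the previous state of agent 0."""
    return previous_states[0]

def best_strategy(candidates, agent, states, loss):
    """Name of the candidate with the lowest empirical risk."""
    return min(candidates, key=lambda name: empirical_risk(candidates[name], agent, states, loss))
```

With a time series in which agent 1 adopts agent 0's state one step later, `copy_first` attains zero risk for agent 1 and is selected over `keep_own`.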


3. Models and properties

3.1. Convergence and coordination strategy

For the convergence of multi-agent systems, we adopt a notion of -convergence from (Chazelle, 2011).

Definition 3.1 (-convergence).

Given , a system, which is a set of strategy functions, is said to -converge if, for , there exists a time constant such that for all , a set of agents’ states can be partitioned into disjoint subsets, where the maximum distance between any pair of agents’ states from the same subset is less than or equal to .

Footnote 1: In the work by Chazelle (Chazelle, 2011), at any time , a step in which two agents move while the distance between them remains less than is considered a trivial step. The bound is defined to ignore microscopic motions.

Definition 3.2 (-convergence of time series).

Given two time series , we say that -converges toward at time if, for all time , the distance between and is less than or equal , where .
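A minimal sketch of this convergence check is given below, assuming scalar states and the absolute difference as the distance. The function names and the parameters `eps` and `t0` are hypothetical.

```python
def converges_toward(series, target, eps, t0):
    """True if |series[t] - target[t]| <= eps for every t >= t0."""
    n = min(len(series), len(target))
    if not 0 <= t0 < n:
        raise ValueError("t0 must index an observed time step")
    return all(abs(series[t] - target[t]) <= eps for t in range(t0, n))

def first_convergence_time(series, target, eps):
    """Earliest t0 from which the series stays within eps of the target.

    Returns None if the last observed state is still farther than eps.
    """
    n = min(len(series), len(target))
    t0 = None
    # Scan backwards and stop at the first violation.
    for t in range(n - 1, -1, -1):
        if abs(series[t] - target[t]) <= eps:
            t0 = t
        else:
            break
    return t0
```

On a finite recording this can only confirm convergence within the observed window; the definition itself quantifies over all later times.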

Proposition 3.3.

Suppose , if all time series generated by a set of strategy functions -converge toward a target path , then is a set of coordination strategies, where .


Proof. Suppose all time series generated by a set of strategy functions -converge toward a target path . At the converging time , every agent’s state is within its group convex hull centered at that has the diameter at most . For some time , every pair of time series is within a distance of at most . By setting , this implies that every time series -follows time series . If we designate all agents whose state time series equal as informed agents, then, since the others follow with some time delay, we have the -coordination interval and all informed agents are initiators. ∎

Proposition 3.4.

Let be a set of pure strategy functions. If all agents use any as a pure strategy function and their state time series -converge toward a target path , then a mixed strategy function , created by a linear combination of functions in , generates a time series that -converges toward .


Proof. Suppose all functions in generate state time series that -converge toward . At the equilibrium time , when all strategies converge, any strategy in that agent uses ensures that ’s state is in the convex hull of states centered at and has a diameter at most , since a linear combination of values within a convex hull is still in a convex hull. Therefore, a mixed strategy function that is created by a linear combination of functions in generates a time series that -converges toward . ∎

3.2. Convergence models

3.2.1. Hierarchical Model Dynamic System (HM)

Figure 3. An example of communication networks between and (above). These networks are the realization of the probabilistic following network (below). The arrows represent the directed edges while the dashed lines are empty edges. When the time step increases, the informed agent can increasingly spread its state (orange node) to more follower nodes (blue nodes).

Let be an informed agent. Let a directed acyclic graph (DAG) be a graph, where is a set of agent nodes and is a set of probabilistic edges, so that if is a probability that follows , then has the weight . We call a probabilistic following network. In this model, is connected and every node has a path to a leader node . For every time step , the system generates a communication network , which is a realization of . An example of the process of generating a communication network is shown in Fig. 3.

Let be a set of agent’s initial states, be a set of neighbors of in that follows, and be a target path. At any time , the informed agent updates its state to be . For any other uninformed agent , it updates the state according to the aggregation of its neighbors’ states. Formally, we have a strategy function for this model as follows:


Agents use the above strategy function to update the state in this model. In the cooperative control literature, Eq. 4 is called a local voting protocol (Lewis et al., 2013). A system is known to converge if each communication network stays the same all the time and has a spanning tree that has a leader node as the root (Lewis et al., 2013). This is why must be connected in order to make a system converge.
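A hedged sketch of one HM-style update step follows. It assumes scalar states and that an uninformed agent moves to the plain average of its own state and the states of the agents it follows in the realized communication network; the exact form of Eq. 4 may differ. The names `hm_step`, `follow_prob`, and `rng` are hypothetical.

```python
import random

def hm_step(states, follow_prob, leader, target_state, rng):
    """One update of all agents under an HM-style local voting rule.

    follow_prob[i][j] is the probability that agent i follows agent j
    at this step (0 means there is no edge in the following network).
    """
    new_states = list(states)
    for i in range(len(states)):
        if i == leader:
            # The informed agent always takes the target state.
            new_states[i] = target_state
            continue
        # Realize the communication network: keep each edge with its probability.
        followed = [
            j for j, p in enumerate(follow_prob[i])
            if p > 0 and rng.random() < p
        ]
        if followed:
            group = [states[i]] + [states[j] for j in followed]
            new_states[i] = sum(group) / len(group)
    return new_states

# Usage: a chain 2 -> 1 -> 0 where every edge is realized with probability 1.
rng = random.Random(0)
follow_prob = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
states = hm_step([0.0, 8.0, 8.0], follow_prob, leader=0, target_state=0.0, rng=rng)
```

Under this averaging assumption, an agent that follows a single neighbor halves its distance to that neighbor each time the edge is realized, which matches the halving argument used in the convergence analysis of this model.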

Theorem 3.5.

Let be a set of agents’ initial states within Euclidean space. Given a symmetric distance function , if all agents use HM strategy (Eq. 4) to update their states, then all agents’ state time series -converge toward a target state with the expectation of the convergence time at most time steps if for all and .


Proof. In the first time step, forms a convex hull and is inside this convex hull because . For any agent , according to Eq. 4, the distance reduces by half whenever the link . Let be a random variable of the number of steps it takes until the appearance of a link such that , using trials. We can find the expectation of the time , i.e., the expected number of trials until .

From Eq. 4,

Then, by definition of the Binomial expectation,


In general, we can have an upper bound of the expectation of the convergence time as follows:


According to Theorem 3.5 and Proposition 3.3, if the target path has its target state as a fixed point: for all , then the set of strategy functions that contains only HM strategy functions is a set of coordination strategies. In other words, if all agents use to update their states, then their states converge to a target path. Therefore, a coordination interval exists in their state time series. In contrast, if a target state can change, then the group still follows the path , because only influences the group and ’s state path is . However, convergence might not occur if the difference between two consecutive time steps within the target path is always greater than the group convergence rate.

3.2.2. Local Reversible Agreement system (LRA)

Let be a set of physical points, be a set of initial states, be a target path, be an informed agent who updates its state in correspondence to , and be a projection function that agents use to update their physical points. If a state point is a velocity vector, then the projection function is simply the current position plus the velocity vector. First, for , we update the physical point . Second, we create a set of Delaunay triangulations from to create a communication network . If and form the same triangle within the physical space, then . Third, we update a state of each agent based on the structure of . An example of how to find the neighbors of each individual in LRA is in Fig. 4, which defines physical points as positions of individuals and states as movement directions. Given a triangulation membership function such that if form the same triangulation (note that ), otherwise it is zero. We have a strategy function for LRA as follows.

Figure 4. An example of physical points as positions and state points as directions. In position space (above), the individual (red node) has all gray nodes as its neighbors in LRA since they are neighbors in Delaunay triangulation. In the direction space (below), updates its next direction to be A rather than B since B is outside the ’s neighbor convex hull.

The difference between (Eq. 4) and (Eq. 5) is that infers the next state based on a fixed structure of a probabilistic following network , independently from the physical space , whereas predicts the next state based on the physical space . In other words, represents an assumption that an agent follows a fixed set of specific individuals w.r.t. the preference graph regardless of their relative physical position, while represents an assumption that an agent follows anyone who happens to be around without any preference to follow specific individuals. The next theorem shows that the Local Reversible Agreement is -convergent.
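The LRA update can be sketched as follows, under two assumptions that are not taken from the text: the Delaunay triangles over the current physical points are already available as index triples (for example, from the `simplices` attribute of `scipy.spatial.Delaunay`), and an agent moves to the plain average of its own state and its triangulation neighbors' states. The function names are hypothetical and the exact form of Eq. 5 may differ.

```python
def neighbors_from_triangles(n_agents, triangles):
    """Agents are neighbors if they share at least one triangle."""
    neighbors = [set() for _ in range(n_agents)]
    for triangle in triangles:
        for a in triangle:
            for b in triangle:
                if a != b:
                    neighbors[a].add(b)
    return neighbors

def lra_step(states, triangles, informed, target_state):
    """One LRA-style update of scalar states from a given triangulation."""
    neighbors = neighbors_from_triangles(len(states), triangles)
    new_states = []
    for i, state in enumerate(states):
        if i == informed:
            # The informed agent tracks the target path.
            new_states.append(target_state)
        else:
            group = [state] + [states[j] for j in sorted(neighbors[i])]
            new_states.append(sum(group) / len(group))
    return new_states

# Usage: four agents covered by two triangles that share the edge (1, 2).
updated = lra_step([0.0, 4.0, 8.0, 12.0], [(0, 1, 2), (1, 2, 3)], informed=0, target_state=0.0)
```

Unlike the HM sketch, the neighbor sets here depend only on who is physically adjacent at the current step, so they need to be recomputed from the positions at every time step.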

Theorem 3.6 (Chazelle 2011(Chazelle, 2011)).

For any , an -agent reversible agreement system -converges in time , where is the time-independent agreement parameter corresponding to the system.

Footnote 2: In a bidirectional agreement system, which is a general model of a reversible agreement system, the condition is a necessary condition for the system to converge (Chazelle, 2011).

According to the work by Chazelle (Chazelle, 2011), LRA still converges even if one of the agents does not update. In our case, if is the same for every time step, then the fixed agent is who always has .

Corollary 3.7.

The -agent LRA whose is created from Delaunay triangulation sets converges to a single point.


Proof. The graph that is built from a Delaunay triangulation is always connected. At each time step, each agent moves toward the center of its neighbors’ convex hull. Since everyone is connected and the system -converges, by transitivity, the entire group converges to a single point. ∎

In fact, if the fixed point is , then, at the equilibrium point, all states form a convex hull around with the diameter at most  (Chazelle, 2011). In contrast, if is not always the same, then the group moves following with some time delay.

Corollary 3.7 tells us that if we follow our physical neighbors (e.g. their directions) and everyone does the same thing, the entire group will reach the same consensus (moving in the same direction). In general, if is strongly connected, everyone follows neighbors in , and there is one individual who never follows anyone, then the group converges to ’s state. Additionally, Corollary 3.7 is always true in any metric space where a Delaunay triangulation exists.

According to Corollary 3.7 and Proposition 3.3, if a target state never changes: for all , then the set of strategy functions that contains only LRA strategy functions is a set of coordination strategies.

3.2.3. Discussion

According to Theorem 3.5, Corollary 3.7, Proposition 3.3, and Proposition 3.4, if the data has coordination behaviors, then either HM, LRA, or a mix of those strategies may be the cause of the coordination. However, the question still remains regarding how to infer which strategy is the cause of the coordination. In the next section, we propose a solution to address this question.

4. Method

We are now ready to formally state our approach of inferring movement coordination strategies of agents represented by a collection of time series.

4.1. Setting

Figure 5. An example of movement strategy inference for . Given the information on positions and directions of individuals in the past (blue and green nodes), we want to infer the ’s strategy of movement that can be whether that ’s next direction follows its neighbors (A node), or follows specific individuals (B node), or neither (C node).

We define a movement direction as a state, but the approach generalizes to arbitrary definitions of states that are defined on Euclidean space. Hence, is a set of time series of direction. We use direction, rather than position, to define the state of an individual and the proxy for collective coordination. The main reason is that directional coordination is common in biology. For example, in (Katz et al., 2011), the authors report that a fish tends to imitate the direction of neighbors ahead to form collective movement, and other examples abound. Secondly, synchronization to the same direction implies a collective movement, while synchronization to the same position implies staying in the same position without movement. In this paper, we focus on coordination of movement; therefore, we cannot use positions as states to infer strategies of movement. The final reason for defining states as directions is to use a dimension independent of the positions, which we use to define states of individual strategies. We need to differentiate between the strategy in which an individual follows specific individuals’ direction regardless of their physical neighbors’ choices of direction and the strategy in which an individual follows their physical neighbors’ direction without any preference to follow specific individuals.

We assume the following are given as inputs: a set of possible strategy functions , a collection of position-time-series sets , and a collection of direction-time-series sets , where and . The data record of th coordination event consists of a pair of that were generated by agents moving in two-dimensional position space to form directional coordination; all agents move in the same direction in a coordinated manner in this interval. Each contains a coordination interval. The goal is to infer the set of strategy functions that generated . The framework overview is in Fig. 2.

For simplicity of the exposition, we deploy three strategy functions for our framework: HM, LRA, and an autoregressive model (AR). Again, other candidate strategies are admissible. However, these three strategies are canonical exemplars since they make it possible to determine whether the strategy function that generated a time series of directions of each agent is more hierarchical (HM), more dependent on physically proximate neighbors (LRA), or just a simple function of the agent’s past history, independent of its neighbors (AR). We separate and into a training part, , to perform model fitting, and a validation part, , to perform model selection. In the case that the input is only a single physical time series , we use the FLICA framework (Amornbunchornvej et al., 2018) to find coordination events and treat each event as a single . Hence, we have containing multiple coordination events from . Then, we create a set of direction-time-series sets from . An example of movement strategy inference is in Fig. 5.

4.2. Model fitting


We concatenate all time series in to be a single time series and also concatenate to be . Then we use to perform model fitting.

Before proceeding with the model fitting, the HM strategy function requires a probabilistic following network . We infer from by using FLICA (Amornbunchornvej et al., 2018) to create a dynamic following network of . In this paper, the time window threshold of FLICA has been set at time steps. In the next step, we find a global-leadership ranking, then we aggregate and normalize this dynamic network to be a DAG probabilistic network, such that the high-rank agents do not have a probabilistic following edge to low-rank agents in . After we have , we calculate as follows:


where is a probabilistic weight of edge . For the LRA strategy function, we use the same function as in Eq. 5. Lastly, we fit an autoregressive model on to represent . The predicts the next state of the agent w.r.t. the average of the states from previous steps in . In this paper, we set . As mentioned before, we focus on three strategy functions: (Eq. 6), (Eq. 5), and . We can view them as a mixed strategy, given a weight vector .


Here , is a support of HM, is a support of LRA, is a support of an auto regressive model, and . We use the sum square error (SSE) as our loss function. Our main goal is to find that minimizes below:


where is a difference between predicted and actual direction in which agent moved at time . For each agent , given and a threshold vector , we can find the optimal support vector as the optimization problem:

subject to

We use the interior-point algorithm (Byrd et al., 1999), which is a large-scale algorithm, to solve Problem 9, which can be considered a constrained linear least-squares problem. A threshold represents a model bias toward specific strategies. For example, if we have prior information that, with high probability, an agent uses the LRA strategy function, then we can set to force the optimizer to vary the support within interval instead of the interval. The benefit of having is to prevent overfitting. For any agent , suppose is the optimal solution of optimization problem 9 w.r.t. , then we call a model. The pseudocode of the model fitting is given in the ModelFittingFunction algorithm.
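To make the fitting step concrete, the sketch below searches for the three support weights that minimize the sum of squared errors for one agent. It deliberately swaps the interior-point solver for a coarse grid search over the simplex, which is easy to verify but far less efficient, and it assumes that the threshold vector acts as a lower bound on each weight. The names `fit_support`, `lower`, and `step` are hypothetical.

```python
def fit_support(pred_hm, pred_lra, pred_ar, actual, lower=(0.0, 0.0, 0.0), step=0.05):
    """Grid search for (w_hm, w_lra, w_ar) minimizing the SSE.

    The weights are non-negative, sum to one, and respect the lower bounds.
    Returns (best_weights, best_sse); best_weights is None if no grid point
    satisfies the bounds.
    """
    n = round(1 / step)
    best_w, best_sse = None, float("inf")
    for a in range(n + 1):
        for b in range(n + 1 - a):
            c = n - a - b
            w = (a / n, b / n, c / n)
            # Skip weight vectors that violate a lower bound.
            if any(wi < lo - 1e-12 for wi, lo in zip(w, lower)):
                continue
            sse = sum(
                (w[0] * h + w[1] * l + w[2] * r - y) ** 2
                for h, l, r, y in zip(pred_hm, pred_lra, pred_ar, actual)
            )
            if sse < best_sse:
                best_w, best_sse = w, sse
    return best_w, best_sse

# Usage: the observed directions coincide with the LRA predictions.
hm, lra, ar = [1, 2, 3, 4], [2, 0, 1, 5], [0, 1, 0, 1]
weights, sse = fit_support(hm, lra, ar, actual=lra)
```

Raising a lower bound, for example `lower=(0.5, 0.0, 0.0)`, restricts the search to weight vectors that give HM at least half of the support, which mimics the role of a bias toward a specific strategy.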

4.3. Model selection


First, we vary and find a model for each agent from . As the result, we have a set of models that is now used to perform model selection for an agent . We concatenate all time series in to be a single time series and also concatenate to be . Finally, for each agent , we find the optimal support vector using the equation below:


After we get the support vector , if is the highest support in , then we say that agent uses the HM strategy function to coordinate with its group. If has the highest support, then we say that follows its physical neighbors to coordinate with the group. If has the highest support, then just follows its own linear path independently, and if ’s path is the target path, then is an informed agent. Lastly, if at least two of , , show significantly high weights, then we conclude that uses a mixed strategy. The pseudocode of the model selection is given in the ModelSelectionFunction algorithm.
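The interpretation rule above can be sketched as a small labeling function. The cutoff `mixed_level` used to decide when a weight is "significantly high" is a hypothetical parameter, since no numeric value is stated here, and the function name is likewise an assumption.

```python
def label_strategy(support, mixed_level=0.3):
    """Map a support vector (w_hm, w_lra, w_ar) to a strategy label.

    `mixed_level` is an assumed cutoff for a weight to count as significantly high.
    """
    names = ("HM", "LRA", "AR")
    strong = [name for name, w in zip(names, support) if w >= mixed_level]
    if len(strong) >= 2:
        return "mixed(" + "+".join(strong) + ")"
    # Otherwise report the strategy with the highest support.
    return names[max(range(3), key=lambda k: support[k])]
```

With the assumed cutoff, a support vector such as (0.45, 0.5, 0.05) is labeled as a mix of HM and LRA, whereas (0.8, 0.1, 0.1) is labeled HM.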

5. Experimental setup

We test our approach both on simulated and on biological data.

5.1. Simulations

We generated a set of time series of 2-dimensional positions using four different sets of strategy functions, described in Sections 5.1.1 to 5.1.4. We define a set of state time series as a set of time series of directional degrees of , where is a time series of positions of an agent ; is a time series of directional degrees of an agent derived from a position time series ; and is the angle in degrees between a direction vector and the -axis direction vector . Note that we need to be careful also of the distance between any and since and have a degree difference of only degree but certainly have different implications for coordination.


Where . We have only as an input for our framework since we can create from . In all simulated datasets, there are 20 agents and ID(1) is the informed agent. ID(1) creates the target path by uniformly and randomly choosing a fixed direction as the initial state, then continuing to move in the direction of until the end of coordination. For full details, please see the supplementary.
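Deriving directional degrees from positions can be sketched as follows. The sketch assumes that a direction is the angle, measured from the positive x-axis, of the displacement between consecutive positions, mapped into [0, 360). The helper `angular_difference` is an added illustration of one way to compare two angles across the wrap-around at 360 degrees; all function names are hypothetical.

```python
from math import atan2, degrees

def heading_degrees(p_prev, p_next):
    """Angle in [0, 360) of the displacement from p_prev to p_next.

    A zero displacement yields 0.0, because atan2(0, 0) is 0.
    """
    dx = p_next[0] - p_prev[0]
    dy = p_next[1] - p_prev[1]
    return degrees(atan2(dy, dx)) % 360.0

def directions(positions):
    """Direction time series derived from a 2-D position time series."""
    return [heading_degrees(a, b) for a, b in zip(positions, positions[1:])]

def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

# Usage: a path that moves right, then up, then left.
path = [(0, 0), (1, 0), (1, 1), (0, 1)]
headings = directions(path)
```

A direction time series has one element fewer than the position time series it is derived from, since each direction needs two consecutive positions.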

5.1.1. Hierarchical Model Dynamic System

In this system, we used a set of strategy functions to generate where all is (Eq. 4). The parameter in this model is the following probability . We set the probability weight of all edges in a probabilistic following network equal to . The communication network generated by is used to update the directional state by the strategy function . All 19 agents always follow only ID(1) with the probability . In other words, all nodes have edges to ID(1) with the weight in . Each coordination event lasts 400 time steps. So, s.t. . We vary . For each , we generated 100 coordination events. In total, we have 400 datasets.

5.1.2. Local Reversible agreement system

We created another 100 datasets for the LRA system. We used a set of strategy functions in which every agent's strategy function is the LRA strategy function (Eq. 5). Each dataset contains a set of position time series from 20 agents. All agents update their states according to their local neighbors' states using the LRA strategy function.

5.1.3. Hierarchical and Local Reversible agreement system

We created another 100 datasets of HM & LRA coordination events. We use this simulation to represent a group that still has a coordination interval even though some agents use the HM strategy function while others use the LRA strategy function. Each dataset contains a set of position time series from 20 agents. ID(1) is the informed agent. Agents ID(2-10) use the HM strategy function, and the remaining agents, ID(11-20), use the LRA strategy function.

5.1.4. Mixed strategy system

Lastly, we created another 100 datasets of mixed-strategy coordination events. Each dataset contains a set of position time series from 20 agents. ID(1) is the informed agent. Every other agent updates its state using either the HM strategy function or the LRA strategy function, each with probability 0.5.

5.1.5. Evaluation

In this section, we evaluate the task of inferring the latent strategies given that we know the set of possible strategies. For each model, we performed 10-fold cross validation to evaluate the performance. In each round of cross validation, the 100 datasets are separated into 45 training datasets, 45 validation datasets, and 10 testing datasets. We concatenated all time series in the training datasets into a single time series, and likewise concatenated the validation datasets. Then we use the testing datasets to evaluate the direction prediction performance. We compare four strategy functions: HM, LRA, AR, and OPT, which is our framework's optimal strategy function derived from Eq. 7 and 10. We use the risk function that has Eq. 11 as a loss function to evaluate the model performance.
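The 45/45/10 split described above can be sketched as follows. This is one plausible reading, not the authors' procedure: each round holds out a different block of 10 datasets for testing and halves the remaining 90 into training and validation. The function name, the shuffling, and the seed are assumptions.

```python
import random


def cross_validation_rounds(n_datasets=100, n_folds=10, seed=0):
    """Return a list of rounds, each with train/validation/test dataset ids."""
    rng = random.Random(seed)
    ids = list(range(n_datasets))
    rng.shuffle(ids)  # assumed: datasets are assigned to folds at random
    fold_size = n_datasets // n_folds
    rounds = []
    for k in range(n_folds):
        # Hold out the k-th block for testing.
        test = ids[k * fold_size:(k + 1) * fold_size]
        rest = ids[:k * fold_size] + ids[(k + 1) * fold_size:]
        # Split the remaining datasets evenly into training and validation.
        half = len(rest) // 2
        rounds.append({
            "train": rest[:half],
            "validation": rest[half:],
            "test": test,
        })
    return rounds


if __name__ == "__main__":
    for r in cross_validation_rounds():
        print(len(r["train"]), len(r["validation"]), len(r["test"]))
```

With 100 datasets and 10 folds, every dataset appears in exactly one testing block across the rounds.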


For each agent, the best fitting model is the model that minimizes the risk function in Eq. 12.


For each strategy function, we report the distribution of the direction prediction loss values of all agents at each time step, as well as the group's average optimal support from Eq. 10. If the framework performs well, then it should give the highest support to the model that generated the dataset.
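To make the selection step concrete, here is a minimal sketch of choosing the best fitting model by minimum risk. Because Eq. 11 and Eq. 12 are not reproduced in this text, the loss is assumed to be the smallest angular difference in degrees and the risk its mean over time steps; all function names are hypothetical.

```python
def angular_loss(predicted, actual):
    """Assumed loss: smallest angular difference in degrees, in [0, 180]."""
    diff = abs(predicted - actual) % 360.0
    return min(diff, 360.0 - diff)


def risk(predictions, actuals):
    """Assumed risk: mean loss over all time steps."""
    losses = [angular_loss(p, a) for p, a in zip(predictions, actuals)]
    return sum(losses) / len(losses)


def best_model(predictions_by_model, actuals):
    """Return the name of the model with the lowest risk for one agent."""
    return min(
        predictions_by_model,
        key=lambda name: risk(predictions_by_model[name], actuals),
    )


if __name__ == "__main__":
    actual = [10.0, 20.0, 350.0]
    candidates = {
        "HM": [12.0, 18.0, 352.0],
        "LRA": [40.0, 50.0, 20.0],
    }
    print(best_model(candidates, actual))
```

The headings in the usage example are made-up values chosen only to exercise the functions.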

5.2. Baboon behavioral experiment

The dataset consists of GPS collar recordings of a troop of olive baboons (Papio anubis) in the wild at Mpala Research Centre, Kenya (Strandburg-Peshkin et al., 2015). The GPS positions were recorded at 1 Hz from 7am until 7pm. The dataset contains 16 individuals whose GPS trackers remained functional for 10 days. The 2-dimensional trajectory of latitude and longitude for each individual has a length of 419,095 time steps. We extracted coordination events with FLICA, varying the network density threshold at the 25th, 50th, 75th, and 99th percentiles, with a time window of 240 time steps to infer coordination events and a separate time window to infer a dynamic following network. We used 10-fold cross validation to report the results. Each round of cross validation uses 45% of the coordination events for training, 45% for validation, and 10% for testing. The remainder of the evaluation follows the description in Section 5.1.5. We use this experiment to demonstrate the ability of our framework to predict the next movement direction of agents even when the optimal strategy is unknown. The result can be used to generate (and test) hypotheses about the latent coordination strategies in collective movement data.
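The percentile thresholds mentioned above can be illustrated with a short sketch. It assumes the thresholds are percentiles of a per-time-step network density series, which the text does not state explicitly; the function name, the synthetic density values, and the choice of the inclusive interpolation method are all assumptions, and this is not FLICA's implementation.

```python
import statistics


def density_thresholds(densities, percentiles=(25, 50, 75, 99)):
    """Return the requested percentiles of a network density series."""
    # quantiles(..., n=100) yields 99 cut points; cut point p-1 is the
    # p-th percentile.
    cuts = statistics.quantiles(densities, n=100, method="inclusive")
    return [cuts[p - 1] for p in percentiles]


if __name__ == "__main__":
    # Hypothetical density series: 101 evenly spaced values in [0, 1].
    series = [k / 100 for k in range(101)]
    print(density_thresholds(series))
```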

5.3. Fish behavioral experiment

We used the time series of golden shiner (Notemigonus crysoleucas) fish positions from (Strandburg-Peshkin et al., 2013). The dataset was initially created to study information propagation via the fish visual fields (Strandburg-Peshkin et al., 2013). In total, there were 24 trials of fish position time series in 2-dimensional space. Each trial consists of 70 fish, 10 of which are trained fish that are considered to be informed agents in our setting. On average, the time series in a trial are around 600 time steps long. The trained fish moved toward the feeding site (the target path) and the group followed them. Because individuals cannot be identified across different trials, we cannot train our framework on this dataset. Hence, we use the fish data to demonstrate how to apply our framework to compare the direction prediction performance of each candidate strategy.

We compared the Informed strategy function against the LRA strategy function in Eq. 5. At each time step, the Informed strategy function updates the state of any agent from the average of the states of the trained fish. We use the risk function in Eq. 12 to compare the performance of these strategy functions. For each strategy function, we report the distribution of all agents' direction prediction errors at each time step.
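One way to average directional states is the circular mean, sketched below. The text only says "average", so using a circular mean (rather than an arithmetic mean of degree values) is an assumption made here to handle wrap-around; the function names and the example headings are hypothetical.

```python
import math


def circular_mean_degrees(headings):
    """Circular mean of headings given in degrees, returned in [0, 360)."""
    s = sum(math.sin(math.radians(h)) for h in headings)
    c = sum(math.cos(math.radians(h)) for h in headings)
    return math.degrees(math.atan2(s, c)) % 360.0


def informed_prediction(headings_by_fish, trained_ids):
    """Predict a follower's next heading as the mean heading of trained fish."""
    return circular_mean_degrees([headings_by_fish[i] for i in trained_ids])


if __name__ == "__main__":
    current = {1: 80.0, 2: 100.0, 3: 200.0}
    print(informed_prediction(current, trained_ids=[1, 2]))
```

An arithmetic mean of 350 and 10 degrees would give 180, the opposite direction, whereas the circular mean stays near 0.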

5.4. Comparison with the state of the art method

Our method is the first approach to infer individual-level strategies that lead to group-level coordination. Thus, we compare our framework with the state-of-the-art method, FLICA (Amornbunchornvej et al., 2018), on the task of leadership model classification. Since FLICA cannot infer individual-level strategies, we evaluate both frameworks on the group-level classification task. We use the simulated datasets from Section 5.1. Each set of time series is labeled with one of the four models: HM, LRA, HM & LRA, and the mixed strategy model. FLICA maps each set of time series to leadership ranking and convex hull features. In our framework, we use the median of the optimal support vectors (Eq. 10) as the feature vector of each dataset. We use 10-fold cross validation with Random Forests (Ho, 1998) to report the evaluation results for both frameworks.

6. Results

6.1. Simulations

Average degree prediction error
Datasets\Strategies OPT HM LRA AR
HM 12.40 12.98 20.49 30.21*
LRA 7.77* 16.93* 7.76* 13.78*
HM & LRA 4.42 13.39* 13.59 23.87*
Mixed Str. 29.33* 30.53* 31.69* 46.28*
Random 89.74* 90.11* 89.70* 90.21*
Baboon 53.16* 53.16* 72.36* 85.84*
Table 1. The result of predicting the direction of movement via 10-fold cross validation. We compared the result of our framework (OPT) against the base-line pure strategies: HM, LRA, and AR (auto regressive strategy). (*indicates the STD )

The results of inferring the coordination strategy in simulated datasets are shown in Table 1. Each row represents the results from datasets generated by a specific model. Each column represents the prediction error of a strategy, measured in degrees. OPT is the optimal strategy function trained by our framework. HM is Eq. 6. LRA is Eq. 5. AR is the auto-regressive strategy function that chooses the current direction based on the previous five time steps of the same agent. We use AR as the baseline. Across the non-random datasets, our framework (OPT column) has the smallest error or matches the best pure strategy to within 0.01 degrees. For the first two rows, the HM and LRA datasets, OPT has almost the same performance as the strategies used to generate the data (HM row/column and LRA row/column). For the HM & LRA datasets in the third row, each individual uses either the HM or the LRA strategy. Hence, using a homogeneous strategy to predict directions for all agents results in a larger error (HM and LRA columns). In contrast, our framework can detect which individual uses which strategy. Hence, OPT performed better than all pure strategies. Similarly, for the mixed strategy datasets (Mixed Str. row), each individual uses either HM or LRA as its strategy with probability 0.5. Since our framework can infer mixed strategies, it performed better than any pure strategy. Lastly, we report the direction prediction results for 100 datasets of time series generated by agents moving uniformly at random in any direction (Random row). All strategies included in our framework produced the same poor result, with a loss value of around 90 degrees. This shows that our framework does not find an artifact model where none exists.

Average Support (predict/actual)
Datasets HM LRA AR
HM 0.85/1.00 0.12/0.00 0.03/0.00
LRA 0.02/0.00 0.98/1.00 0.00/0.00
HM & LRA (HM part) 1.00/1.00 0.00/0.00 0.00/0.00
HM & LRA (LRA part) 0.00/0.00 1.00/1.00 0.00/0.00
Mixed Strategy 0.48/0.50 0.48/0.50 0.04/0.00
Random 0.09/0.00 0.86/0.00 0.05/0.00
Baboon 1.00/NA 0.00/NA 0.00/NA
Table 2. The average optimal support vector of all agents from 10-fold cross validation, inferred by our framework from simulated and the Baboon datasets.

Table 2 shows the support vectors for each strategy corresponding to the OPT column of Table 1. For each element in the table, the first number is the support predicted by our framework and the second is the actual support that we used to create the datasets. For example, in the first element of the HM row, 0.85/1.00 means that we used the HM strategy to create the HM datasets and the framework inferred an HM support of 0.85 for these datasets. Overall, our framework inferred support vectors close to the actual ones for all non-random simulated datasets, while avoiding overfitting.

6.2. Baboon behavioral experiment

We varied the threshold of the following network density to infer coordination events in the baboon dataset. We report the average result over all the thresholds. The last row of Table 1 shows the direction prediction results for baboons under the different coordination strategies. The OPT coordination strategy, as derived by our framework, is in the last row of Table 2. According to the result, OPT used HM as a pure strategy. The errors of the HM and OPT strategies suggest that baboons may have a slight preference to follow a predetermined individual or set of individuals, rather than their neighbors in the position space. This is consistent with the biological understanding of baboon social behavior (Farine et al., 2016). However, a more accurate strategy should be investigated and biologically verified.

6.3. Fish behavioral experiment

Error of degree prediction [degrees]
Strategies Mean STD
LRA 41.51 45.11
Informed Strategy 54.46 47.68
Table 3. Comparison between LRA and Informed strategies to predict directions of 24 trials of fish (see footnote 4).

Footnote 4: The fish datasets have their own table for the following reason. To use 10-fold cross validation, we have to be able to learn each individual's strategy from one set of coordination events (training datasets) in order to predict the strategy of the same individual in another set of coordination events (validation datasets). The fish datasets contain 24 fish coordination events but lack individual identities: two individuals with the same ID in two different fish coordination events might not be the same individual. In contrast, two individuals with the same ID in two different coordination events are always the same individual in both the simulation and baboon datasets. Hence, we cannot use the 10-fold cross validation procedure on the fish datasets in the same way as we did on the baboon and simulation datasets.

The results of the direction prediction in the fish datasets, for the LRA and Informed strategies, are in Table 3. LRA performed better than the Informed strategy, indicating that fish follow their immediate neighbors in space. This result is supported by the work in (Strandburg-Peshkin et al., 2013; Katz et al., 2011) and many others, showing that fish do not directly know who leads the group but follow their neighbors.

6.4. Comparison with the state of the art method

FLICA Proposed Method
Classes Prec. Rec. F1 Prec. Rec. F1
HM 1 0.75 0.86 1 1 1
LRA 0.8 1 0.89 1 1 1
HM & LRA 0.94 1 0.97 0.98 1 0.99
Mixed Str. 0.90 0.94 0.92 1 0.98 0.99
Random 1 0.9 0.95 1 1 1
Table 4. The results of model classification of FLICA and the proposed framework via 10-fold cross validation. We use Random Forest for classification.

The results of model classification using FLICA and the proposed framework are in Table 4. In all classes, the proposed framework achieved a higher F1 score than FLICA. This indicates that the group-level features that FLICA provides for classification are not sufficiently informative to categorize complicated datasets where individuals may use a heterogeneous set of strategies (e.g. HM & LRA).
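As a quick arithmetic check, the F1 column of Table 4 can be recomputed from the precision and recall columns using the standard harmonic-mean definition. The helper name `f1_score` is hypothetical; the input values are copied from the FLICA half of the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (precision, recall) pairs for FLICA from Table 4.
    flica = {
        "HM": (1.0, 0.75),
        "LRA": (0.8, 1.0),
        "HM & LRA": (0.94, 1.0),
        "Mixed Str.": (0.90, 0.94),
        "Random": (1.0, 0.9),
    }
    for name, (p, r) in flica.items():
        print(name, round(f1_score(p, r), 2))
```

The rounded values agree with the F1 scores reported for FLICA in the table.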

For the baboon and fish datasets, since there is no ground truth available regarding classes of strategies, we can only discuss the results here. The FLICA classification result in (Amornbunchornvej et al., 2018) stated that baboons used a linear threshold model to form coordination; there is no association between the order of individuals' movement velocity and position and their ranking in movement initiation. In other words, initiators do not necessarily move first or at the front of the group. In this work, Table 1 suggests that there is a hierarchy among baboons; baboons tend to follow the directions of specific individuals. This result is consistent with the result in (Amornbunchornvej and Berger-Wolf, 2019), which analyzed the same baboon dataset and showed that there are several pairs of baboons that follow each other with high support in various situations. For the fish datasets, the result of the FLICA framework (Amornbunchornvej et al., 2018) suggests that trained fish truly initiated coordinated movement. In this work, Table 3 suggests that schools of fish used the LRA strategy; individuals in a school of fish do not follow trained fish directly, but follow their neighbors.

7. Conclusions

In this paper, we formalized a new computational problem, the Coordination Strategy Inference Problem. Given a set of candidate strategies and a set of time series of coordinated movement as inputs, our goal is to infer the original strategy that each individual used to achieve group coordination. We showed that a strategy that has the convergence property can guarantee that the group reaches coordination. We provide the first approach to infer the set of strategies that each individual uses to achieve movement coordination at the group level. We evaluated and demonstrated the performance of our framework on simulated datasets as well as two biological datasets: baboon and fish. Our framework was able to infer the original set of strategy functions that generated each simulated dataset. The results show that our approach is highly accurate in inferring the correct strategy in simulated datasets, even in complicated mixed strategy settings. Moreover, our framework classified group-level coordination models from time series better than the FLICA framework, which is the state-of-the-art approach for the task. Animal data experiments show that fish, unsurprisingly, follow their neighbors, while baboons have a preference to follow specific individuals. Although we used the specific setting of focusing on the direction of movement as the definition of an agent's state and used three exemplar candidate strategies, our methodology easily generalizes to arbitrary time series data on Euclidean space, beyond movement data, and to other candidate strategies. While for the fairness of comparison with the biological datasets we used simulated data of 20 individuals, we see no inherent limitation that prevents the approach from scaling to much larger datasets. The only barrier is the availability of data. The code and datasets that we used in this paper are available from (Sha, [n. d.]).


  • Sha ([n. d.]) [n. d.]. Coordination Model Selection Framework code and data. ([n. d.]). Accessed: 2018-02-06.
  • Amornbunchornvej and Berger-Wolf (2018) Chainarong Amornbunchornvej and Tanya Berger-Wolf. 2018. Framework for inferring leadership dynamics of complex movement from time series. In Proceedings of the 2018 SIAM International Conference on Data Mining. SIAM, 549–557.
  • Amornbunchornvej and Berger-Wolf (2019) Chainarong Amornbunchornvej and Tanya Y. Berger-Wolf. 2019. Mining and modeling complex leadership–followership dynamics of movement data. Social Network Analysis and Mining 9, 1 (03 Oct 2019), 58.
  • Amornbunchornvej et al. (2018) Chainarong Amornbunchornvej, Ivan Brugere, Ariana Strandburg-Peshkin, Damien Farine, Margaret C Crofoot, and Tanya Y Berger-Wolf. 2018. Coordination Event Detection and Initiator Identification in Time Series Data. ACM Trans. Knowl. Discov. Data 12, 5, Article 53 (6 2018), 33 pages.
  • Anderson and Ye (2019) Brian D. O. Anderson and Mengbin Ye. 2019. Recent Advances in the Modelling and Analysis of Opinion Dynamics on Influence Networks. International Journal of Automation and Computing 16, 2 (01 Apr 2019), 129–149.
  • Andersson et al. (2008) Mattias Andersson, Joachim Gudmundsson, Patrick Laube, and Thomas Wolle. 2008. Reporting leaders and followers among trajectories of moving point objects. GeoInformatica 12, 4 (2008), 497–528.
  • Byrd et al. (1999) Richard H. Byrd, Mary E. Hribar, and Jorge Nocedal. 1999. An Interior Point Algorithm for Large-Scale Nonlinear Programming. SIAM Journal on Optimization 9, 4 (1999), 877–900.
  • Cao et al. (2013) Y. Cao, W. Yu, W. Ren, and G. Chen. 2013. An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination. IEEE Transactions on Industrial Informatics 9, 1 (Feb 2013), 427–438.
  • Chazelle (2011) Bernard Chazelle. 2011. The Total s-Energy of a Multiagent System. SIAM Journal on Control and Optimization 49, 4 (2011), 1680–1706.
  • Crofoot et al. (2015) Margaret C Crofoot, Roland W Kays, and Martin Wikelski. 2015. Data from: Shared decision-making drives collective movement in wild baboons. (2015).
  • Dyer et al. (2009) John RG Dyer, Anders Johansson, Dirk Helbing, Iain D Couzin, and Jens Krause. 2009. Leadership, consensus decision making and collective behaviour in humans. Philosophical Transactions of the Royal Society of London B: Biological Sciences 364, 1518 (2009), 781–789.
  • Etesami (2019) S Rasoul Etesami. 2019. A Simple Framework for Stability Analysis of State-Dependent Networks of Heterogeneous Agents. SIAM Journal on Control and Optimization 57, 3 (2019), 1757–1782.
  • Farine et al. (2016) Damien R Farine, Ariana Strandburg-Peshkin, Tanya Berger-Wolf, Brian Ziebart, Ivan Brugere, Jia Li, and Margaret C Crofoot. 2016. Both nearest neighbours and long-term affiliates predict individual locations during collective movement in wild baboons. Scientific reports 6 (2016), 27704.
  • Gautrais et al. (2012) Jacques Gautrais, Francesco Ginelli, Richard Fournier, Stéphane Blanco, Marc Soria, Hugues Chaté, and Guy Theraulaz. 2012. Deciphering Interactions in Moving Animal Groups. PLOS Computational Biology 8, 9 (09 2012), 1–11.
  • Goyal et al. (2010) Amit Goyal, Francesco Bonchi, and Laks VS Lakshmanan. 2010. Learning influence probabilities in social networks. In Proceedings of the third ACM international conference on Web search and data mining. ACM, 241–250.
  • He and Kempe (2016) Xinran He and David Kempe. 2016. Robust Influence Maximization. In Proceedings of the ninth ACM SIGKDD. ACM, 1–10.
  • Herbert-Read et al. (2011) James E. Herbert-Read, Andrea Perna, Richard P. Mann, Timothy M. Schaerf, David J. T. Sumpter, and Ashley J. W. Ward. 2011. Inferring the rules of interaction of shoaling fish. Proceedings of the National Academy of Sciences 108, 46 (2011), 18726–18731.
  • Ho (1998) Tin Kam Ho. 1998. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 8 (1998), 832–844.
  • Hrncir et al. (2019) Michael Hrncir, Camila Maia-Silva, and Walter M. Farina. 2019. Honey bee workers generate low-frequency vibrations that are reliable indicators of their activity level. Journal of Comparative Physiology A 205, 1 (01 Feb 2019), 79–86.
  • Katz et al. (2011) Yael Katz, Kolbjørn Tunstrøm, Christos C. Ioannou, Cristián Huepe, and Iain D. Couzin. 2011. Inferring the structure and dynamics of interactions in schooling fish. Proc. of the National Academy of Sciences 108, 46 (2011), 18720–18725.
  • Kempe et al. (2003) David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD. ACM, 137–146.
  • Kjargaard et al. (2013) Mikkel Baun Kjargaard, Henrik Blunck, Markus Wustenberg, Kaj Gronbask, Martin Wirz, Daniel Roggen, and Gerhard Troster. 2013. Time-lag method for detecting following and leadership behavior of pedestrians from mobile sensing data. In Proceedings of the IEEE PerCom. IEEE, 56–64.
  • Krause et al. (2000) J Krause, D Hoare, S Krause, CK Hemelrijk, and DI Rubenstein. 2000. Leadership in fish shoals. Fish and Fisheries 1, 1 (2000), 82–89.
  • Langrock et al. (2014) Roland Langrock, J. Grant C. Hopcraft, Paul G. Blackwell, Victoria Goodall, Ruth King, Mu Niu, Toby A. Patterson, Martin W. Pedersen, Anna Skarin, and Robert S. Schick. 2014. Modelling group dynamic animal movement. Methods in Ecology and Evolution 5, 2 (2014), 190–199.
  • Lewis et al. (2013) Frank L Lewis, Hongwei Zhang, Kristian Hengster-Movric, and Abhijit Das. 2013. Cooperative control of multi-agent systems: optimal and adaptive design approaches. Springer Science & Business Media.
  • Malone and Crowston (1994) Thomas W Malone and Kevin Crowston. 1994. The interdisciplinary study of coordination. ACM Computing Surveys (CSUR) 26, 1 (1994), 87–119.
  • Mann et al. (2013) Richard P. Mann, Andrea Perna, Daniel Strömbom, Roman Garnett, James E. Herbert-Read, David J. T. Sumpter, and Ashley J. W. Ward. 2013. Multi-scale Inference of Interaction Rules in Animal Groups Using Bayesian Model Selection. PLOS Computational Biology 9, 3 (03 2013), 1–13.
  • Proskurnikov and Tempo (2018) Anton V. Proskurnikov and Roberto Tempo. 2018. A tutorial on modeling and analysis of dynamic social networks. Part II. Annual Reviews in Control 45 (2018), 166–190.
  • Ray et al. (2019) Subash K Ray, Gabriele Valentini, Purva Shah, Abid Haque, Chris R Reid, Gregory F Weber, and Simon Garnier. 2019. Information transfer during food choice in the slime mold Physarum polycephalum. Frontiers in Ecology and Evolution 7 (2019), 67.
  • Strandburg-Peshkin et al. (2013) A. Strandburg-Peshkin et al. 2013. Visual sensory networks and effective information transfer in animal groups. Current Biology 23, 17 (2013), R709–R711.
  • Strandburg-Peshkin et al. (2015) Ariana Strandburg-Peshkin, Damien R Farine, Iain D Couzin, and Margaret C Crofoot. 2015. Shared decision-making drives collective movement in wild baboons. Science 348, 6241 (2015), 1358–1361.
  • Valentini (2019) Gabriele Valentini. 2019. How robots in a large group make decisions as a whole? From biological inspiration to the design of distributed algorithms. arXiv preprint arXiv:1910.11262 (2019).