Recently, there has been a great deal of research on mining frequent behaviors from trajectory data. A trajectory is an N dimensional path that does not necessarily correspond to a physical object, but may correspond to the time evolution of an N dimensional feature vector. In this paper, we propose a novel approach, inspired by the concept of motion patterns, for extracting dominant trajectory behaviors; while our method is described for the case of 2 dimensional trajectories, it can be extended to higher dimensions for further applications. Seeking to find common behaviors in trajectory data, some methods group whole trajectories into clusters while others attempt to mine regional patterns that trajectories follow during part of their evolution. Our proposed method aligns more closely with the latter category. In this section, we begin by introducing a subset of the very rich literature on trajectory clustering, trying to keep a balance between recent methods and popular ones. Because of the large number of methods in the literature and the space limitation, selecting a subset of the literature is inevitable. Next, we explain how methods performing scene activity understanding in video sequences can be seen as trajectory clustering tools. Finally, we intuitively introduce our motion pattern approach as a general tool for understanding trajectory behaviors.
Many of the existing trajectory clustering methods approach the problem by first defining a similarity function for trajectories and then using one of the well-established clustering procedures. For instance, 
compared clustering results obtained by using the trajectory similarity measures Dynamic Time Warping (DTW), Longest Common Subsequence (LCSS), and modified Hausdorff distance in combination with agglomerative and spectral clustering. Fu et al. used the average distance between corresponding trajectory points as the similarity measure, which required pre-processing and resampling of trajectories. The clustering was then applied in two steps, with the first step producing clusters corresponding to the larger dominant paths, and the second step refining the results of the first. Morris and Trivedi surveyed the performance of a wide variety of distance measures and clustering algorithms on several datasets with varying characteristics. Specifically, they compared trajectories using distance measures that include PCA, DTW, LCSS, and modified Hausdorff distance, and then obtained clusters using direct, divisive, agglomerative, hybrid, graph and spectral clustering techniques. Ferreira et al. have proposed Vector field Kmeans, which treats the trajectories as a whole and follows an iterative model similar in form to Kmeans clustering. In this approach, clusters are subsets of the trajectories. First, the trajectories are partitioned randomly into K clusters. Then, for each cluster, the trajectories in that cluster are used to form one of K vector fields. Further, clusters are updated by assigning each trajectory to the vector field that it fits best. These two steps are repeated iteratively until a convergence criterion is met. DivCluST, proposed by Wu et al., is a partition based approach that clusters trajectories according to their speed in addition to their shape and position. First, it partitions the trajectories into representative segments by combining adjacent trajectory segments that have similar direction and speed. Then, a method similar to Kmeans is used to find cluster centers. The clusters are initialized as random subsets of the representative segments. 
Next, the representative segments of each cluster are averaged to form a cluster center. Finally, as in Kmeans, each of the representative segments is associated with a cluster center, then each cluster center is updated to be the average of the cluster's representative segments. This process is repeated until a convergence criterion is met. Ulm et al. have proposed a trajectory clustering approach that is distinctive in its online clustering ability. It does not explicitly treat the trajectories as a whole, but in practice, it behaves as if it does. They defined a cluster as a vector field defined on a subset of the spatial range. When a trajectory is received, it is converted into a new vector field. If at any time two vector fields are similar, they are merged into one. This method uses the entire trajectory to make a vector field, and the merging process largely conserves the shape of both fields that are being merged. Thus, the algorithm in practice behaves similarly to those approaches that treat the trajectories as a whole.
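To make the "define a similarity function, then apply a standard clustering procedure" recipe from the beginning of this discussion concrete, the following is a minimal sketch of one of the similarity measures named above, Dynamic Time Warping, applied to two 2 dimensional trajectories. The function name and the use of Euclidean point distance are assumptions made for illustration; none of the surveyed works is claimed to use this exact formulation.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of (x, y) points.

    Illustrative sketch: Euclidean distance between points is an assumption.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A pairwise matrix of such distances could then be handed to an agglomerative or spectral clustering routine, which is the combination the comparison studies above evaluate.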
Giannotti et al. proposed a very different way of treating trajectories, in which trajectory density is used to form frequently visited rectangular regions. Each trajectory is then represented as a sequence of visited regions together with the transition times. They mined common subsequences with similar transition times, which represent commonly taken paths and are similar in interpretation to trajectory clusters. It is worth mentioning that, because of the exact phrasing of the problem, this approach gives highly redundant clusters in practice. Lee et al. observed that clustering trajectories as a whole fails to reveal portions of trajectories that exhibit a common behavior. To address this, they proposed TRACLUS, an algorithm that first partitions trajectories into a set of line segments and then groups similar segments into clusters using a density-based clustering algorithm for line segments (largely analogous to DBSCAN). This work has been further extended by TraClass, a feature generation framework that uses a combination of region-based and trajectory-based clusterings. Specifically, TraClass finds homogeneous regions where trajectories of a certain class are predominant, then uses a class-conscious adaptation of TRACLUS to group trajectory partitions into clusters that have high discriminative power. After generating features in this manner, TraClass maps each trajectory into a feature vector, where each entry corresponds to a region-based or a trajectory-based cluster.
1.1 Scene Activity Understanding Methods
Given the decreasing cost of collecting data and the great importance of surveillance videos for providing security and monitoring ongoing activities, many researchers, mostly in the computer vision community, have dedicated major efforts to developing automatic scene modeling and intelligent activity understanding systems. These research works aim to learn frequent movement behaviors and activity profiles in videos. Using these, they can also detect abnormal behaviors and in some cases improve object tracking performance. Given a video sequence, some of these methods obtain trajectories associated with the moving objects in the scene, while others, due to challenges such as highly dense scenes, frequent occlusions of moving objects or poor object tracking performance, either compute short-term trajectories (tracklets) or compute sparse or dense optical flows in the feature extraction phase. Then, different kinds of algorithms are employed to infer, from these extracted features, the frequent motion behaviors taking place in the video sequences. In this paper we refer to these frequent motion behaviors as motion patterns. Semantically, motion patterns are very similar to the regional patterns mined by those trajectory clustering methods that seek common behaviors in subsegments of trajectories. Therefore, we claim that a motion pattern approach can be employed to solve the general task of trajectory clustering.
Junejo et al. cluster similar trajectories by performing a min-cut graph clustering algorithm recursively. Each node in the graph represents a trajectory while the weights of the edges between nodes are determined by the Hausdorff distance between the corresponding trajectories. Modeled paths are later used to detect unusual trajectories based on spatial, velocity and curvature features. Another approach computes instantaneous flow vectors that include location and velocity information from either optical flows or long-term/short-term trajectories. Given the location of a point and its velocity (based on flow vectors), a kernel based estimation similar to the mean shift approach is employed to generate the corresponding velocity in the next step. The next location of each point can be estimated using its current location and computed velocity. Performing this procedure recursively generates the sink path that estimates the movement path of the initial point. Next, the obtained sinks are clustered based on the similarity of their locations and directions in addition to the Hausdorff distance between their sink paths. These clusters represent the frequent motion behaviors. Basharat et al. have proposed a framework that does not explicitly generate motion patterns; however, the per-pixel learned multivariate probability density function can be used to understand the frequent motion behaviors in the scene. Given a set of trajectories obtained from moving objects, transition vectors that represent the size of the moving object, the transition time and the location of the object after the transition are computed. Then, for any given location in the scene, transition vectors that pass through that location contribute to a multivariate Gaussian Mixture Model (GMM) that models the pixel-level pdf. Given such a model, the most likely paths can be generated by initializing starting points as in earlier work or by using a Markov Chain Monte Carlo (MCMC) sampling based scheme. Hu et al. compute flow vectors by extracting sparse or dense optical flow from video sequences. Then, they create a neighborhood directional graph that indicates the distance between flow vector pairs. Finally, a hierarchical agglomerative clustering algorithm generates motion patterns as the graph clusters of the neighborhood graph. Saleemi et al. have proposed a probabilistic framework to model scene dynamics. Given the long-term/short-term trajectories of moving objects in a video sequence obtained from an object tracking algorithm, a 5 dimensional feature space is created where each feature point represents the location information of the initial and final stages of a moving object and the duration of the time interval for such a transition. Then, motion patterns, in the form of a multivariate pdf of spatiotemporal parameters, are learned using kernel density estimation. Finally, an MCMC sampling based scheme is utilized to sample the pdf. For any starting location as the initial state, the MCMC sampling based scheme progressively generates the random walks that are most likely to happen in the scene. These likely paths express the most probable paths that trajectories have taken in the training stage. Therefore, they are semantically similar to motion patterns. Lasdas et al. present a method to extract dominant dynamic properties in crowded scenes. Their proposed pipeline begins with extracting fixed length tracklets using the KLT tracker. Then, a set of validation tests and pre-processing steps refine the tracklet set. In order to have a more robust representation, a grid of equally spaced points is overlaid on the scene. The pipeline continues with clustering tracklets in the neighborhood of each of these points using a mean shift clustering algorithm with respect to the direction of the tracklets. Next, the mean flow vectors that are assigned to each grid point are modeled as a Gaussian Process (GP). Finally, given a starting point and the mixture of GP regression models, a sequential sampling scheme generates the trajectories that follow the most frequently observed patterns in the training phase. These generated paths are supposed to be the most probable paths that might originate from the starting point and are semantically similar to motion patterns.
Zhao and Medioni proposed a framework to infer motion patterns in videos in an unsupervised learning fashion. Given a video with moving objects, they extract foreground motion blobs (connected pixels that belong to a moving object) using Robust Alignment by Sparse and Low Rank Decomposition (RASL). By performing local associations, they obtain the initial tracklets. Here the tracklets are sequences of ordered spatial coordinates of motion blobs. Then, using a 2 dimensional Tensor Voting algorithm, they compute the tangent direction of every tracklet point. The direction information provided by 2 dimensional Tensor Voting is less noisy and more consistent than transition vectors from one observation to the next. Further, a similarity graph of tracklets based on multiple kernel types is generated and, employing a graph spectral clustering approach, the nonlinear manifold of tracklets is grouped into segments. These segments indicate local motion patterns, whereas the framework focuses on global motion patterns. Therefore, a kernel density estimation algorithm is utilized to propagate the local motion pattern information to all the pixels in the scene and form global motion patterns. As the last piece of work covered in this section, we briefly explain a scene understanding framework that we believe is capable of solving the trajectory clustering task. The pipeline begins with dividing a video into disjoint video clips and extracting dense optical flow from them. Then, each video clip is represented as a 4 dimensional feature space that contains the location, magnitude and orientation of the instantaneous optical flow vectors belonging to that particular video clip. Next, each of the 4 dimensional feature spaces is modeled as a mixture of Gaussians where Kmeans clustering is employed to initialize the model parameters. The pipeline continues with generating a graph that treats each Gaussian component as a node, where the edge between two nodes is determined by their reachability from each other and their temporal proximity. To clarify, the reachability is defined as the probability of observing a component in the neighborhood of the other one in a specific time interval. Finally, motion patterns are obtained through a connected component analysis on this graph. Since similar motion patterns might occur at different time stamps, Kullback-Leibler (KL) divergence is employed to merge similar motion patterns happening at different times.
1.2 Proposed Approach
We explained the trajectory clustering problem and gave a brief overview of a group of methods that seek to find regional patterns in trajectory data rather than clustering whole trajectories. Then we showed that the motion pattern inference methods that perform scene understanding in video sequences semantically generate the same output as the aforementioned trajectory clustering methods. Therefore, we claim that a motion pattern approach can be employed to solve the general task of trajectory clustering. In this paper, we propose a novel framework that is designed to overcome the challenges posed by the trajectory clustering task. Given a set of trajectories, we break them down into sets of flow vectors and generate a 4 dimensional feature space. Each point in this feature space represents the location and velocity of a flow vector. Then, using Kmeans clustering, we represent the feature space in a less noisy and more compact form. The clusters obtained from Kmeans clustering are called motion components. Next, we define the reachability set, in which a pair of motion components exists if one is reachable from the other. The reachability set captures local similarities between motion components and cannot evaluate a motion component's contribution to the global trajectory behaviors. Hence, we define the signature, which represents the global behavior of a motion component. Finally, a weighted Jaccard distance between the motion components' signatures is used as the distance metric in an agglomerative clustering scheme to find motion patterns in trajectory data. These motion patterns illustrate regional behaviors of trajectories and are semantically similar to the trajectory clusters of those methods which break trajectories into segments and seek regional similarities among them. The main contributions of our proposed method are as follows:
We propose a motion pattern approach to mine trajectory clusters of arbitrary shapes, simultaneously in dense and sparse regions of data. Generally, trajectory clustering methods either mine clusters in dense regions and lose clusters in regions where trajectories are spread sparsely or extract clusters in sparse regions while generating redundant clusters in dense regions.
The proposed approach is capable of recovering the whole trajectory cluster as a single motion pattern even in cases of highly curved underlying behaviors. Most of the trajectory clustering methods break down the whole cluster into segments as they fail to establish the connection between the segments when trajectories happen to have highly curved shapes.
By defining the signatures of motion components, we combine the local and global contributions of motion components such that it helps us distinguish between trajectory clusters that are identical in some localities but are different in the global scale. This strategy is useful for mining trajectory clusters that merge together or split into multiple ones.
Generally, trajectory clustering methods cluster similar trajectories that are spread over a wide region into multiple clusters as model parameters are set globally. However, our proposed approach generates only a single cluster in these cases as the capability of allowing semi-lateral movement is embedded in its framework.
The proposed method is able to correctly cluster two spatially overlapping groups of trajectories that differ only in the movement direction.
The remainder of this paper is organized as follows. Section 2 presents the problem definition and formulation details of our proposed method. Then, the detailed description of the experiments is given in Section 3, followed by the implementation details in Section 4. In Section 5, we discuss how our motion pattern approach can be used to explore temporal changes in frequent trends in the data with applications in various domains. Finally, we conclude the paper in Section 6.
In this section, we will introduce our algorithm for mining frequent trajectory behaviors. Each block in our pipeline will be explained in detail while associated parameters will be discussed separately in the implementation details (Section 4).
2.1 Flow Vector Computation
Given a dataset of N trajectories, we represent each trajectory as an ordered sequence of points, where each point denotes the location of the trajectory at a given time step. The corresponding flow vectors form a set of 4 dimensional vectors derived from these points. The first and second dimensions of a flow vector indicate its location while the third and fourth dimensions indicate its velocity. After obtaining flow vectors from all trajectories in the dataset, we have a vast set of 4 dimensional flow vectors that indicate the magnitude and direction of the movement of each trajectory at all locations on the 2 dimensional surface (x and y). This data is noisy and we cannot directly use it to understand the behavioral movement of trajectories. Therefore, a Kmeans clustering algorithm is employed to group it into segments of spatially proximal flow vectors that have similar velocity properties. Such a segmentation of the flow vector set provides homogeneous regional segments that are representative and less noisy. We should note that, since in many cases trajectories are defined as sequences of points and the actual time stamp for each point is not available, only the order of the sequence is used in computing the velocity.
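As an illustration of this step, the sketch below turns a list of 2 dimensional trajectories into a set of 4 dimensional flow vectors. It assumes that the velocity is the forward difference between consecutive points and that it is attached to the earlier of the two points; the function names are hypothetical and chosen only for this example.

```python
def flow_vectors(trajectory):
    """Return (x, y, vx, vy) tuples for one trajectory given as (x, y) points.

    Assumption: velocity is the forward difference between consecutive points,
    so only the order of the points is used, not actual time stamps.
    """
    vectors = []
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        vectors.append((x0, y0, x1 - x0, y1 - y0))
    return vectors


def flow_vector_set(trajectories):
    """Pool the flow vectors of all trajectories into one list."""
    return [f for t in trajectories for f in flow_vectors(t)]
```

Under this convention a trajectory with n points contributes n - 1 flow vectors, and a single-point trajectory contributes none.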
2.2 Motion Component Extraction
In order to generate an intermediate representation of the data that is less noisy and more suitable for mining motion patterns, we cluster the flow vector set using Kmeans clustering where the number of clusters is set to K. The obtained clusters are the motion components. The distance measure between two flow vectors, possibly belonging to different trajectories, that is used to create motion components is shown in Equation 2.2. Note that a weight parameter defines the importance of velocity similarity versus spatial proximity of flow vectors in the formation of motion components; it will be discussed in the implementation details section.
As the output of this step, we have K motion components, each represented by its 4 dimensional mean. We also define the vector that spatially connects one motion component to another. This definition will be used in the following section to determine the reachability of two motion components. Given a sample trajectory set, Figure 1 illustrates the steps which our method takes to extract motion components. We should note that, unlike methods that aim at learning a mixture of probability density functions or a unified pdf to model the underlying dynamics of the entire scene, we cluster the flow vectors only to generate intermediate local representations.
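The motion component extraction can be sketched as a plain Lloyd-style Kmeans over the 4 dimensional flow vectors. The distance used here, squared spatial distance plus a weight `w` times squared velocity distance, is an assumed form of the weighted measure described above and is not taken from the paper's equation; the function names, the random initialization and the iteration cap are likewise illustrative choices.

```python
import random


def weighted_sq_dist(f, g, w):
    """Assumed distance: squared spatial gap plus w times squared velocity gap."""
    return ((f[0] - g[0]) ** 2 + (f[1] - g[1]) ** 2
            + w * ((f[2] - g[2]) ** 2 + (f[3] - g[3]) ** 2))


def motion_components(flows, k, w=1.0, iters=50, seed=0):
    """Lloyd-style Kmeans over 4D flow vectors; returns (means, labels)."""
    rng = random.Random(seed)
    means = [tuple(f) for f in rng.sample(flows, k)]
    labels = [0] * len(flows)
    for _ in range(iters):
        # Assignment step: nearest mean under the weighted distance.
        new_labels = [min(range(k), key=lambda c: weighted_sq_dist(f, means[c], w))
                      for f in flows]
        # Update step: each mean becomes the average of its members.
        new_means = []
        for c in range(k):
            members = [f for f, lab in zip(flows, new_labels) if lab == c]
            if members:
                new_means.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_means.append(means[c])  # keep the old mean of an empty cluster
        if new_labels == labels and new_means == means:
            break  # converged: neither assignments nor means changed
        labels, means = new_labels, new_means
    return means, labels
```

Because the assumed distance is a per-dimension weighted squared Euclidean distance, the arithmetic mean remains the minimizer in the update step, so the usual Kmeans iteration applies unchanged.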
2.3 Motion Component Reachability Set Creation
We now look for reachability between the motion components. Mathematically, reachability is an asymmetric relation between motion components. Intuitively, motion component B is reachable from motion component A if a particle traveling with the initial motion and position prescribed by A could reasonably be expected to proceed to B. By reachability, we mean direct reachability that does not require any intermediate motion components between A and B. However, this intuitive concept must be formalized mathematically in order to be used. We formalize reachability as the conjunction of three conditions: (1) the motion components A and B must be spatially located close together; (2) the direction of the flow vector of A should be similar to the direction of the vector that spatially connects A to B; (3) the directions of the flow vectors of A and B should be similar. Intuitively, the second condition enforces the flow vector of A to be aligned with the shortest spatial path from A to B. To realize these conditions, we define the proximity of the i-th (as A) and the j-th (as B) motion components in Equation 2, which involves the scale of the double-ellipse at which the j-th motion component falls on the boundary of the double-ellipse of the i-th motion component. Note that if this scale is smaller than 1, then the j-th motion component lies within the double-ellipse. Figures 2 and 2, respectively, show how the wedge and double-ellipse are located with respect to the motion component.
The proximity parameters will be discussed in detail in the implementation details section. Figure 2 illustrates how they are defined. Given the proximity measure in Equation 2, we define the reachability set as the set of ordered pairs of motion components that satisfy the reachability conditions. Aiming to mine motion patterns, it is disadvantageous to have a motion component that cannot reach any other motion component, or that no other motion component is capable of reaching, a condition we refer to as a motion component being blocked. Such a situation would lead to breaking down motion patterns into smaller partitions, preventing us from extracting complete movement behaviors. In order to minimize blocked motion components, for each motion component i, we include in the reachability set the pair (i, j) whose proximity is minimum with respect to j and less than the search distance. In the same fashion, the pair in which the i-th motion component is the one being reached should be included. Finally, it is advantageous to be able to detect short range semi-lateral movements. This property allows us to merge similar but parallel motion patterns together. In order to accommodate this, we include the (i, j) pair in the reachability set if the j-th motion component lies within a low radius, wide angle circular sector (wedge) aligned with respect to the i-th motion component and their flow directions are similar. We should note that in any method that learns the probability density function of the underlying dynamics of the entire data, reachability can be defined between any two points in the data space by employing approaches like MCMC or sequential sampling. On the other hand, some earlier methods and this work define reachability only between their intermediate representations. These intermediate representations are the GMM components, the GP regression models of tracklets and the motion components in the aforementioned works, respectively. Our intermediate representations, motion components, are rough approximations of GMM components, as we do not estimate the covariance matrices of the motion components.
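The three reachability conditions can be illustrated with a deliberately simplified sketch. Instead of the double-ellipse and wedge proximity of this section, it uses two plain thresholds, `max_dist` and `max_angle`, which are hypothetical parameters introduced only for this example; it also omits the handling of blocked motion components and of semi-lateral movement. It should therefore be read as an approximation of the idea, not as the proximity measure of Equation 2.

```python
import math


def angle_between(u, v):
    """Unsigned angle in radians between 2D vectors (pi if either is zero)."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return math.pi
    cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))


def reachability_set(means, max_dist, max_angle):
    """Ordered pairs (i, j): component j is directly reachable from component i.

    means: list of (x, y, vx, vy) motion component means.
    max_dist, max_angle: hypothetical thresholds standing in for the
    double-ellipse proximity; they are assumptions of this sketch.
    """
    pairs = set()
    for i, (xi, yi, vxi, vyi) in enumerate(means):
        for j, (xj, yj, vxj, vyj) in enumerate(means):
            if i == j:
                continue
            link = (xj - xi, yj - yi)  # vector spatially connecting i to j
            close = math.hypot(*link) <= max_dist                       # condition 1
            aligned = angle_between((vxi, vyi), link) <= max_angle      # condition 2
            similar = angle_between((vxi, vyi), (vxj, vyj)) <= max_angle  # condition 3
            if close and aligned and similar:
                pairs.add((i, j))
    return pairs
```

For two components moving in the same direction one behind the other, only the pair from the rear component to the front component is produced, which reflects the asymmetry of the relation.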
2.4 Forming Motion Patterns
We use two concepts, path reachability and signature, to form motion patterns. The j-th motion component is path reachable from the i-th motion component if there is a chain of motion components from the i-th to the j-th motion component such that each link is in the reachability set. The signature of a motion component is the set of motion components that are either path reachable from it or from which it is path reachable. This is a novel representation that captures local and global properties of motion components simultaneously; therefore, a comparison between two motion components using their signatures reflects both their local similarities and their contributions to global behaviors. Mathematically speaking, the reachability set can be represented as a directed graph in which each node represents a motion component. An edge exists between two nodes if and only if the pair of motion components corresponding to those two nodes exists in the reachability set. Given a node, we can obtain the signature by applying depth first search on the graph and on the reversed graph. Finally, the distance between two motion components is defined as the weighted Jaccard distance (WJD) between their associated signatures. The weighting is necessary because in practice, some areas have a higher density of motion components (many motion components are located in a small region). This effect would skew our results, since two motion components might be deemed similar by the unweighted Jaccard distance even if their paths are quite different but intersect in a region with many motion components. To counteract this effect, the Jaccard distance is weighted by a factor that assigns low values to the motion components that are located in dense regions and high values to those in sparse regions. The precise definition of the weighted Jaccard distance between the signatures of two motion components is formalized in Equation 5. We extract motion patterns via agglomerative clustering with a distance cutoff, single linkage and WJD as the distance metric. It is worth mentioning that when single linkage is used, the results are the same as clustering via thresholding and forming weakly connected components. Despite this, our method differs from the GMM-based scene understanding framework reviewed in Section 1.1, as we use the weighted Jaccard distance between the motion components' signatures as the distance metric. That framework forms motion patterns by finding weakly connected components on a graph that represents the reachability between GMM components, where only the local similarity of the components is considered. Therefore, it would group globally different motion behaviors together if they share a common GMM component (note that the GMM components in that framework carry actual time information), a scenario that occurs when motion patterns merge or diverge.
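The signature and the weighted Jaccard distance can be sketched as follows. The depth first search on the graph and on the reversed graph follows the description above. Two points are assumptions of this sketch: each component is placed in its own signature (a chain of length zero), and the weighted Jaccard distance is taken as one minus the ratio of the summed weights of the intersection to the summed weights of the union. The per-component weights are passed in by the caller, since their precise definition is not reproduced here.

```python
def reachable_from(start, adjacency):
    """Iterative depth first search; returns the nodes reachable from start.

    Assumption: start itself is included in the result.
    """
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def signatures(n, pairs):
    """Signature of each of n components from a set of ordered pairs (i, j)."""
    forward, backward = {}, {}
    for i, j in pairs:
        forward.setdefault(i, []).append(j)   # edges of the graph
        backward.setdefault(j, []).append(i)  # edges of the reversed graph
    return [reachable_from(k, forward) | reachable_from(k, backward)
            for k in range(n)]


def weighted_jaccard_distance(sig_a, sig_b, weight):
    """Assumed form: 1 - (weight of intersection) / (weight of union)."""
    union = sig_a | sig_b
    if not union:
        return 0.0  # two empty signatures are treated as identical
    inter = sig_a & sig_b
    return 1.0 - sum(weight[k] for k in inter) / sum(weight[k] for k in union)
```

In a merging scenario such as 0 to 1 to 2 with a second branch 3 to 2, components 0 and 1 obtain identical signatures, while components 0 and 3 share only the merged part and therefore remain far apart; lowering the weight of the shared component increases that distance further.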
Given a sample set of motion components, Figure 3 illustrates the steps which our method takes to form motion patterns in a trajectory dataset.
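The final grouping step relies on the equivalence stated in this section: single linkage agglomerative clustering with a distance cutoff yields the connected components of the graph that links every pair whose distance does not exceed the cutoff. A minimal sketch using a union-find structure is given below; the function name and the callable `distance` argument are illustrative choices rather than part of the original method description.

```python
def motion_patterns(n, distance, cutoff):
    """Group items 0..n-1 by linking every pair with distance <= cutoff.

    distance: callable taking two item indices and returning their distance.
    Returns a sorted list of groups, each a sorted list of item indices.
    """
    parent = list(range(n))

    def find(a):
        # Follow parent links to the root, compressing the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(n):
        for b in range(a + 1, n):
            if distance(a, b) <= cutoff:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for a in range(n):
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())
```

In the setting of this section, `distance` would return the weighted Jaccard distance between the signatures of two motion components, and each returned group would correspond to one motion pattern.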
3 Experimental Results
In this section, we first give a brief description of the datasets that were used for evaluation. Then, we show experimental results of our proposed trajectory clustering method on different datasets and finally compare them to the baseline methods. Due to space limitations, the model parameters used to generate the outputs associated with the different datasets are provided in the supplemental material.
We used five different datasets to test the proposed method. These datasets are among the ones that are used by many papers in the literature and vary greatly in their properties, such as the number of trajectories, average number of points per trajectory, sampling density, spatial separation and complexity. Experimental results indicate that our proposed method is an effective solution for the trajectory clustering task regardless of the dataset properties.
Vehicle Motion Trajectory Dataset: This dataset contains 1500 trajectories gathered by tracking vehicles at a traffic intersection. These trajectories are annotated manually; each trajectory is assigned to one of 15 trajectory classes. The mean number of points per trajectory is 96. This dataset is available at .
Atlantic Hurricane Dataset (HURDAT2): This dataset is provided by the National Hurricane Service (NHS) and contains 1740 trajectories of Atlantic Hurricanes from 1851 through 2012, with trajectories containing 27 points on average. NHS also provides annotations of typical hurricane tracks for each month throughout the annual hurricane season that spans from June to November. In order to evaluate how close the motion patterns mined by our method are to the NHS annotations, we divided the Atlantic Hurricane Dataset into six subsets, one for each month. Trajectories that span more than one month were split to ensure each month includes only activity occurring within its span. This dataset is available at .
Swainson’s Hawks Dataset: This dataset contains 43 trajectories that trace the migration of Swainson’s hawks. A description of the hawks’ migration paths is provided in , which states that the hawks converge on the Gulf of Mexico coast, travel southward following a narrow path across the Andes in Colombia, then proceed along the east side of the Andes to central Argentina, where they spend the austral summer before returning north using largely the same route. The average number of points per trajectory in this dataset is 105. This dataset is available at .
The Greek Trucks Dataset: This dataset contains 1100 trajectories from 50 different trucks delivering concrete around Athens, Greece. As expected, the trucks follow highways giving the trajectories a distinctive appearance. The average number of points per trajectory is 86. This dataset is available at .
The NGSIM Lankershim Dataset: This dataset contains detailed vehicle trajectory data on Lankershim Boulevard in the Universal City neighborhood of Los Angeles, CA on June 16, 2005. The dataset corresponds to two 15-minute periods, 8:30 am to 8:45 am and 8:45 am to 9:00 am, recorded by five video cameras. For our experiments, we extracted the portions of trajectories captured by camera No. 2 in the 8:30 am to 8:45 am period, as it covers the busiest intersection in the dataset. This subset contains 1095 trajectories and the average number of points per trajectory is 305. The full Lankershim dataset and its annotations are available at .
For all experiments, the data was first normalized so that all trajectories lie within a common bounding box. The Atlantic Hurricane Dataset went through further pre-processing: after splitting the dataset by months, each of the resulting subsets was pruned by removing trajectories consisting solely of a single coordinate pair repeated one or more times.
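The two pre-processing steps can be sketched as follows. The target bounding box is assumed here to be the unit square, with each axis rescaled independently; this choice and the function names are assumptions of the example, not values taken from the experiments.

```python
def prune_stationary(trajectories):
    """Drop trajectories that consist solely of one repeated coordinate pair."""
    return [t for t in trajectories if len(set(t)) > 1]


def normalize(trajectories):
    """Rescale all points into the unit square (assumed target bounding box).

    Each axis is min-max scaled independently over the whole dataset.
    """
    xs = [x for t in trajectories for x, _ in t]
    ys = [y for t in trajectories for _, y in t]
    x0, y0 = min(xs), min(ys)
    sx = (max(xs) - x0) or 1.0  # avoid division by zero on a degenerate axis
    sy = (max(ys) - y0) or 1.0
    return [[((x - x0) / sx, (y - y0) / sy) for x, y in t] for t in trajectories]
```

Points are expected as hashable `(x, y)` tuples so that repeated coordinate pairs can be detected with a set.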
3.3 Interpreting Output
Before we discuss experimental results, we first briefly explain how to read and interpret the output of the proposed algorithm. Each discovered motion pattern is displayed on top of plotted trajectories. To avoid overcrowding the figures, only a random subset of trajectories is visualized. The color of the motion pattern denotes the direction of the motion. A color wheel that appears in each figure serves as the legend for translating color into direction of motion. Lastly, we must introduce the reasoning behind our handling of merging and diverging trajectories. An example of such a scenario is illustrated in Figure 4, where two clusters of trajectories partially overlap. Our proposed method recovers three motion patterns in this case: two separate motion patterns for the distinct portions of the two clusters and a separate motion pattern for their merged portion. Therefore, it must be noted that a single trajectory may pass through several motion patterns. Using our proposed method, the trajectories shown in Figure 4 produce the motion patterns shown in Figures 4, 4 and 4.
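A direction-to-color mapping of the kind described above can be sketched with a standard hue wheel. The mapping below, in which the hue equals the motion angle divided by a full turn, is an assumption made for illustration; the exact color wheel used in the figures is not specified in the text.

```python
import colorsys
import math


def direction_color(vx, vy):
    """Map a motion direction to an RGB triple with components in [0, 1].

    Assumption: hue = angle / (2 * pi), with full saturation and value.
    """
    angle = math.atan2(vy, vx) % (2 * math.pi)
    return colorsys.hsv_to_rgb(angle / (2 * math.pi), 1.0, 1.0)
```

With this convention, motion along the positive x axis is drawn in red and motion in the opposite direction in cyan, so two spatially overlapping patterns that differ only in direction receive clearly different colors.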
3.4 Evaluations on Vehicle Motion Trajectory Dataset
Examining the outputs of the proposed method versus annotations shown in Figure 5, we see that the proposed algorithm recovers the motion patterns present in the dataset, with a few differences from the annotated version. First of all, the annotations categorize the traffic in each of the parallel lanes in Figure 5 into separate clusters. Because the motion in the left group of three lanes is very similar, the proposed algorithm recovers it as a single motion pattern. The same applies to the three-lane group on the right. Second, for reasons stated above, the turning trajectories shown in Figure 5 are segmented to differentiate their distinct portions.
Figures 5 and 5 correspond to the two directions of traffic on the main highways. Figure 5 corresponds to one of the left turns and is especially large since it contains the bulk of the left-side outgoing traffic. Figures 5, 5, and 5 correspond to the incoming traffic from the left side. This is an example of the diverging behavior where the motion pattern from Figure 5 diverges into the motion pattern turning right in Figure 5 and the motion pattern going straight in Figure 5. Motion patterns in Figures 5 and 5 reflect a U-turn, where the former is a portion of the U-turn that merges with the traffic shown in Figure 5 and the latter is the distinct part of the U-turn. Figures 5 and 5 show two different access roads. Figure 5 shows a right turn and Figure 5 shows a rarely taken left turn.
3.5 Evaluations on Atlantic Hurricane Dataset
We evaluated the performance of the proposed method on the Atlantic Hurricane Dataset by comparing its output with the annotations provided by the National Hurricane Center (NHC). Prevailing hurricane tracks for each month are indicated by white arrows in Figures 6, 6, 6, 6, 6 and 6. Examining the output for each month, we see that the proposed algorithm recovers motion patterns closely resembling the prevailing hurricane tracks for June in 6 and 6, for October in 6 and 6, and for November in 6. For July, 6 reflects the rightmost track, while 6 reflects the shared path of the leftmost tracks, and 6 and 6 serve as the distinct portions of the two leftmost paths. For August, 6 reflects the bottom left prevailing track, while 6 and 6 correspond to the splitting tracks on the top right. September hurricane tracks in 6 are split into two groups of three. The left group of three is output as motion patterns 6, 6, 6, while the right group of three is output as 6 for the arrow pointing northwest and 6 for the two arrows turning towards northeast. There is considerable variation in trajectory density across different months in the dataset. As a general rule, months with fewer hurricanes, such as June, July and November, require more relaxed reachability conditions (a larger double-ellipse and wedge radius), while months with a high number of hurricanes, such as August, September and October, require stricter ones.
3.6 Evaluation on Swainson’s Hawks Dataset
3.7 Evaluation on Greek Trucks Dataset
Given the dataset, we find the major highways that are taken by trucks. As previously noted, in the case of two routes merging, the algorithm will find three patterns: two for the routes before they merge, and one for the combined route after they merge. The patterns in Figures 8 and 8 merge into the pattern in Figure 8. Similarly, the pattern in Figure 8 diverges into the patterns shown in Figures 8 and 8. Finally, the patterns shown in Figures 8, 8, and 8 merge to create the pattern shown in Figure 8.
3.8 Evaluation on NGSIM Lankershim Dataset
3.9 Experimental Comparison
Various quantitative measures for evaluating the quality of clusters exist in the literature.  ,,  and  computed Correct Clustering Rate (CCR) as a measure to evaluate clustering results against the ground truth or manual annotations.  used an information theoretic criterion proposed by Meilă , Variation of Information (VI), to validate obtained clusters. VI determines the amount of information lost and gained between two different clusterings of data. Masciari compared inter-class and intra-class similarity of clusters in ,. All of these and similar measures, however, were used to evaluate clusters of entire trajectories. Since this does not align with the goals of our work, an attempt to use the aforementioned metrics would not yield a meaningful comparison. Rather than clustering entire trajectories, the proposed algorithm mines regional trends in trajectory data like , , , ,  and .  evaluates obtained results by visual comparison with GSP  and PrefixSpan .  offers a visual comparison of results with TRACLUS . TRACLUS authors, in turn, state that there is no well-defined measure for density-based clustering methods, and suggest using a metric consisting of sum of squared error and noise penalty to, in authors’ own words, ”get a hint of the clustering quality”. This metric cannot be easily adapted to methods other than variations of TRACLUS. Ferreira , who use the subset of the Atlantic Hurricane Dataset for their evaluations, report that their results are visually consistent with expected hurricane behavior, as does .  provides a visual analysis and suggests a semantic interpretation of their results on the Greek Trucks Dataset. Staying close to the spirit of the works that mine regional trends, we will demonstrate the effectiveness of the proposed method by visually comparing our results with those of TRACLUS , DivCluST ,  and . Due to the space limitation in the paper, we were not able to provide all the visualizations for each of the baseline methods. 
Instead, we illustrated cases where they could not handle the difficulties provided by the datasets. Compared to the baseline methods, our algorithm can successfully face those challenges. The goal of TRACLUS  is to detect similar portions of trajectories, which semantically correspond to the output of our proposed method. More formally, given a set of trajectories, TRACLUS partitions trajectories, clusters them, and outputs a representative trajectory for each cluster. We used TRACLUS implementation available at . It contains a program that estimates optimum parameter values. According to , these values are approximate and may differ slightly from the true optimal values. Therefore, we ran TRACLUS on each dataset multiple times, tuning each of the parameters. This is not unlike the approach to parameter selection that TRACLUS authors take. Lee 
suggest a heuristic approach to parameter selection, where the optimal value is selected to minimize the entropy of obtained clusters. Further, the quality of clusters is estimated using the sum of squared error and the noise penalty. Finally,  points out that the final selection of optimal parameters for TRACLUS is made using visual inspection and domain knowledge. One of the challenges that TRACLUS faces, as  also mentions, is that it is not capable of simultaneously extracting both dense and sparse trajectory clusters because its parameters are optimized globally. Modifying its parameters so that it finds sparser clusters leads to redundant clusters in denser regions. Figure 10 shows clusters found by TRACLUS on the Vehicle Motion Trajectory Dataset when the optimized parameter values are used. We tuned these parameters to allow TRACLUS to extract clusters in both sparse and dense regions, but this results in many short, local clusters and redundant ones, as shown in Figure 10. Our proposed method can handle variation in density, as illustrated in Figure 5.
DivCluST  is an algorithm that seeks to find regional typical moving styles in the form of mean lines. Performance of DivCluST on the Greek Trucks Dataset is shown in Figure 11, where parameters are optimized according to the method described in the paper. Each arrow represents a mean line, where the thickness refers to the frequency of that style and the color corresponds to the speed of that style. Warmer, reddish colors are for faster styles while cooler, bluish colors are for slower styles. This algorithm has trouble when there is large variation in the trajectory density, a property that the Greek Trucks Dataset has. In the high density regions, the mean lines are very cluttered and overlapping. Due to the Kmeans-type model, there are very similar mean lines: if the random initial clusters are close together in a high density region, a Kmeans-type algorithm will often keep them close together. In the low density regions, there are mean lines that are not representative, such as the mean lines within the yellow box in Figure 11. This is because in low density regions, quite different representative segments are clustered together, producing mean lines that are dissimilar from all the representative segments. Additionally, because the model is restricted to straight mean lines rather than curves, motion that would be better described as a curve must instead be described as one long mean line or a sequence of short mean lines. Examples of this are the long cyan arrows within the purple box in Figure 11. Our algorithm deals much better with variation in density and has the ability to find curved motion patterns.
Giannotti  proposed an algorithm that seeks to find aggregate motion behaviors from trajectories. These behaviors are defined as sequences of rectangular regions. In Figure 12, some of the aggregated motion behaviors are illustrated where each is represented as a sequence of rectangular black regions. The order of the sequence is shown by the black arrows. First, it is important to note that this algorithm gives very redundant results. Many of the motion behaviors are very similar and some are even subsets of each other. This requires digging through many patterns to find the distinguishable ones. Although there are 80 generated patterns in total for the Vehicle Motion Trajectory Dataset, only 6 representative ones are shown in Figure 12. In other words, of the patterns not shown, each of them has the same shape or is a subset of the shown patterns. Requiring regions to be rectangular restricts the shape of extracted patterns. For instance, in Figure 12, the upper right part of the larger box covers a lane of traffic that should not be included in the pattern. Since the rectangular regions are built only based on density of trajectories without considering motion properties, they are not always suitable to represent motion behaviors. For instance, in Figures 12, 12, and 12, the rightmost box includes a different lane of traffic in the upper portion of the box. Finally, multiple traffic behaviors are sometimes combined into a single pattern. For instance, Figure 12 shows the traffic turning right and going straight combined, Figure 12 shows the traffic going straight and making U-turns combined, Figure 12 shows the access road and incoming traffic from the left combined and Figure 12 shows the U-turn and straight traffic combined. Unlike this method, the approach which we have proposed in this paper does not generate redundant or overlapping motion patterns. 
Our motion components are more flexible than rectangular regions used in  and our method does not mix multiple patterns of traffic in a motion pattern as can be seen in Figure 5.
Ulm  proposed an algorithm that seeks to find clusters in the form of vector fields defined on a connected spatial set. While this algorithm performs well on datasets that are well structured such as Vehicle Motion Trajectory dataset, it is not well suited for those that lack this property. This is partially due to the fact that it behaves much like the algorithms that cluster trajectories as a whole. It can be seen in Figure 13 that the vector fields can include directions going two opposite ways on a road. This can be avoided by changing the weights, but this results in vector fields that point perpendicular to the roads. This issue is not present in our algorithm since it can find two spatially overlapping motion patterns with different directions. Secondly, the clusters are not homogeneous in movement. Some of the clusters are very sprawling while others have many small, noisy portions. This is not the effect of using too few clusters since there is a large amount of overlap among them. Instead, this is the effect of using whole trajectories to create vector fields.
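For readers unfamiliar with the Variation of Information criterion mentioned at the start of this section, a minimal sketch of its computation is given below. It uses the standard definition VI(A, B) = H(A) + H(B) - 2 I(A; B), where H is entropy and I is mutual information; the function name and the representation of a clustering as a list of per-item labels are assumptions made for illustration. As noted above, this measure compares clusterings of entire trajectories and is therefore not applied to our motion patterns.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """VI between two clusterings of the same items (natural log units)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both clusterings must label the same non-empty item set")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Entropy of each clustering.
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    # Mutual information between the two clusterings.
    mi = sum(
        c / n * math.log((c / n) / ((count_a[a] / n) * (count_b[b] / n)))
        for (a, b), c in joint.items()
    )
    return h_a + h_b - 2.0 * mi
```

The value is zero when the two clusterings are identical up to a renaming of labels and grows as they disagree.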
4 Implementation Details
In this section, the parameters will be discussed. This algorithm includes 16 parameters in total that influence the output of the method. The intuitive interpretation of each of the parameters will be discussed and an approach for fine-tuning them in independent groups will be described.
The first parameter is the number of clusters, which determines the number of motion components. It is important for the total number of motion components to be large in order to have a good resolution for the patterns. However, if the number is too large, some of the motion components may end up with only a few flow vectors. These motion components may be dramatically affected by noise and in some cases even form false motion patterns. It must also be noted that clustering is a computationally intensive task, hence the selection of this parameter will also be influenced by the desired running time. Generally, datasets that are not expected to contain much noise will allow for an arbitrarily large number of clusters.
The second parameter is also used in clustering. It controls how much the velocity of the flow vector affects the formation of motion components. If its value is too small, spatial proximity will be the primary factor affecting the formation of the motion components, and each motion component’s flow will be the average flow of its constituent flow vectors. Whenever this is the case, the algorithm will likely miss overlapping motion patterns with different flow directions, such as those seen in the Swainson’s Hawks dataset. The larger the value becomes, the larger the role the flow direction plays in clustering. A large value will make the algorithm sensitive to smaller deviations in flow direction, as well as to noise. An extremely large value will cause the flow direction to bias the clustering process, forming motion components from vectors with similar flow directions and dissimilar spatial coordinates, as illustrated in Figure 14, where the parameter is set to 1000, much higher than the value of 45 used for other results. In general, the value should be as small as possible while still allowing any meaningful overlap of patterns to exist. These two parameters completely determine the formation of motion components. Thus, they can be fine-tuned without running the entire model, by only visualizing the motion components. This makes the parameter selection process far simpler.
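The interplay of these two clustering parameters can be illustrated with a small sketch. It assumes that a flow vector is a tuple (x, y, vx, vy) and that the velocity parameter enters as a multiplicative weight on the velocity components before a Kmeans-style clustering; this is one plausible reading, not necessarily the exact formulation of the method. The names `cluster_flow_vectors`, `k` and `velocity_weight` are hypothetical, and the deterministic farthest-point initialization is chosen only to make the example reproducible.

```python
import math


def cluster_flow_vectors(flow_vectors, k, velocity_weight, n_iter=100):
    """Kmeans-style clustering of (x, y, vx, vy) flow vectors.

    Returns one cluster label per flow vector. Velocity components are
    scaled by `velocity_weight` before distances are computed (assumption).
    """
    feats = [(x, y, velocity_weight * vx, velocity_weight * vy)
             for x, y, vx, vy in flow_vectors]
    # Farthest-point initialization: deterministic, for reproducibility.
    centers = [feats[0]]
    while len(centers) < k:
        centers.append(max(feats, key=lambda f: min(math.dist(f, c) for c in centers)))
    labels = [0] * len(feats)
    for _ in range(n_iter):
        # Assignment step: nearest center in the weighted feature space.
        labels = [min(range(k), key=lambda j: math.dist(f, centers[j])) for f in feats]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [f for f, lab in zip(feats, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels
```

With a weight of zero, two opposing streams that share the same road are grouped purely by position and end up mixed within each cluster; with a sufficiently large weight, the same data separates by flow direction.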
The next parameters are , , , and . These parameters determine the shape and size of the double-ellipse. In general, the double-ellipse should be small enough to avoid marking as reachable motion components that belong to different motion patterns, yet large enough to jump gaps in the motion component distribution within motion patterns. The example in Figure 14 illustrates the outcome of setting the and of the ellipses too high. The “wider” ellipses jump the gap between lanes. Similar logic applies to manipulating , values. The parameter should be chosen in a similar way except that this parameter deals with the flow direction rather than spatial position. should be large enough to capture deviations within motion patterns but not so large as to make reachable motion components that belong to different motion patterns, as is demonstrated in Figure 14, where was set to 120 from its usual 12. Too low a value, on the other hand, will yield results where even small differences in flow direction make motion components unreachable. This effect is demonstrated in Figures 14, 14 and 14, where the right turn is split into three motion patterns.
The parameter also deals with the flow direction of motion components. If this parameter is high, the algorithm expects well-formed curves, and if it is small, the algorithm will look for similar flow directions between reachable motion components rather than well-formed curves. If this parameter is set too high, the algorithm will be sensitive to noisy data and find curves where there are none. On the other hand, if it is set too low, the algorithm will have trouble finding curves, leading to the same problem as the one demonstrated in Figures 14, 14 and 14. Because the algorithm expects curves, it sometimes connects two motion components with very different flow directions. To avoid this situation, a threshold is placed on the absolute value of . In general, this will be two or three times the value of .
The parameters discussed so far determine the core of the reachability relation. Among the remaining reachability parameters, the search distance deals with the unblocking process and the wedge parameters allow semi-lateral motions. To test the core reachability, the path reachable motion components and the signature of a few motion components can be computed while discounting the semi-lateral motion and the unblocking process, by setting the wedge parameters to 0 and the search distance to 1. The signature of a motion component can then be plotted and evaluated. For some datasets, the signatures will be proper for most motion components but will end unexpectedly for others because of blocked motion components. To fix this, the search distance can be increased. However, if the search distance is too large, the algorithm will make unreasonably distant motion components reachable. Once again, the search distance value can be evaluated by visualizing the signatures of a sample of the motion components.
Additionally, the wedge parameters can be set to nonzero values to include semi-lateral reachability and improve the signatures. In some datasets, the signatures will not cover the entire width of motion patterns, as in Figure 15, where a signature for a motion component is plotted. In these cases, the wedge should be used to improve the results. should be just high enough to include the width of motion patterns with somewhat noisy flow direction. should be just high enough to include gaps within the width of the motion pattern. should be high enough to expand a signature to the width of the motion pattern within a short distance, as demonstrated in Figure 15. An example of wedge parameters at work is given in Figure 14, where = 90, = 20, = 15, and the algorithm recovers all three parallel southbound lanes as one motion pattern. The impact of the wedge parameters can be examined by visualizing signatures. This concludes the reachability parameters, which can be fine-tuned and tested independently by looking at the signatures of some of the motion components. Finally, there is one parameter for the clustering of motion components. The cutoff value determines the size of the motion patterns. The larger the cutoff, the larger the motion patterns will be.
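The effect of the cutoff can be illustrated with a short sketch of agglomerative clustering under the weighted Jaccard distance, which is the distance our method applies to motion components’ signatures. The sketch makes several assumptions that are not taken from the method itself: a signature is represented as a dictionary mapping motion component ids to non-negative weights, average linkage is used, and merging stops once the closest pair of clusters is farther apart than the hypothetical `cutoff` argument.

```python
def weighted_jaccard_distance(sig_a, sig_b):
    """1 - sum(min) / sum(max) over the union of keys of two weighted sets."""
    keys = set(sig_a) | set(sig_b)
    num = sum(min(sig_a.get(key, 0.0), sig_b.get(key, 0.0)) for key in keys)
    den = sum(max(sig_a.get(key, 0.0), sig_b.get(key, 0.0)) for key in keys)
    return 1.0 - num / den if den > 0 else 0.0


def agglomerate(signatures, cutoff):
    """Average-linkage agglomerative clustering of signatures.

    Returns a list of clusters, each a list of indices into `signatures`.
    """
    clusters = [[i] for i in range(len(signatures))]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(weighted_jaccard_distance(signatures[i], signatures[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > cutoff:
            break  # remaining clusters are too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Raising the cutoff allows more merges and therefore yields fewer, larger groups, which mirrors the behavior described above.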
In many applications, the task of automatically detecting changes in trends is just as interesting as uncovering existing ones. For example, in traffic control, sudden disappearances of motion patterns may indicate lane or street closures or blockages due to accidents, fallen trees, flooding, and other causes. Changes in motion patterns of animals may be indicative of changes in their environment, such as those due to urban expansion or pollution. These changes are of particular interest to conservation biologists. Furthermore, zoologists are frequently interested in analyzing seasonal changes in the movement of animals. In the domain of commerce, merchandisers and marketing professionals can use changes in shopper traffic patterns to improve advertisement and product visibility. We can mine changes in trends of trajectory data by either discovering newly emerged motion patterns or by finding motion patterns that no longer occur. Given trajectories observed during an earlier time interval and during a later one, we generate two sets of motion patterns, one per interval, using our proposed algorithm. Then, for any motion pattern in the later set, we use Kullback-Leibler (KL) divergence to try to find a similar motion pattern in the earlier set. If no match is found, the pattern is newly emerged, a trend that was not previously observed in the earlier interval. Repeating the process for every motion pattern in the earlier set, we can detect the disappearance of a motion pattern if no match is found in the later set. In order to use KL divergence for comparing two motion patterns, we need their probability density functions (pdfs). The pdf of each motion pattern can be obtained by learning a mixture of Gaussians from the flow vectors contained in its constituent motion components. We then draw a sufficient number of samples from the pdf of one motion pattern and evaluate the probability of their occurrence under the pdf of the other. A high probability means that the two motion patterns are similar, as their KL divergence is low.  has employed this approach to merge large numbers of similar motion patterns that occur at different times, while  performs event classification by matching the distribution of motion patterns that minimizes KL divergence.
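The sampling-based comparison can be sketched as a Monte Carlo estimate of KL divergence between two Gaussian mixtures. The sketch below is illustrative only: it assumes mixtures with diagonal covariances, represented as lists of (weight, means, standard deviations) tuples with strictly positive weights, and the function names, sample count and seed are hypothetical. Fitting the mixtures to flow vectors is not shown.

```python
import math
import random


def gmm_logpdf(x, gmm):
    """Log density of point x under a diagonal-covariance Gaussian mixture."""
    terms = []
    for weight, means, stds in gmm:
        lp = math.log(weight)
        for xi, m, s in zip(x, means, stds):
            lp += -0.5 * math.log(2 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        terms.append(lp)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def gmm_sample(gmm, rng):
    """Draw one point: pick a component by weight, then sample each dimension."""
    r, acc = rng.random(), 0.0
    for weight, means, stds in gmm:
        acc += weight
        if r <= acc:
            break
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


def kl_monte_carlo(p, q, n_samples=2000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) for x drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n_samples
```

A motion pattern could then be declared a match for another when the estimate falls below a threshold chosen by the analyst; the threshold itself is application dependent and is not prescribed here.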
In this paper, we considered the general task of trajectory clustering using a novel approach inspired by the motion pattern idea. Our method consists of four main steps. First, we break down trajectories into flow vectors. Second, using Kmeans clustering, we extract motion components. Third, we use the double-ellipse and the wedge conditions, in addition to the unblocking procedure, to find reachable pairs of motion components. Finally, using the path reachability and signature concepts, we form motion patterns via agglomerative clustering with the weighted Jaccard distance between motion components’ signatures. We evaluated our proposed method on five different datasets. Experimental results indicate that our motion pattern approach gives an effective solution to the general task of trajectory clustering regardless of the dataset properties. Extracted motion patterns closely fit the annotations, prevailing paths, or descriptions of trajectory datasets where available. We comprehensively discussed the effects of model parameters and provided a selection process for them. We also noted that the actual optimum set of parameters will rely on domain knowledge as well as specific analytical goals. In addition, we discussed how our proposed model is well suited for automatically detecting changes in frequent behaviors of trajectories over time.
Overall, we believe that we have provided a new approach for understanding trajectory behavior. Its output is comparable to the output of those trajectory clustering methods that look for regional similarities among trajectories. It is capable of handling a variety of challenges provided by different datasets. Our proposed method can provide data analysts a good starting point for understanding the hidden behavioral patterns in enormous and complex trajectory datasets.
-  S. Atev, G. Miller, and N. P. Papanikolopoulos, “Clustering of vehicle trajectories,” Trans. Intell. Transport. Sys., vol. 11, no. 3, pp. 647–657, Sep. 2010.
-  Z. Fu, W. Hu, and T. Tan, “Similarity based vehicle trajectory clustering and anomaly detection,” in Image Processing, 2005. ICIP 2005. IEEE International Conference on, vol. 2. IEEE, 2005, pp. II–602.
-  B. Morris and M. Trivedi, “Learning trajectory patterns by clustering: Experimental studies and comparative evaluation,” in Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, 2009.
-  W. Hu, D. Xie, Z. Fu, W. Zeng, and S. Maybank, “Semantic-based surveillance video retrieval,” Image Processing, IEEE Transactions on, vol. 16, no. 4, pp. 1168–1181, 2007.
-  F. Bashir, A. Khokhar, and D. Schonfeld, “Object trajectory-based activity classification and recognition using hidden markov models,” Image Processing, IEEE Transactions on, vol. 16, no. 7, pp. 1912–1919, 2007.
-  E. J. Keogh and M. J. Pazzani, “Scaling up dynamic time warping for datamining applications,” in Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2000, pp. 285–289.
-  D. Buzan, S. Sclaroff, and G. Kollios, “Extraction and clustering of motion trajectories in video,” in Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, vol. 2. IEEE, 2004, pp. 521–524.
-  C. Piciarelli and G. L. Foresti, “On-line trajectory clustering for anomalous events detection,” Pattern Recognition Letters, vol. 27, no. 15, pp. 1835–1842, 2006.
-  S. Atev, O. Masoud, and N. Papanikolopoulos, “Learning traffic patterns at intersections by spectral clustering of motion trajectories,” in Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on. IEEE, 2006, pp. 4851–4856.
-  B. Morris and M. Trivedi, “An adaptive scene description for activity analysis in surveillance video,” in Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, 2008, pp. 1–4.
-  D. Biliotti, G. Antonini, and J. P. Thiran, “Multi-layer hierarchical clustering of pedestrian trajectories for automatic counting of people in video sequences,” in Application of Computer Vision, 2005. WACV/MOTIONS’05 Volume 1. Seventh IEEE Workshops on, vol. 2. IEEE, 2005, pp. 50–57.
-  G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: Hierarchical clustering using dynamic modeling,” Computer, vol. 32, no. 8, pp. 68–75, 1999.
-  X. Li, W. Hu, and W. Hu, “A coarse-to-fine strategy for vehicle motion trajectory clustering,” in Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, vol. 1. IEEE, 2006, pp. 591–594.
-  N. Ferreira, J. T. Klosowski, C. Scheidegger, and C. Silva, “Vector field k-means: Clustering trajectories by fitting multiple vector fields,” in EuroVis, 2013.
-  H. ru Wu, M.-Y. Yeh, and M.-S. Chen, “Profiling moving objects by dividing and clustering trajectories spatiotemporally,” IEEE Transactions on Knowledge and Data Engineering, vol. 99, no. PrePrints, 2012.
-  M. Ulm and N. Brändle, “Robust online trajectory clustering without computing trajectory distances,” in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 2270–2273.
-  F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi, “Trajectory pattern mining,” in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2007, pp. 330–339.
-  J.-G. Lee and J. Han, “Trajectory clustering: A partition-and-group framework,” in SIGMOD, 2007, pp. 593–604.
-  J.-G. Lee, J. Han, X. Li, and H. Gonzalez, “Traclass: trajectory classification using hierarchical region-based and trajectory-based clustering,” pp. 1081–1094, 2008.
-  B. D. Lucas, T. Kanade et al., “An iterative image registration technique with an application to stereo vision.” in IJCAI, vol. 81, 1981, pp. 674–679.
-  R. Gurka, A. Liberzon, D. Hefetz, D. Rubinstein, and U. Shavit, “Computation of pressure distribution using piv velocity data,” in Workshop on Particle Image Velocimetry, 1999.
-  I. N. Junejo, O. Javed, and M. Shah, “Multi feature path modeling for video surveillance,” in Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, vol. 2. IEEE, 2004, pp. 716–719.
-  M. Hu, S. Ali, and M. Shah, “Detecting global motion patterns in complex videos,” in Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, 2008, pp. 1–5.
-  D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, no. 5, pp. 603–619, 2002.
-  A. Basharat, A. Gritai, and M. Shah, “Learning object motion patterns for anomaly detection and improved object detection,” in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. IEEE, 2008, pp. 1–8.
-  V. Lasdas, R. Timofte, and L. Van Gool, “Non-parametric motion-priors for flow understanding,” in Applications of Computer Vision (WACV), 2012 IEEE Workshop on. IEEE, 2012, pp. 417–424.
-  I. Saleemi, K. Shafique, and M. Shah, “Probabilistic modeling of scene dynamics for applications in visual surveillance,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 31, no. 8, pp. 1472–1485, 2009.
-  M. Hu et al., “Learning motion patterns in crowded scenes using motion flow field,” in 2008 19th International Conference on Pattern Recognition, 2008, pp. 1–5.
-  S. Baker and I. Matthews, “Lucas-kanade 20 years on: A unifying framework,” International Journal of Computer Vision, vol. 56, no. 3, pp. 221–255, 2004.
-  C. E. Rasmussen, “Gaussian processes for machine learning,” 2006.
-  X. Zhao and G. Medioni, “Robust unsupervised motion pattern inference from video and applications,” in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 715–722.
-  Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, “Rasl: Robust alignment by sparse and low-rank decomposition for linearly correlated images,” in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010, pp. 763–770.
-  J. Prokaj, M. Duchaineau, and G. Medioni, “Inferring tracklets for multi-object tracking,” in Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on. IEEE, 2011, pp. 37–44.
-  J. Prokaj and G. Medioni, “Using 3d scene structure to improve tracking,” in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011, pp. 1337–1344.
-  P. Mordohai and G. Medioni, “Dimensionality estimation, manifold learning and function approximation using tensor voting,” The Journal of Machine Learning Research, vol. 11, pp. 411–450, 2010.
-  I. Saleemi, L. Hartung, and M. Shah, “Scene understanding by statistical modeling of motion patterns,” in Computer Vision and Pattern Recognition (CVPR), IEEE Conference on, 2010, pp. 2069–2076.
-  “Vehicle Motion Trajectory Dataset,” http://18.104.22.168/dataset/trajectory/dataInfo.htm.
-  “Atlantic Hurricane Dataset (HURDAT2),” http://www.nhc.noaa.gov/data/hurdat.
-  M. N. Kochert, M. R. Fuller, L. S. Schueck, L. Bond, M. J. Bechard, B. Woodbridge, G. L. Holroyd, M. S. Martell, and U. Banasch, “Migration patterns, use of stopover areas, and austral summer movements of swainson’s hawks,” The Condor, vol. 113, no. 1, pp. 89–106, 2011.
-  “Movebank Data Repository,” http://www.Movebank.org.
-  “The Greek Trucks Dataset,” http://www.chorochronos.org.
-  “The Next Generation SIMulation-Lankershim Boulevard Dataset,” http://ngsim.fhwa.dot.gov/.
-  Y. Yang, Z. Cui, J. Wu, G. Zhang, and X. Xian, “Trajectory analysis using spectral clustering and sequence pattern mining,” Journal of Computational Information Systems, vol. 8, no. 6, pp. 2637–2645, 2012.
-  Z. Zhang, K. Huang, and T. Tan, “Comparison of similarity measures for trajectory clustering in outdoor surveillance scenes,” in Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, vol. 3. IEEE, pp. 1135–1138.
-  E. Ricci, F. Tobia, and G. Zen, “Learning pedestrian trajectories with kernels,” in Pattern Recognition (ICPR), 2010 20th International Conference on. IEEE, 2010, pp. 149–152.
-  M. Meilă, “Comparing clusterings—an information based distance,” Journal of Multivariate Analysis, vol. 98, no. 5, pp. 873–895, 2007.
-  E. Masciari, “A complete framework for clustering trajectories,” in Tools with Artificial Intelligence, 2009. ICTAI’09. 21st International Conference on. IEEE, 2009, pp. 9–16.
-  ——, “Finding homogeneous groups in trajectory streams,” in Proceedings of the Third ACM SIGSPATIAL International Workshop on GeoStreaming. ACM, 2012, pp. 11–18.
-  J. Kang and H.-S. Yong, “Mining spatio-temporal patterns in trajectory data.” JIPS, vol. 6, no. 4, pp. 521–536, 2010.
-  Y. Zhang and D. Pi, “A trajectory clustering algorithm based on symmetric neighborhood,” in Computer Science and Information Engineering, 2009 WRI World Congress on, vol. 3. IEEE, 2009, pp. 640–645.
-  R. Srikant and R. Agrawal, Mining sequential patterns: Generalizations and performance improvements. Springer, 1996.
-  J. Han, J. Pei, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu, “Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth,” in proceedings of the 17th international conference on data engineering, 2001, pp. 215–224.
-  B. Guan, L. Liu, and J. Chen, “Using relative distance and hausdorff distance to mine trajectory clusters,” TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 11, no. 1, pp. 115–122, 2013.
-  “Traclus Implementation,” http://dm.kaist.ac.kr/jaegil/Publications.
-  S. Khokhar, I. Saleemi, and M. Shah, “Similarity invariant classification of events by kl divergence minimization,” in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 1903–1910.