1 Introduction
The concept of coherence in fluid flows has historically been used to delineate packets of fluid elements that persist while the flow evolves without significant mixing with the surrounding fluid regions. Coherence can frequently be visualized qualitatively by observing the evolution of passive tracers in a flow (e.g. Haller (2015), Huhn et al. (2012)). However, mathematical frameworks are needed to quantify such structures objectively. Eulerian techniques for coherent structure identification include the qcriterion (Hunt et al., 1988), criterion (Jeong & Hussain, 1995), and the OkuboWeiss parameter (Okubo, 1970; Weiss, 1991). All of these methods are framedependent, however. Frame invariance is an important characteristic of a method for determining coherent structures; if a method identifies a structure boundary in one frame of reference, but not in another (for example, in a rotating reference frame), then the method may not be selfconsistent in its characterization of fluid coherence (Haller, 2005).
As an alternative, Lagrangian techniques have also been developed, many based on analysis of the deformation gradient tensor of the flow field
(Haller & Yuan, 2000; Shadden et al., 2005). These methods use information regarding the trajectories of fluid particles, as opposed to their velocity or acceleration, which ensures frame invariance since relative particle position does not depend on the reference frame in which it is measured.Despite the vast array of current applications, identification of coherent structures based on the deformation gradient has some limitations as well. For instance, knowledge of a discretized version of the entire flow field is typically required, of the sort obtained from computational analysis or particle image velocimetry (PIV). However, some common empirical tools for fluid flow measurement do not provide velocity data in the whole flow field. One such technique, particle tracking velocimetry (PTV), (e.g. Chang et al. (1984), Racca & Dewey (1988)), is particularly useful in applications where velocity data in three dimensions is required, or where the entire flow cannot be densely and uniformly seeded, as in studies of ocean currents. Particle tracking algorithms often result in much sparser velocity measurements than techniques such as PIV. For example, Davis (1991) reviews a wide range of studies utilizing artificial ocean drifters to study nominally twodimensional ocean surface flows, where the number of drifters ranges from 14 to 300 per study. In three dimensions, a number of PTV studies have utilized between 800 and 5000 particle trajectories (Virant & Dracos, 1997; Lüthi et al., 2005; Murai et al., 2007; Kim et al., 2013). Recently, several PIV and PTV techniques have been developed to increase the density of data collected (Elsinga et al., 2006; Schanz et al., 2016). Despite these advances, in many situations seeding the flow with a large number of particles can be extremely difficult, e.g. in situ measurements of atmospheric and oceanic flows that rely on naturally occurring particulate fields (Ho et al., 2014; Katija & Dabiri, 2008). Hence, the need for techniques to analyze sparse data persists.
Deformationgradient based methods for identification of coherent structures fail for sparse trajectory data due to the assumptions inherent in analyzing the deformation gradient tensor, , where maps the initial location of a fluid element, , to its location at a later time. The principal assumption that is no longer satisfied is the initial close separation of flow trajectories, since the trajectory spacing cannot be controlled a priori. Moreover, the determination of the finite time Lyapunov exponent (FTLE) field requires linearization of the flow map (Haller & Yuan, 2000), which also breaks down for wellspaced flow trajectories. Therefore, an alternate approach is needed to extract coherent structures from sparse velocity data.
Several methods have been developed recently in an attempt to address this issue, a number of which are reviewed by Allshouse & Peacock (2015). One such method is based on braid theory (Allshouse & Thiffeault, 2012), which maps two dimensional fluid particle trajectories into three dimensions, where time is the third dimension. Plotted in this way, the entwinement of trajectories with each other can be analyzed, and surfaces surrounding sets of trajectories that do not grow with time can be identified, indicating the presence of coherent structures. While the braid method is useful for twodimensional datasets, its extension to higher spatial dimensions has not yet been achieved. Additionally, the braidbased analysis can become quite complex and computationally expensive for large numbers of trajectories.
A second method developed for use with sparse data sets is the clusterbased approach by Froyland & PadbergGehle (2015)
, and it is well suited for PTV datasets due to its ability to handle both sparse and incomplete fluid particle trajectories. The method uses the Euclidean distance from each particle to the center of each of a predetermined number of clusters to assign to each fluid particle a probability of belonging to each cluster while simultaneously determining the location of the cluster centers. This is accomplished using the iterative fuzzy cmeans algorithm developed for use in clustertheory
(Bezdek et al., 1984). While the authors proved that this method can accurately detect coherent structures from a variety of benchmark flows, in addition to global ocean drifter data, it has the distinct disadvantage that the number of clusters be known a priori.In order to address the need to determine the number of clusters a priori in the clusterbased approach of Froyland & PadbergGehle (2015), Hadjighasem et al. (2016)
recently developed a method based on spectral graph theory. This method relies on the concept of a graph, or a set of nodes connected by edges, where in this case, the nodes represent Lagrangian particles, and the edges are weighted by the inverse of the average distance between particle pairs. By examining the smallest magnitude eigenvalues associated with the generalized eigenvalue problem of the graph Laplacian, the authors developed a heuristic for determining the number of coherent structures in the flow. The authors use this information as input to the Kmeans clustering approach to determine the centers of the coherent structures in the flow. The testcases used by the authors in validating this approach demonstrate its effectiveness in identifying coherent structures without knowing the number of clusters a priori. However, for most flows analyzed, the number of trajectories used, on the order of tens of thousands, far exceeds the number of trajectories available for most PTV and ocean drifter datasets. It is also important to note that the method for determining the number of coherent structures is heuristic and therefore difficult to generalize. Other graph theorybased methods have also been developed recently to address the issues associated with current cluster and braid based approaches
(Banisch & Koltai, 2016).We propose an alternate graph theorybased metric that weights the edges not by the distance between corresponding particles, but by a metric of kinematic dissimilarity, regardless of spatial proximity. This method is frame invariant because it does not consider particle velocity, only the spatial location of each fluid particle relative to other particles in the flow. In analogy to graph coloring algorithms that partition nodes with large connecting edge weights, the present method solves an eigenvalue problem to partition fluid particle trajectories according to their kinematic dissimilarity. This approach can be considered an application of spectral graph drawing, which uses eigenvectors of matrices associated with a graph to visualize certain characteristics of the graph
(Koren, 2005). The present method results in a coherent structure coloring (CSC) field, where similar values of CSC indicate regions of the flow that are coherent, according to the present definition. In this way, all coherent structures in the flow can be identified simultaneously, without prior knowledge of their number. Methods for extracting individual coherent structures from the CSC field are discussed. The method was tested using three validation cases, including two canonical analytic flows: an oscillating quadruplegyre and a Bickley jet; and one experimental dataset: a twodimensional crosssection of a high strokeratio vortex ring.The following section details the mathematical derivation of the algorithm used for coherent structure identification. Section 3 presents the results of the three test cases described above and characterizes the sensitivity of the method to certain parameters, including the number and initial location of the particles. Section 4 compares the results of the CSC method to the results of other graph theorybased methods. Section 5 summarizes the results of the study and provides avenues for future development.
2 Methods
Coherence, defined here as the kinematic similarity of Lagrangian fluid trajectories, regardless of their spatial proximity, can be identified in flows with arbitrary timedependence using a graph theorybased approach. The graph is defined as the superset , where represents the set of nodes in the graph, is the set of edges connecting the nodes, and are the weights corresponding to the edge set. Assuming that the trajectories of a set of Lagrangian fluid particles is known at
time steps, a graph representing the flow can be constructed, wherein each node represents a fluid particle. Unlike previous methods that weight the edges of such a graph based on the proximity of the fluid trajectories (e.g. using Euclidian distance), here we use a weight based on kinematic dissimilarity. We hypothesize, and later demonstrate, that coherent structures can be identified more robustly by quantifying the extent to which fluid trajectory kinematics are different, rather than the extent to which fluid particle trajectories remain in proximity over time. To this end, each edge, representing the connection between a pair of particles, is weighted by the standard deviation of the distance between the two fluid trajectories over their duration, normalized by the average distance between the fluid particle trajectories during the same period. The edge weights can be represented numerically using the weighted adjacency matrix
, where contains the weight of the edge connecting particle and particle :(1) 
where is the distance between two particles and at time , and is the average distance between the two fluid particle trajectories.
Graph coloring is a labeling of nodes in a graph such that node pairs with large edge weights are assigned dissimilar values (Muñoz et al., 2005). This makes graph coloring a natural tool for coherent structure identification based on the kinematic dissimilarity metric in equation 1. We pose this as the onedimensional problem of maximizing
(2) 
where is the number of rows and columns in the weighted adjacency matrix (i.e. the number of particles) and
is a row vector containing the value of coherent structure coloring (CSC) associated with each particle. By maximizing
we are in effect determining CSC values such that fluid particle trajectories that are kinematically dissimilar (i.e. where the weight of the edge between them is large) are assigned CSC values that are as different as possible. Following Hall (1970), we define the degree matrix , which contains the row sums of the adjacency matrix along the diagonal, as(3) 
We also define the graph Laplacian, . The maximization problem can then be manipulated as follows:
(4)  
(5)  
(6)  
(7) 
In order to avoid the trivial case where , and to bound to finite values, the constraint is imposed (another finite constraint can be imposed without loss of generality). Given this constraint, the maximization problem can be written in Lagrangian form as . As a necessary condition for to be a local maximum, , yielding , which can be written as
(8) 
which is the generalized eigenvalue problem for the graph Laplacian. The generalized eigenvector that maximizes is the eigenvector corresponding to the maximum eigenvalue of equation 8 (Hall, 1970). Each element of
assigns that value of CSC to the corresponding fluid particle at the final time of the interval over which particle trajectories were compared. The CSC vector can be mapped to the space of the original flow with arbitrary dimensionality based on the spatial locations of the particles. Interpolation between the particles facilitates construction of the corresponding CSC field. Thus, regions in the flow with a similar value of coherent structure coloring indicate coherence.
3 Results
The effectiveness of the coherent structure coloring algorithm is demonstrated using three example flows. The first, a quadruple gyre, is an extension of the double gyre, which is frequently used in vortex detection algorithm verification (Allshouse & Peacock, 2015; Froyland & PadbergGehle, 2015). Both the steady and the unsteady cases are examined. The second verification case is the Bickley jet, which introduces complexities due to the presence of multiple coherent vortices as well as a meandering jet. Finally, we apply the CSC method to sparse trajectories derived from a particle image velocimetry (PIV) dataset of a long stroke ratio vortex ring, where secondary and tertiary rings in addition to a trailing jet form behind the primary vortex ring. This shows the robustness of the algorithm to errors associated with experimental data. For the two analytical validation cases, a fifthorder, RungeKutta method was used to determine fluid trajectories. For the PIV dataset, particle velocities were determined using linear interpolations between velocity vectors and particles were advected using the Euler method.
3.1 Quadruple Gyre
First, we examine the characteristics of the coherent structure coloring algorithm using the analytical quadruple gyre flow. This flow is defined by
(9)  
(10) 
where and are the spatial coordinates, is time, and
(11) 
Here we examine two parameter cases: the steady case where and ; and the unsteady case where , , and . Figure 1 shows the velocity vector field, streamlines, and coherent structure coloring for the steady quadruple gyre, tracking only 300 particles over 40 time units.
CSC is able to clearly delineate the four quadrants of the flow, and assigns a high value to the gyre centers. Gyres in opposite corners have approximately the same value of coloring, due to their identical rotational orientation (clockwise in the upper left and lower right, and counterclockwise in the upper right and lower left quadrants). This is a result of having a measure of coherence that does not conflate kinematic similarity and physical proximity. The latter does not necessarily imply the former, and vice versa. Large weights in the present adjacency matrix correspond to fluid particles that are kinematically dissimilar, and given that the weights correspond to the standard deviation of the distance between two particles divided by their average distance, the mean distance between particles and the standard deviation in their distances both contribute to coherence as defined by this algorithm.
The flow becomes significantly more complex when periodic oscillatory unsteadiness is introduced. When comparing the coherence coloring to the FTLE of the same flow computed using 65,000 particles, seen in figure 2(c), it is clear that the largest positive value of coherence coloring correctly identifies all four vortex cores. The negative values of CSC correspond with the regions in which particles have switched quadrants due to the oscillatory nature of the flow, as indicated by the red dots in figure 2(a).
In this case, the largest kinematic dissimilarity in the flow is between those particles that remain near the center of the quadrant in which they started, and those particles that switch quadrants. This is highlighted by the particle traces shown in figure 2(b), which shows the trajectories of the particles with the largest positive value of coloring (in black) and those with the largest negative value of coloring (in red). This result is in contrast to the steady case, in which the sign of vortex rotation was the predominant distinguishing feature. Notably, the CSC algorithm can be applied recursively to the subset of particles with similarly high values of CSC in figure 2(a), in order to recover the vortex orientation information in figure 1(c). This is demonstrated in figure 3.
Here we see that for an initial CSC field calculated with 3000 particles, all four gyre centers are identified with a high CSC value. In order to detect more subtle differences between the four gyre cores, a threshold value of is set, indicated by the solid black lines surrounding the gyre cores in 3(a). When the CSC algorithm is applied using only the particles exceeding this threshold value, the information distinguishing the gyres that rotate clockwise from those that rotate counterclockwise (as in the steady quadruple gyre case) is recovered.
For this dataset, a clusterbased method using fuzzy cmeans clustering would correctly identify the four quadrants of the steady quadruple gyre as four separate coherent structures, if it is assumed a priori that there are four clusters present. For the unsteady case, assuming the presence of four structures, the four gyre cores would again be correctly identified using this method. However, if more than four coherent structures are assumed, the clustering method will detect the four cores and a number of additional structures, some of which may correspond to fluid parcels that did not switch quadrants but are not adjacent to the gyre cores (Allshouse & Peacock, 2015). A braid based analysis for this flow would likely identify eight structures, again assuming an extension of the results of the double gyre system in Allshouse & Peacock (2015). In summary, the cluster based method requires a priori knowledge of the number of structures present, and the braid based analysis cannot be easily extended to three dimensions and is computationally expensive. The CSC method addresses all of these issues.
3.2 Bickley Jet
The Bickley jet, another analytical example, is frequently used as a model of zonal jets in the Earth’s atmosphere (Rypina et al., 2007). It is a periodic flow comprising a spatially undulatory jet with counter rotating vortices above and below. The flow is described by the stream function , where
(12)  
(13) 
We use similar values of the parameters as in Hadjighasem et al. (2016): ms, km, , , , , , and , , , and the flow is computed on the interval , m, , m, over the time interval , days, divided into 607 discrete time steps. The flow was considered periodic in . For calculation of the CSC, particles were initialized randomly in the domain and advected with the flow. The particles were followed over the entire time interval, even if they left the domain, analogous to how ocean drifters are tracked. The velocity magnitude of the flow overlaid with streamlines is shown in 4, along with the FTLE field calculated using 48,000 particles.
Figure 5 shows the results of the coherent structure coloring algorithm using only 480 particles: (a) shows the CSC field overlaid with black dots indicating the final particle positions, and (b) shows particle tracks for the highest positive coloring values (in black) and the highest negative coloring values (in red).
Without specifying the number of coherent structures a priori, the algorithm is able to accurately detect the centers of the three vortices above the jet and two full and two half vortices below. However, if the seeding density was too low, such that one of the vortices contained no particles, or only a few, the structure could not be identified. The jet itself is aligned with the most negative coloring contours, indicating that the largest kinematic dissimilarity detected is between the jet and the vortices flanking it. It is noteworthy that the eigenvector associated with the largest eigenvalue of the generalized eigenvalue problem (i.e. the CSC) provides information about all of the coherent structures simultaneously, as opposed to cut approaches that recover one structure per eigenvalue among those selected heuristically. This is in part because the present algorithm avoids unnecessarily restricting the definition of coherence to particles that are physically close; even the relative kinematics of particles that are far apart provides useful information regarding the coherent structures in the flow.
The Bickley jet flow can also be used to characterize the robustness of the CSC method to broken or incomplete fluid particle trajectory information. Two types of incomplete datasets are examined, each corresponding to a specific experimental circumstance in which data might be lost. First, we examine the case in which a fluid particle trajectory is lost and later recovered, but is still identified as the same particle as before the data loss. This could be the case for ocean drifters, e.g. when a drifter temporarily goes offline with associated data loss. Additionally, certain PTV algorithms are capable of linking broken trajectories if data sets are sparse enough and particle trajectory crossings do not occur when breaks occur (Li et al., 2008). To characterize the response of the CSC method to this type of data loss, a dataset containing full trajectories of 480 particles was manipulated to randomly remove 10%, 50%, and 90% of the trajectory data. When considering the weight of a pair of particles, the CSC algorithm only considers the overlapping time interval in which both particles are present. If a particle is not present during the time step for which the CSC field is computed, the trajectory information for that particle is not considered in the calculation of the adjacency matrix. Hence, the size of the adjacency matrix is , where is the number of particles present in the domain during the time step in which the CSC field is calculated. For this analysis, it was ensured that all 480 particles were present in the final time step so that their trajectory information could be used to calculate the CSC field. This was done so that the analysis would show the effect of intermittent data loss as opposed to the effect of total number of particles. The results are shown in figure 6.
From this figure it is evident that in the case where pieces of particle trajectories can be linked and identified as broken pieces of the same trajectory, intermittent data loss does not adversely affect the robustness of the CSC algorithm, as long as there are a sufficient number of particles present in the time step for which the CSC field is calculated.
In order to characterize the CSC method in cases where broken trajectories cannot be reconstructed and the particle tracks must be treated as independent fluid particles, a baseline group of 480 particles was again examined. A portion of the position data was then randomly deleted, and every time a break in the position data occurred, the remainder of the track was recharacterized as a separate particle trajectory. The results from this analysis are shown in figure 7.
For the case with 480 unbroken particle trajectories, shown in figure 5(a), all particles have a trajectory length that spans the entire time domain. For the shorter particle trajectories shown in figure 7, it is evident that as the average particle trajectory length is shortened, the flanking vortices appear to blend together into two large coherent structures, one below the jet and one above. This result can be understood by considering the length of the fluid particle trajectories relative to the eddy turnover time. Based on the flow velocity around closed streamlines for the Bickley jet flow at
, the eddy turnover time is estimated to be approximately
s, and the time interval , s is equivalent to approximately 12.4 eddy turnover times. Thus, while it is clear that the CSC algorithm is capable of handling broken trajectories of this type, an average trajectory length of approximately 2.6 eddy turnover times, as in figure 7(a), is necessary to distinguish individual coherent structures. Otherwise, there is not enough information is available to effectively characterize the flow, even if the total observation time is a larger multiple of the eddy turnover time.It is also useful to analyze and understand the response of the method to a large number of particles, approaching the quantity used for nonsparse methods such as FTLE analysis. As such, a CSC analysis of a Bickley jet seeded with 12000 particles was examined, and the resulting CSC field is shown in figure 8.
The features of this CSC field are similar to what would be seen by clusteringbased methods, if thresholding of the CSC were used to separate the vortex cores into distinct structures. There are also similarities with what would be seen if vortices were extracted from the flow using the forward and reverse FTLE fields, including the five isolated vortex cores and two half cores. Although not demonstrated here, the subsequent extraction of the coherent structures from the CSC field can be performed in a manner similar to the FTLE field analysis. For example, one option is to use thresholding of the CSC field to determine boundaries of the coherent structures. Additionally, the spatial gradients of the CSC field can be used to separate coherent structures from the background flow.
In assessing the feasibility of high resolution CSC analysis, computational time is an important factor to consider. Table 1 provides a summary of computational run times on a single processor for the Bickley jet with three seeding densities.
480 particles  2400 particles  12000 particles  
adjacency matrix calculation  2.8 s  79.1 s  2345.2 s 

eigendecomposition  0.3 s  1.9s  213.2 s 
3.3 Vortex Ring
Next, we examine a PIV dataset of a forming vortex ring with a high maximum stroke ratio. The vortex ring is created in a water tank using a piston forced at speed a distance though a hollow cylinder of diameter , which in turn ejects fluid from an axisymmetric nozzle with a sharp edge. The shear layer formed inside the nozzle due to the motion of fluid through it rolls up at the nozzle exit forming a vortex ring. If the maximum stroke ratio, , where is the total time over which the piston is moving, is greater than 4, then a trailing jet and potentially secondary and tertiary vortex rings are formed behind the primary vortex ring (Gharib et al., 1998). The dataset examined here is of a vortex ring formed using a piston traveling with a constant velocity until a maximum stroke ratio of 8. The Reynolds number based on the diameter is approximately 1800. Details of the experimental setup and acquisition and processing of the PIV images for this dataset can be found in SchlueterKuck & Dabiri (2016).
Figure 9 (a) shows the vorticity field at , after the piston has stopped moving, where is the nondimensional time, equivalent to the number of piston diameters that the piston has traveled.
At this point, the leading vortex ring, as well as secondary and tertiary vortex rings and a trailing jet, are clearly visible. The corresponding backward FTLE field, computed using 30,500 particles is shown in figure 9 (b). Figure 10 shows the CSC calculated using a total of 150, 300, 600, and 2400 particles initiated at the nozzle exit plane near the left of the frame between and . The CSC algorithm identifies all three vortex rings, which is in qualitative agreement to the dark ridges of the FTLE field.
While the resolution of the CSC contours increases with the number of particles, it is clear that 300 particles is sufficient to obtain a qualitatively similar result to the case with eight times as many particles, and to the FTLE calculation based on 30,500 particles.
The sensitivity of the CSC method to the size of the domain of particles and the time of release was also characterized using the vortex ring PIV data.
In figure 11(a), the entire flow field was initialized with randomly located particles at , and subsequently 1200 additional particles were added between and at the nozzle exit plane. Because the CSC algorithm groups regions with a low normalized standard deviation in relative particle separation, the nominally quiescent background flow was assigned a CSC value that contrasts most sharply with the entire starting jet flow. Consequently, details of the internal structure of the vortex ring and trailing jet are lost. However, if we recursively apply the CSC algorithm only to the flow trajectories in the starting jet, as shown in figure 11(b), we see that the algorithm is able to detect the structure of the primary, secondary, and tertiary vortex rings in greater detail, despite the fewer number of total particles.
4 Comparison with other graph theorybased methods
A related method for coherent structure detection that is also based on graph theory uses the concept of an eigengap heuristic to determine the number of coherent structures present in the flow (Hadjighasem et al., 2016). For this method, the weights assigned to the edges of the graph are equal to the reciprocal of the average distance between particle pairs. The generalized eigenvalue problem solved in this method is , where is the graph Laplacian, and is the diagonal degree matrix where is equal to the sum of the elements in row of the adjacency matrix . This method assumes that by examining the smallest eigenvalues of the generalized eigenvalue problem, the number of coherent structures in a flow can be determined by locating the largest numerical gap between successive eigenvalues; the number of eigenvalues before the gap is assumed to correspond to the number of coherent regions in the flow.
Here we present an analysis of the aforementioned technique and its response to several input variables, comparing its robustness to the method introduced in this paper. This analysis again uses the Bickley jet described by equations 12 and 13 and the values of the parameters listed in section 3.2. The response of the eigenvalue spectrum for the Bickley jet flow to changes in the initial position of tracer particles and to the value of the sparsification parameter are shown in figure 12 for 1920 particles and in figure 13 for 480 particles. For the case with 480 particles initialized randomly in the domain, the exact same particle trajectories were used as in the analysis in section 3.2 to allow for a direct comparison between the two methods.
Based on prior knowledge of the Bickley jet flow, we expect to resolve six coherent structures: the five full vortices flanking the meandering jet, and due to the periodic nature of the flow in the xdirection, two half vortices identified together as a sixth coherent structure. Thus, the eigengap heuristic should predict a numerical gap between the sixth and the seventh eigenvalues. From figure 12(a), (c) we can see that for 1920 particles, regardless of whether particles are initialized on a Cartesian grid or randomly throughout the domain, the largest gap in the smallest 20 eigenvalues is between the sixth and seventh eigenvalues, as expected. However, when the adjacency matrix is not sparsified to remove weights corresponding to particle pairs with an average distance greater than , as shown in figure 12(b), (d), the eigengap is located between the first and second eigenvalues for the random particle initialization and between the second and third eigenvalues for the gridded particle initialization. These results indicate that the eigengap heuristic is sensitive to the level of sparsification used in the adjacency matrix. Consequently, without prior knowledge of the size of the coherent structures to inform an appropriate level of sparsification, this method is not able to robustly determine the number of coherent structures in the flow.
The analysis of the eigenvalue spectrum for 480 particles, the same number used in the analysis of coherent structure coloring for the Bickley jet flow in section 3.2, is shown in figure 13. Here we see in figure 13(a) that for random particle initialization with sparsification, the eigengap is between the ninth and tenth eigenvalues. Additionally, if the particles are initialized on a grid (figure 13(c)), or sparsification is not used (figure 13(b)), or both (figure 13(d)), the eigengap is also not correctly identified. Based on this analysis, we can conclude that for small numbers of tracer particles, on the order of to , use of the eigengap heuristic to determine the number of coherent structures in the flow is ineffective based on the lack of robustness to initial tracer locations (often beyond the control of the investigator for experimental applications) and to the level of sparsification of the adjacency matrix (which requires a priori knowledge of the size of the coherent structures, if sparsification is to be used effectively).
Despite an incorrect identification of the eigengap, the coherent structures can theoretically still be identified using a Kmeans clustering of the eigenvectors associated with the eigenvalues below the eigengap according to this method. To be sure, if the eigengap heuristic overestimates the number of structures, some structures might be split into several, and if the number of structures is underestimated, several structures might be merged together. Thus, for the case where 480 particles were randomly initialized in the domain and sparsification was used in the analysis (i.e. figure 13(a)), the results of the Kmeans clustering is shown in figure 14
, and the nine eigenvectors used in the clustering analysis are shown in figure
15. The clustering analysis searched for ten groups corresponding to the nine coherent structures expected from the eigengap heuristic, in addition to one structure for the incoherent background flow. In figure 14, these ten clusters are plotted between two separate panels to aid in visualization of the individual clusters. From the clustering, we can see that the gray cluster roughly corresponds to the meandering jet, while the flanking vortices are somewhat consistent with the purple, yellow, and light green clusters on the top and the dark green, cyan, and magenta clusters on the bottom. The black, red and blue clusters identify only a few seemingly random particles each. Even if the three smallest clusters are ignored, the boundaries of the clusters corresponding to the vortices are significantly different from the boundaries of the vortices themselves, as observed in the FTLE analysis in figure 4(b). From observing the eigenvectors used for this analysis, it is evident that the first, second, and fifth eigenvectors (figure 15(a),(b),(e)) are responsible for the small clusters identified by the Kmeans analysis, while the remaining six eigenvectors roughly correspond to the boundaries of groups of the flanking vortices and meandering jet. Although not shown, if Kmeans clustering is performed using the third and fifth through ninth eigenvectors to identify seven clusters, the clusters identified are almost identical to the clusters in figure 14 excluding the small blue, red, and black clusters. However, the boundaries of these clusters are still not consistent with the boundaries of the jet and the vortices.5 Conclusions
This paper presents an algorithm for detecting coherence in flows where only sparse velocity data is available, as is often the case in particle tracking velocimetry, or oceanographic tracking of surface floats. In this regime, alternative methods evaluating coherence either require knowledge of the number of coherent structures a priori, or break down due to the sparsity of the data. The present method, based on the concepts of graph coloring and spectral graph drawing, examines the kinematic dissimilarity of every pair of trajectories, and organizes this data into a weighted adjacency matrix. As such, we consider a different definition of coherence from other coherent structure detection methods, which only consider groups of particles that remain close as time progresses without mixing with the surrounding fluid to be coherent structures. The eigenvector associated with the maximum eigenvalue of the generalized eigenvalue problem assigns a value of coherent structure coloring to each particle, such that similar CSC values indicate coherence in the flow. This algorithm has several inherent strengths, including that the number of coherent structures does not need to be known a priori. Additionally, information about all coherent structures in the flow is contained in a single eigenvector of the generalized eigenvalue problem associated with the graph Laplacian. The algorithm is also capable of detecting coherent structures with the small number of trajectories associated with many PTV and ocean drifter datasets, and was shown to be robust to different types of data loss common in particle/drifter tracking applications. Although only two dimensional datasets were examined here, the kinematic dissimilarity metric in equation 1 and the subsequent maximization problem can both be extended to higher dimensions without loss of generality and with limited additional computational cost, since the adjacency matrix scales with the square of the number of particles, independent of the dimensionality of their trajectories. The CSC method has the potential to be extended to analyze other properties of fluid flows in addition to coherence, and it may also find application in other data analysis problems for which coherent structure identification remains a challenge.
A MATLAB implementation of the CSC algorithm is available for free download at http://dabirilab.com/software.
Acknowledgements.
This work was supported by the U.S National Science Foundation and by the Department of Defense (DoD) through the National Defense Science Engineering Graduate Fellowship (NDSEG) Program.References
 Allshouse & Peacock (2015) Allshouse, M. & Peacock, T. 2015 Lagrangian based methods for coherent structure detection. Chaos 25 (097617).
 Allshouse & Thiffeault (2012) Allshouse, M. & Thiffeault, J.L. 2012 Detecting coherent structures using braids. Physica D 241, 95–105.
 Banisch & Koltai (2016) Banisch, R. & Koltai, P. 2016 Understanding the geometry of transport: diffusion maps for Lagrangian trajectory data unravel coherent sets. arXiv:1603.04709v2.
 Bezdek et al. (1984) Bezdek, J., Ehrlich, R. & Full, W. 1984 FCM: The fuzzy cmeans clustering algorithm. Computers & Geosciences 10, 191–203.
 Chang et al. (1984) Chang, T., Wilcox, N. & Tatterson, G. 1984 Application of image processing to the analysis of threedimensional flow fields. Optical Engineering 23 (3), 283–287.
 Davis (1991) Davis, R. 1991 Lagrangian ocean studies. Annu. Rev. Fluid Mech. 23, 43–64.
 Elsinga et al. (2006) Elsinga, G., Scarano, F., Wieneke, B. & van Oudheusden, B. 2006 Tomographic particle image velocimetry. Exp. in Fluids 41, 933–947.
 Froyland & PadbergGehle (2015) Froyland, G. & PadbergGehle, K. 2015 A roughandready clusterbased approach for extracting finitetime coherent sets from sparse and incomplete trajectory data. Chaos 25 (087406).
 Gharib et al. (1998) Gharib, M., Rambod, E. & Shariff, K. 1998 A universal time scale for vortex ring formation. J. Fluid Mech. 360, 121–140.

Hadjighasem et al. (2016)
Hadjighasem, A., Karrasch, D., Teramoto, H. & Haller,
G.
2016 Spectralclustering approach to Lagrangian vortex detection.
Phys. Rev. E 93 (063107).  Hall (1970) Hall, K. 1970 An rdimensional quadratic placement algorithm. Management Science 17 (3), 219–229.
 Haller (2005) Haller, G. 2005 An objective definition of a vortex. J. Fluid Mech. 525, 1–26.
 Haller (2015) Haller, G. 2015 Lagrangian coherent structures. Annu. Rev. Fluid Mech. 47, 137–162.
 Haller & Yuan (2000) Haller, G. & Yuan, G. 2000 Lagrangian coherent structures and mixing in twodimensional turbulence. Physica D 147, 352–370.
 Ho et al. (2014) Ho, J., Toloui, M., Chamorro, L., Guala, M., Howard, K., Riley, S., Tucker, J. & Sotiropoulos, F. 2014 Natural snowfall reveals largescale flow structures in the wake of a 2.5MW wind turbine. Nature Communications 5 (4216).
 Huhn et al. (2012) Huhn, F., von Kameke, A., PérezMuñuzuri, V., Olascoaga, M. & BeronVera, F. 2012 The impact of advective transport by the South Indian Ocean Countercurrent on the Madagascar plankton bloom. Geophysical Research Letters 39 (L06602).
 Hunt et al. (1988) Hunt, J., Wray, A. & Moin, P. 1988 Eddies, streams, and convergence zones in turbulent flows. Center for Turbulence Research Report CTRS88, 193–208.
 Jeong & Hussain (1995) Jeong, J. & Hussain, F. 1995 On the identification of a vortex. J. Fluid Mech. 285, 69–94.
 Katija & Dabiri (2008) Katija, K. & Dabiri, J. 2008 In situ field measurements of aquatic animalfluid interactions using a SelfContained Underwater Velocimetry Apparatus (SCUVA). Limnology and Oceanography: Methods 6, 162–171.
 Kim et al. (2013) Kim, D., Hussain, F. & Gharib, M. 2013 Vortex dynamics of clapping plates. J. Fluid Mech. 714, 5–23.
 Koren (2005) Koren, Y. 2005 Drawing graphs by eigenvectors: theory and practice. Computers and Mathematics with Applications 49, 1867–1888.
 Li et al. (2008) Li, D., Zhang, Y., Sun, Y. & Yan, W. 2008 A multiframe particle tracking algorithm robust against input noise. Meas. Sci. Technol. 19 (105401).
 Lüthi et al. (2005) Lüthi, B., Tsinober, A. & Kinzelbach, W. 2005 Lagrangian measurement of vorticity dynamics in turbulent flow. J. Fluid Mech. 528, 87–118.
 Muñoz et al. (2005) Muñoz, S., Ortuño, M.T., Ramírez, J & Yáñez, J. 2005 Coloring fuzzy graphs. Omega 33, 211–221.
 Murai et al. (2007) Murai, Y., Nakada, T., Suzuki, T. & Yamamoto, F. 2007 Particle tracking velocimetry applied to estimate the pressure field around a Savonius turbine. Meas. Sci. Technol. 18, 2491–2503.
 Okubo (1970) Okubo, A. 1970 Horizontal dispersion of floatable particles in the vicinity of velocity singularities such as convergences. DeepSea Research 17, 445–454.
 Racca & Dewey (1988) Racca, R. & Dewey, J. 1988 A method for automatic particle tracking in a threedimensional flow field. Exp. in Fluids 6, 25–32.
 Rypina et al. (2007) Rypina, I., Brown, M. & BeronVera, F. 2007 On the Lagrangian dynamics of atmospheric zonal jets and the permeability of the stratospheric polar vortex. Journal of the Atmopspheric Sciences 64, 3595–3610.
 Schanz et al. (2016) Schanz, D., Gesemann, S. & Schröder, A. 2016 Shakethebox: Lagrangian particle tracking at high particle image densities. Exp. in Fluids 57 (70).
 SchlueterKuck & Dabiri (2016) SchlueterKuck, K. & Dabiri, J. 2016 Pressure evolution in the shear layer of forming vortex rings. Phys. Rev. Fluids 1 (012501).
 Shadden et al. (2005) Shadden, S., Lekien, F. & Marsden, J. 2005 Definition and properties of Lagrangian coherent structures from finitetime Lyapunov exponents in twodimensional aperiodic flows. Physica D 212, 271–304.
 Virant & Dracos (1997) Virant, M. & Dracos, T. 1997 3D PTV and its application on Lagrangian motion. Meas. Sci. Technol. 8, 1539–1552.
 Weiss (1991) Weiss, J. 1991 The dynamics of enstrophy transfer in twodimensional hydrodynamics. Physica D 48, 273–294.
Comments
There are no comments yet.