Randomized methods to characterize large-scale vortical flow network

by   Zhe Bai, et al.
Berkeley Lab

We demonstrate the effective use of randomized methods for linear algebra to perform network-based analysis of complex vortical flows. Network theoretic approaches can reveal the connectivity structures among a set of vortical elements and analyze their collective dynamics. These approaches have recently been generalized to analyze high-dimensional turbulent flows, for which network computations can become prohibitively expensive. In this work, we propose efficient methods to approximate network quantities, such as the leading eigendecomposition of the adjacency matrix, using randomized methods. Specifically, we use the Nyström method to approximate the leading eigenvalues and eigenvectors, achieving significant computational savings and reduced memory requirements. The effectiveness of the proposed technique is demonstrated on two high-dimensional flow fields: two-dimensional flow past an airfoil and two-dimensional turbulence. We find that quasi-uniform column sampling outperforms uniform column sampling, while both feature the same computational complexity.



There are no comments yet.


page 3

page 7

page 12

page 13


i-flow: High-dimensional Integration and Sampling with Normalizing Flows

In many fields of science, high-dimensional integration is required. Num...

Randomized Robust Subspace Recovery for High Dimensional Data Matrices

This paper explores and analyzes two randomized designs for robust Princ...

On the Nyström and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Data Sets

In this paper we analyze approximate methods for undertaking a principal...

Improved Subsampled Randomized Hadamard Transform for Linear SVM

Subsampled Randomized Hadamard Transform (SRHT), a popular random projec...

Randomized Value Functions via Multiplicative Normalizing Flows

Randomized value functions offer a promising approach towards the challe...

Self Normalizing Flows

Efficient gradient computation of the Jacobian determinant term is a cor...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Fluid dynamics is a rich and challenging field at the intersection of physics and engineering. There is still a tremendous amount that we do not understand about turbulence, yet working fluids are at the heart of nearly every major industry, including health, defense, transportation, and energy. One cannot overestimate the significance and impact that a better understanding of fluid flows would have in our ability to predict and manipulate their behavior in the real world. The challenge of modeling and controlling fluids stems from the fact that they are nonlinear with complex multi-scale interactions over a large range of spatial and temporal scales. Moreover, data from experiments and simulations are generally represented as exceedingly high-dimensional measurements that may obscure underlying patterns. With advances in simulation capabilities and measurement techniques, the volume and quality of such data are rapidly increasing.

Despite the complexity of the governing equations and the overwhelming volume of data, it is often observed that flows evolve on a low-dimensional attractor defined by a few dominant coherent structures [1]. There are a wealth of modal decomposition techniques to mine and characterize these structures from experimental data and numerical solvers [2, 3]. The majority of techniques, such as proper orthogonal decomposition (POD)[1, 4] and dynamic mode decomposition [5, 6, 7] are fundamentally linear, as are many of the standard techniques in control theory [8]. Linear control has been widely applied for flow control [9, 10, 11], for example to stabilize boundary layers. However, many flows are fundamentally nonlinear, limiting the broad use of linear control theory. Low-order nonlinear models may be obtained by Galerkin projection of the Navier Stokes equations onto POD modes, and these models have had great success for uncovering underling mechanisms that drive flows [4]. Regardless, there are issues with Galerkin models, such as instability and mode deformation with changing parameters and boundary conditions [12, 13, 14], limiting the success of POD-Galerkin modeling for turbulence.

Recently, network theoretic approaches have been increasingly leveraged to analyze complex, fluid flow systems [15, 16, 17, 18, 19, 20, 21]. Network science [22] characterizes the structure and dynamics on a graph consisting of nodes and the edges connecting them. It is possible to cast a fluid flow in this context, where each grid cell of vorticity is a node, and the induced velocity from the Biot-Savart interaction between each node establishes the edge connections [15]. In this way, individual packets of vorticity interact and evolve according to physics that is encoded in the graph. Network analysis provides a complementary perspective to classical techniques in fluid dynamics, especially for flow control. There have been many powerful advances in network control theory [22, 23] surrounding multi-agent systems [24, 23] in the past two decades. Multi-agent control is designed to be local, efficient to operate, and scalable to extremely large systems, such as the internet [25, 26] or the electric grid [27]. The multi-agent control and underlying network dynamics may also be strongly nonlinear, and these controllers are built to handle time-delays and communication failures, which are typical limiting factors for robust performance in classical control [8]. In particular, networks are often characterized by a large collection of elements, represented by nodes on a graph, that each execute their own set of local protocols in response to external stimulus. This analogy holds quite well for a number of large networked dynamical systems, including animals flocking [28, 29], multi-robotic cooperative control systems [30], sensor networks [31, 32], biological regulatory networks [33, 34], and the internet [25, 26], to name a few. A long-term goal of characterizing vortical networks is the eventual application of multi-agent control to school turbulence into a beneficial configuration for an engineering advantage. In contrast, past closed-loop flow control efforts have commonly involved applying linear control techniques to a suitable reduced-order model of the fluid [10].

1.1 Motivation

The application of network theory in fluid dynamics faces the challenge of an exceedingly large number of degrees of freedom (nodes) for a fully turbulent flow, leading to an even larger adjacency matrix of edges. Indeed, the adjacency matrix for a vortical flow network scales as the square of the number of fluid grid points, quickly becoming intractable to store and analyze using classical techniques from linear algebra. Recent approaches have been developed to manage this complexity, including building a graph on the

modes of a flow [35], given by dominant fluid coherent structures from POD, rather than the nodes, given by fluid grid elements. However, there is still interest in developing scalable methods to directly analyze the vortical network in the original ambient measurement space, where nodes are grid elements. Randomized methods for linear algebra provide an emerging alternative to efficiently compute an approximate eigendecomposition of large-scale matrices. These methods work with a reduced representation, a so-called sketch, of the input matrix that captures the essential spectral information. This sketch is obtained either through random sampling or by random projections. The sketch

can then be used to efficiently compute a desired low-rank approximation, such as the singular value decomposition. For a comprehensive survey and implementation details see 

[36, 37, 38, 39, 40].

1.2 Contributions

In this work, we use randomized methods to analyze large networks arising in fluid mechanics. Specifically we are interested in using randomized methods to obtain a qualitatively correct view of the dominant graph structures, which can then be used for the downstream tasks of sensor and actuator placement, optimization, and control. This application domain is the subject of the present work, where we explore the use of random sampling and randomized linear algebra to generate qualitatively accurate estimates of dominant flow structures for cases where the adjacency matrix is exceedingly large. We compare both uniform and quasi-uniform sampling, and demonstrate this approach on two high-dimensional vortical flow networks. A high-level principle sketch is provided in Figure 


[trim = 0, 0, 0, 0, clip, width=1.09]Schematic1 Input Flow FieldAdjacency MatrixColumnSampling

Row SamplingSketchEigenvectors& EigenvaluesPath IPath II (Nyström Method)

Figure 1: Identification of dominant eigenvalues and eigenvectors of the adjacency matrix obtained from a fluid flow field using randomized methods. Path randomly samples columns of the adjacency matrix to approximate the leading eigenvalues and eigenvectors; path uses both column and row sampling to form a sketch. Note, it is not required to explicitly construct the full adjacency matrix.

2 Vortical interaction network

Recently, network analysis has been employed to represent and analyze vortical interactions in flows [15, 17, 19, 35, 41]. In Nair and Taira [15], spectral sparsification was used to identify a low-dimensional skeleton underlying the high-dimensional, nonlinear fluid dynamics. Taira et al. [17] later demonstrated scale-free behavior of the vortex interaction network for two-dimensional isotropic turbulence. Network analysis has also been used by Meena et al. [19] for community detection and force prediction in an unsteady wake. More recently, interaction networks based on the energy transfer between modes has been used for modeling and control [35, 41]. These studies provide compelling evidence that the emerging network perspective may complement well-established flow modeling techniques.

A graph (network) consists of three components: a set of vertices (nodes) , a set of edges that connect these vertices, and the associated edge weights . In the present work, the vortical elements in a flow field represent the nodes of the vortical flow network, following the formulation of Nair and Taira [15]. The interaction among vortical elements can be quantified by the induced velocity. Following the Biot-Savart law, the induced velocity from element to in a two-dimensional flow is given by


where is the strength of vortical element with vorticity and grid area , and is the Euclidean distance between the vortical elements. This induced velocity represents the edge weights, and may be used to construct an adjacency matrix


which mathematically represents the interactions in the vortical network. Since a vortex cannot induce motion upon itself, there are no diagonal entries for the adjacency matrix. Here we have an undirected network with a symmetric adjacency matrix to enable the use of the algorithm described later. However, it is possible to define an asymmetric adjacency matrix for a directed network.

Given the vorticity field of a flow, the corresponding vortical interaction network can be formulated as above. This also suggests that the size of the adjacency matrix scales as where is the number of grid points or vortical elements in the flow field. For a high-fidelity simulation or experimental measurements of a flow field, can exceed , making the computation and analysis of the vortical network computationally expensive. Below, we discuss a few network measures that are particularly useful for understanding the interaction-based characteristics of the flow. The current work focuses on computing these measures for large, dense vortical networks in a computationally tractable manner.

2.1 Network measures and computations

The adjacency matrix is the basis for a number of useful network measures, such as eigenvector centrality, Katz centrality, and PageRank [22, 42]. The centrality of a network describes the most important or central elements based on measures that quantify the connection strength of the nodes. One of the most basic centrality measures is the node strength, given by , which measures the overall interaction strength of a node. The eigenvector centrality is another fundamental network property, which also considers the strength of the neighboring elements of a node and is used to identify the most influential nodes in the network [43]. It is computed as the eigenvector of corresponding to the largest eigenvalue :


Entries with large absolute value in the vector

correspond to influential nodes. The Katz centrality and PageRank are related and also based on computing the dominant eigenvector of the adjacency matrix.

Network analysis is also useful for graph partitioning and clustering [22], where the nodes in the network are grouped into clusters based on their close connectivity. Spectral partitioning or clustering is a common approach based on the dominant eigenvector of the adjacency matrix. Nodes can be divided and sub-divided into groups based on the sign of the corresponding elements in the dominant eigenvectors of . Spectral partitioning can also be used in community detection algorithms to identify groups of closely interacting nodes, called communities [44]. These algorithms perform a hierarchical grouping of nodes into communities based on maximizing the overall modular measure of the network. This hierarchical grouping can be performed using the dominant eigenvectors of [45]. In vortical flows, these network-based measures are useful for identifying the most influential vortical elements and clustering them into coherent structures or communities.

Due to the importance of the spectral information of the adjacency matrix, considerable effort has gone into developing accurate and efficient algorithms to compute dominant eigenvalues for large adjacency matrices. Provided that the largest eigenvalue is real and distinct, the dominant eigenvector can be efficiently computed using the power method [46]. A symmetric adjacency matrix with nonnegative entries will have purely real eigenvalues. For instance, the power method is commonly used to compute the dominant eigenvector of an adjacency matrix [47], and this method is particularly efficient for large sparse adjacency matrices. The computational costs of the power method for computing the dominant eigenvector scales as .

3 Random sampling-based matrix approximation

As stated above, we are interested in computing the eigendecomposition of a real symmetric matrix that takes the form


where the columns of the orthogonal matrix

are eigenvectors and the entries of the diagonal matrix are the corresponding eigenvalues . More concretely, we are interested in computing only the dominant eigenvectors and eigenvalues (where ) that yield the rank- approximation


where contains only the columns of that correspond to the largest eigenvalues. Here, we use tools from the field of randomized numerical linear algebra [48, 49, 36, 50, 51, 52, 40, 53, 38, 39, 54] to compute such a low-rank approximation efficiently. Indeed, randomized methods are widely used for computing low-rank approximations [55, 56, 57, 58, 59, 60, 61].

Scalability is achieved by forming a sketched representation of the input (adjacency) matrix which extracts the inherent spectral information. A sketch can be constructed by post-multiplying the adjacency matrix with a an arbitrary ‘sketching’ matrix

. The random matrix

represents either a random projection or a random sampling process.

  • Projection-based methods construct a ‘sketch’ by forming a set of randomly weighted linear combinations of the columns of the input (adjacency) matrix [62, 39, 48, 49]. This approach can substantially reduce the computational demands when computing a low-rank approximation. However, projection-based methods generally require at least a single pass over the entire data matrix.

  • Sampling-based methods aim to approximate the low-rank structure of the input (adjacency) matrix from a small random subsample of actual columns or rows of the matrix [63, 64, 65]. The sampling process is described in more detail below. In practice, sampling-based methods may bypass the construction of the full adjacency matrix, while sacrificing some accuracy for improved scalability.

In what follows, we generate qualitatively accurate estimates of dominant flow structures for cases where the adjacency matrix is exceedingly large. In this problem context, we require a qualitatively correct view of the dominant graph structures, which can be used for the downstream tasks of sensor and actuator placement, optimization, and control. Therefore, we investigate the use of column sampling to compute a sketched singular value decomposition and the Nyström Method to approximate the leading eigenvalues and eigenvectors. Both techniques have been successfully used to approximate large-scale matrices in other applications [66]. It is important to note that sampling-based methods provide tunable error bounds that are based on the number of sampled columns/rows, making them viable as an alternative to well-established deterministic algorithms.

3.1 Sketched singular value decomposition

Column sampling for low-rank matrix approximation dates back to the pioneering work of Frieze et al. [62]. Instead of computing the full eigendecomposition of , it is possible to form a rank- approximation by first sampling columns from the input matrix


where the matrix consists of a subset of columns of and describes the corresponding random sampling process. In practice, it is only necessary to form a list comprised of column indices


There are several sampling strategies to choose good columns, and the approximation error depends on the both the sampling process and the number of columns sampled.

In our problem setup, each column of corresponds to the network interaction of a single grid element in the original fluid flow field with every other grid element. Here, we compare uniform random sampling against (quasi-random) Halton sampling [67, 68]. Halton sampling provides better coverage of the domain, as illustrated in Figure 2, and our empirical results show that this translates into faster convergence of the approximation error with respect to the number of columns sampled.

If the columns of are carefully chosen, then provides a basis for the column space of the adjacency matrix . Note that here we assume that is a symmetric matrix. Thus, the dominant left singular vectors of , provide an approximation for the dominant eigenvectors of , i.e., we have . The corresponding eigenvalues are approximated by scaling the singular values of :


The cost of constructing the sketch matrix using uniform or Halton sampling is . Then, the cost of computing the SVD on is . For parallel computations, it is possible to compute and by taking the SVD of , which requires to be generated and for the corresponding SVD.

(a) Uniform random sampling.
(b) Halton random sampling.
Figure 2: Examples of two different random sampling approaches based on uniform sampling and Halton random sampling. Halton sampling provides a better coverage of the domain than uniform sampling does.

3.2 The Nyström method

The Nyström method provides an efficient approach for low-rank matrix approximations in large-scale learning applications. It was initially introduced as a quadrature method for numerical integration to approximate eigenfunction solutions. Recent work has streamlined these computations and provided theoretical foundations for the use of various sampling schemes 

[69, 70, 71, 72].

In addition to column sampling, the Nyström method further constructs a square matrix by selecting the rows and columns of :


When is symmetric and positive-semidefinite (SPSD), is also SPSD. The small matrix can be used to efficiently compute the dominant eigenvectors and eigenvalues of . Following [73], we first compute the eigendecomposition:


Then, we reconstruct the dominant eigenvalues as


and the corresponding eigenvectors


The Nyström method has a computational complexity of .

4 Results

We now demonstrate the use of sampling-based randomized decomposition techniques to characterize the vortical interaction networks for two example flows: the laminar wake flow past a NACA 0012 airfoil with a Gurney flap attached to the trailing edge [74], and a two-dimensional homogeneous decaying isotropic turbulence [17].

[width = 0.6]Snapshot_airfoil_NEW-eps-converted-to.pdf (a)[width = 0.4]Snapshot_turb_NEW-eps-converted-to.pdf (b)

Figure 3: Example flow fields. (a) Vorticity field of two-dimensional DNS of the flow over a NACA 0012 airfoil with Gurney flap at an angle of attack of and flap height of chord length at ; the full illustration can be found in [74]. A subdomain of the original vorticity field is used in this example. (b) Vorticity field of two-dimensional decaying homogeneous isotropic turbulence, initialized by an integral length-scale based Reynolds number of .

4.1 Numerical simulations

The first example is the flow past a NACA 0012 airfoil with a chord-based Reynolds number of , at an angle of attack of , and with a Gurney flap of chord length (). This setup generates an unsteady 2P wake [75, 74], with periodic shedding of two pairs of positive and negative vortices. This flow is solved using direct numerical simulations (DNS) via the immersed boundary projection method [76, 77], following the computational setup of Gopalakrishnan Meena et al. [74]. The computational domain consists of five nested levels of multidomains with the finest level of size and the largest being in size. All the domains have a grid resolution of . The time step is limited to a maximum Courant-Friedrichs-Lewy (CFL) number of . Figure 3 (a) shows the vorticity field in a sub-domian with a grid resolution of and size , nondimensionalized by the chord length of the airfoil.

The second example is two-dimensional decaying homogeneous isotropic turbulence, which is a canonical high-dimensional, mutiscale turbulent flow that exhibits complex nonlinear interaction of vortices over a wide range of length scales [78]. The vorticity field is obtained from a two-dimensional incompressible DNS, solving the vorticity transport equation without forcing [17]. The simulation, based on the Fourier spectral method and the fourth-order Runge-Kutta time integration scheme, is performed on a square biperiodic computational domain with grid resolution of . The vorticity field is initialized with vortices with random strengths, core sizes, and centers such that the kinetic energy spectra satisfies with [79]. For this study, the flow field is initialized by an integral length-scale based Reynolds number of , shown in Figure 3 (b).

The vorticity snapshots of both flows are used to construct the adjacency matrices of the corresponding vortical networks using Eq. 2. The size of the adjacency matrices are of the order and for the airfoil and turbulent flow, respectively, emphasizing the need for dimensionality reduction and computationally tractable analysis tools.

4.2 Computational performance

A naive application of network theory in fluid dynamics can be challenging, because the adjacency matrix is often too large to be stored or analyzed using classical techniques from linear algebra. There are two primary challenges associated with large vortical networks:

  • The size of the adjacency matrix, which describes the network edges, scales as the square of the number of fluid grid points;

  • Unlike social graphs, the vortical adjacency matrix is dense.

(a) Results for an adjacency matrix with .
(b) Results for an adjacency matrix with .
Figure 4: Computational performance for the flow over an airfoil using two different spatial resolutions. The results are averaged over runs using different random seeds.

Here, we use randomized methods to accelerate computations based on the vortical adjacency matrix. We sample a small number of columns from the adjacency matrix to compute a sketched SVD or Nyström approximation. Thus, the approximation quality can be controlled via the number of samples used to compute the low-rank approximation. In the context of networks arising in fluid mechanics, it turns out that sampling a small number of columns is sufficient for computing an approximation that is qualitatively useful. This leads to tremendous memory and computational savings, since the costs are proportional to the number of columns that we sample. We run all of our experiments using Amazon AWS, working on a memory optimized instance ‘x1.16xlarge’ that is powered by two Intel Xeon E7 8880 v3 (Haswell) processors with virtual cores and GB fast memory. Our algorithms are implemented in Python powered by MKL accelerated linear algebra routines, and C extensions for constructing the elements of the adjacency matrix.

We show in Figure 4 the approximation error as a function of the percentage of columns sampled and the corresponding computational time required.

Here, we consider two cropped spatial grids of different dimensions from which we construct the adjacency matrix. The error is reported as the acute angle between the true leading eigenvector and the approximate eigenvector , which provides a useful summary of how closely the two high-dimensional eigenvectors align. Since the algorithms are fundamentally based on random sampling, we average the results over initializations and report the distribution of errors.

Halton sampling results in considerably lower error than uniform sampling at the same percentage of columns sampled, which is consistent with the observation that it is sampling the flow domain more efficiently. The results based on Halton sampling also have considerably lower variance, indicating improved numerical robustness. Figures 

3(a) and 3(b) illustrate how substantially the computational time increases when the spatial resolution of the flow field is increased only slightly. In the case, only about of the columns need to be sampled for error compared between the approximate and true eigenvectors. In other words, we use only about of the information to compute the low-rank approximation, and this reduces the memory requirements for the example in Figure 3(b) from about GB to roughly GB. At the same time, the computational time is reduced since it is not necessary to construct the full adjacency matrix. The computational time is reduced further by using the Nyström method, while only sacrificing a small amount of accuracy. For comparison, Table 1 summarizes the computational demands and performance for the deterministic power iteration method. Figure 5 shows similar results for the turbulent flow. Again, we see that Halton sampling is favorable and that only a small fraction of columns is required to yield a good approximation of the leading eigenvector.

(a) For an adjacency matrix with .
(b) For an adjacency matrix with .
Figure 5: Computational performance for the isotropic turbulent flow using two different spatial resolutions. The results are averaged over runs using different random seeds.
Dimensions of
adjacency matrix
Time to construct
full adjacency matrix
Time to compute
deterministic eigenvector
airfoil (sec) 74.29 (GB) (sec)
airfoil (sec) 214.67 (GB) (sec)
turbulent (sec) 32.00 (GB) (sec)
turbulent (sec) 101.92 (GB) (sec)
Table 1: Summary of computational results for constructing the full adjacency matrix and computing the dominant eigenvector using the deterministic power method. Here the computational bottleneck is the memory required to construct the adjacency matrix. Note that the power method does not generally require that the full adjacency matrix is constructed explicitly; however, the power method is not efficient because the adjacency matrices we consider are dense.

4.3 Approximate eigenvectors and spectral clustering



(a) Snapshots.

(b) Dominant eigenvectors.

(c) Approximate eigenvectors.
Figure 6: Qualitative comparison of the deterministic (b) and approximate (c) dominant eigenvector of the adjacency matrix generated from the flow field in (a). The top row shows the results for the airfoil wake and the bottom shows the results for the two-dimensional isotropic turbulence. The approximation in (c) uses the Nyström method with Halton sampling with 10% of the columns of the adjacency matrix.

Figure 6 shows the leading eigenvectors of the adjacency matrix for the two flow examples, computed using deterministic and randomized methods. By visual inspection, the Nyström method with quasi-random Halton sampling provides an excellent approximation of the leading eigenvector using only of the columns of the adjacency matrix. The resulting eigenvector centrality of the matrix highlights strong vortex cores in both flows, as shown in Figure 6 (b). Vortices with lower strength have less centrality, emphasizing their decreased role in influencing the flow field. The 2P wake structure is clearly visible in the flow past an airfoil, and some shear layer structures are present in the 2D turbulent flow. The irrotational regions in both flows have the lowest centrality measure. These observations complement the traditional fluid dynamics literature regarding highly influential structures in a vortical flow. The approximate eigenvector centrality measures capture both the highly influential vortex cores as well as the smaller, less influential vortices, as seen in both flows. Moreover, the network-based measures take into account the spatial arrangements of the vortices to assess their influence.

In the computational comparison above, we only considered relatively low resolution flow fields because it was computationally prohibitive to compute the ground truth eigenvectors using deterministic methods. However, with randomized techniques it is possible to compute the leading eigenvector of the adjacency matrix for a resolution grid and a resolution grid, respectively, as shown in Figure 7. The adjacency matrix for the grid would require about TB of storage, if explicitly constructed. Instead, we sample only about of the columns to compute the Nyström approximation. While we cannot quantify the approximation error, we can see by visual inspection that the eigenvector captures dominant flow structures.

(a) Eigenvector of dimension .

(b) Eigenvector of dimension .
Figure 7: Approximated leading eigenvectors of the adjacency matrices for higher-resolution isotropic flow fields. Here, we are using the Nyström method with Halton sampling. By visual inspection we can see that sampling only about of the columns of the full adjacency matrix is sufficient for computing an approximated eigenvector that captures the dominant graph structure.

Finally, we investigate spectral clustering, which is one of the common uses of the leading eigenvectors of the adjacency matrix, providing a principled approach to cluster nodes into common communities. Figure 

8 shows the results of spectral clustering based on the three leading eigenvectors obtained from the adjacency matrix. In both flow fields, the approximate clusters obtained using randomized methods closely resemble the clusters computed via deterministic eigendecomposition. The right panels in Figure 8 further shows the data plotted in these first three eigenvector coordinates to visualize the clusters and subspaces. Typically, these leading eigenvectors are quite useful for determining relevant community structures within a network. These communities have an important physical interpretation in vortical flow networks, hierarchically identifying and organizing vortex cores within the flow. Because these communities are often determined from the few dominant eigenvectors, the randomized approach here can provide considerable computational advantages. Further, since high-vorticity nodes have a large influence on the network interactions, it may be possible to improve convergence by preferential sampling using these community structures.

(a) Exact clusters.
(b) Approximation.
(c) Scores (1 vs. 2).
(d) Scores (1 vs. 3).
Figure 8: Spectral clustering using the top three eigenvectors of the adjacency matrix for the airfoil wake flow. In (a) we show the clusters that we found using the exact eigenvectors. It can be seen that the approximate eigenvectors allow us to reveal the same seven clusters, as shown in (b). In (c) and (d) we show the first three approximated eigenvector coordinates to visualize the clusters and subspaces.

5 Discussion

In this work, we have demonstrated the effective use of scalable algorithms from randomized linear algebra to accelerate network computations for large-scale fluid flows, as demonstrated on the wake behind an airfoil and two-dimensional isotropic turbulent flow. In particular, we have used sampling-based methods to approximate the leading eigenvectors of the adjacency matrix of the vortical interaction network for cases where the data is so large that even single-pass algorithms are prohibitively expensive. Combining importance sampling, based on the probability distribution of detected communities, and the Nyström method, the leading eigenvector can be accurately computed at a fraction of the computational cost and memory requirement, bypassing the need to construct or query the full

matrix. We also showed that quasi-random Halton sampling techniques outperform uniform sampling in both examples as they provide more comprehensive sampling coverage of the spatial domain.

There are a number of interesting future directions based on this work. Further study is required to apply these randomized network based analysis techniques to more complex three-dimensional turbulent flows. In addition, there are similar scaling issues in the related fields of almost invariant sets and set-oriented methods [80, 81, 82, 83, 84]

, where the state-transition operator may be viewed as a graph on the high-dimensional state-space. In the transfer operator case, machine learning has been used to identify a data-driven discretization of the state-space into clusters, which serve as nodes that scale with the intrinsic rank of the attractor, rather than the high-dimensional measurement dimension 

[85]; similar clustering has been used to identify subspaces for POD–Galerkin models [86]. These methods have been used to determine eigenvectors of the Perron–Frobenius operator, the dual of the Koopman operator, establishing a strong connection between sensitivity and coherence in fluid flows. It will be interesting to mathematically connect those approaches with the randomized methods considered in the present study.


We acknowledge support from the Army Research Office (W911NF-17-1-0118) and the Air Force Office of Scientific Research (FA9550-16-1-0650). ZB also acknowledges support by the U.S. Department of Energy, Office of Science, under Award Number DE-AC02-05CH11231. NBE would also like to acknowledge Amazon Web Services for supporting the project with EC2 credits. We would like to thank Bing Brunton, Nathan Kutz, Jean-Christophe Loiseau, Krithika Manohar, Aditya Nair, and Bernd Noack for valuable discussions.


  • [1] P. J. Holmes, J. L. Lumley, G. Berkooz, and C. W. Rowley, Turbulence, coherent structures, dynamical systems and symmetry. Cambridge Monographs in Mechanics, Cambridge, England: Cambridge University Press, 2nd ed., 2012.
  • [2] K. Taira, S. L. Brunton, S. Dawson, C. W. Rowley, T. Colonius, B. J. McKeon, O. T. Schmidt, S. Gordeyev, V. Theofilis, and L. S. Ukeiley, “Modal analysis of fluid flows: An overview,” AIAA Journal, vol. 55, no. 12, pp. 4013–4041, 2017.
  • [3] K. Taira, M. S. Hemati, S. L. Brunton, Y. Sun, K. Duraisamy, S. Bagheri, S. Dawson, and C.-A. Yeh, “Modal analysis of fluid flows: Applications and outlook,” accepted, AIAA Journal, 2019.
  • [4] B. R. Noack, K. Afanasiev, M. Morzynski, G. Tadmor, and F. Thiele, “A hierarchy of low-dimensional models for the transient and post-transient cylinder wake,” Journal of Fluid Mechanics, vol. 497, pp. 335–363, 2003.
  • [5] P. J. Schmid, “Dynamic mode decomposition of numerical and experimental data,” Journal of Fluid Mechanics, vol. 656, pp. 5–28, Aug. 2010.
  • [6] C. W. Rowley, I. Mezić, S. Bagheri, P. Schlatter, and D. Henningson, “Spectral analysis of nonlinear flows,” Journal of Fluid Mechanics, vol. 645, pp. 115–127, 2009.
  • [7] J. N. Kutz, S. L. Brunton, B. W. Brunton, and J. L. Proctor, Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems. SIAM, 2016.
  • [8] G. E. Dullerud and F. Paganini, A course in robust control theory: A convex approach. Texts in Applied Mathematics, Berlin, Heidelberg: Springer, 2000.
  • [9] S. Bagheri, J. Hoepffner, P. J. Schmid, and D. S. Henningson, “Input-output analysis and control design applied to a linear model of spatially developing flows,” Applied Mechanics Reviews, vol. 62, no. 2, pp. 020803–1–020803–27, 2009.
  • [10] S. L. Brunton and B. R. Noack, “Closed-loop turbulence control: Progress and challenges,” Applied Mechanics Reviews, vol. 67, pp. 050801–1–050801–48, 2015.
  • [11] D. Sipp and P. J. Schmid, “Linear closed-loop control of fluid instabilities and noise-induced perturbations: A review of approaches and tools,” Applied Mechanics Reviews, vol. 68, no. 2, p. 020801, 2016.
  • [12] K. Carlberg, M. Barone, and H. Antil, “Galerkin v. least-squares petrov–galerkin projection in nonlinear model reduction,” Journal of Computational Physics, vol. 330, pp. 693–734, 2017.
  • [13] J.-C. Loiseau and S. L. Brunton, “Constrained sparse Galerkin regression,” Journal of Fluid Mechanics, vol. 838, pp. 42–67, 2018.
  • [14] J.-C. Loiseau, B. R. Noack, and S. L. Brunton, “Sparse reduced-order modeling: sensor-based dynamics to full-state estimation,” Journal of Fluid Mechanics, vol. 844, pp. 459–490, 2018.
  • [15] A. G. Nair and K. Taira, “Network-theoretic approach to sparsified discrete vortex dynamics,” Journal of Fluid Mechanics, vol. 768, pp. 549–571, 2015.
  • [16] M. Murugesan and R. Sujith, “Combustion noise is scale-free: transition from scale-free to order at the onset of thermoacoustic instability,” Journal of Fluid Mechanics, vol. 772, pp. 225–245, 2015.
  • [17] K. Taira, A. G. Nair, and S. L. Brunton, “Network structure of two-dimensional decaying isotropic turbulence,” Journal of Fluid Mechanics, vol. 795, 2016.
  • [18] K. L. Schlueter-Kuck and J. O. Dabiri, “Coherent structure colouring: identification of coherent structures from sparse data using graph theory,” Journal of Fluid Mechanics, vol. 811, pp. 468–486, 2017.
  • [19] M. Gopalakrishnan Meena, A. G. Nair, and K. Taira, “Network community-based model reduction for vortical flows,” Physical Review E, vol. 97, no. 6, p. 063103, 2018.
  • [20] S. Murayama, H. Kinugawa, I. T. Tokuda, and H. Gotoda, “Characterization and detection of thermoacoustic combustion oscillations based on statistical complexity and complex-network theory,” Physical Review E, vol. 97, no. 2, p. 022223, 2018.
  • [21] G. Iacobello, S. Scarsoglio, J. Kuerten, and L. Ridolfi, “Lagrangian network analysis of turbulent mixing,” Journal of Fluid Mechanics, vol. 865, pp. 546–562, 2019.
  • [22] M. E. J. Newman, Networks: an introduction. Oxford Univ. Press, 2010.
  • [23] M. Mesbahi and M. Egerstedt, Graph theoretic methods in multiagent networks. Princeton University Press, 2010.
  • [24] A. Rahmani, M. Ji, M. Mesbahi, and M. Egerstedt, “Controllability of multi-agent systems from a graph-theoretic perspective,” SIAM J. Control and Optimization, vol. 48, no. 1, pp. 162–186, 2009.
  • [25] S. H. Low, F. Paganini, and J. C. Doyle, “Internet congestion control,” Control Systems, IEEE, vol. 22, no. 1, pp. 28–43, 2002.
  • [26] J. C. Doyle, D. L. Alderson, L. Li, S. Low, M. Roughan, S. Shalunov, R. Tanaka, and W. Willinger, “The “robust yet fragile” nature of the internet,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 41, pp. 14497–14502, 2005.
  • [27] Y. Susuki, I. Mezić, and T. Hikihara, “Coherent swing instability of power grids,” Journal of nonlinear science, vol. 21, no. 3, pp. 403–439, 2011.
  • [28] N. E. Leonard and E. Fiorelli, “Virtual leaders, artificial potentials and coordinated control of groups,” in Decision and Control, 2001. Proceedings of the 40th IEEE Conference on, vol. 3, pp. 2968–2973, IEEE, 2001.
  • [29] R. Olfati-Saber, “Flocking for multi-agent dynamic systems: Algorithms and theory,” Automatic Control, IEEE Transactions on, vol. 51, no. 3, pp. 401–420, 2006.
  • [30] T. Balch and R. C. Arkin, “Behavior-based formation control for multirobot teams,” Robotics and Automation, IEEE Transactions on, vol. 14, no. 6, pp. 926–939, 1998.
  • [31] J. Cortes, S. Martinez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,” in IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp. 1327–1332, IEEE, 2002.
  • [32] N. E. Leonard, D. A. Paley, F. Lekien, R. Sepulchre, D. M. Fratantoni, and R. E. Davis, “Collective motion, sensor networks, and ocean sampling,” Proceedings of the IEEE, vol. 95, no. 1, pp. 48–74, 2007.
  • [33] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, “Network motifs: simple building blocks of complex networks,” Science, vol. 298, no. 5594, pp. 824–827, 2002.
  • [34] N. M. Luscombe, M. M. Babu, H. Yu, M. Snyder, S. A. Teichmann, and M. Gerstein, “Genomic analysis of regulatory network dynamics reveals large topological changes,” Nature, vol. 431, no. 7006, pp. 308–312, 2004.
  • [35] A. G. Nair, K. Taira, and S. L. Brunton, “Networked oscillator based modeling and control of unsteady wake flows,” Physical Review E, vol. 97, no. 063107, 2018.
  • [36] N. Halko, P.-G. Martinsson, and J. A. Tropp, “Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions,” SIAM Review, vol. 53, no. 2, pp. 217–288, 2011.
  • [37] P. Drineas and M. W. Mahoney, “Randnla: Randomized numerical linear algebra,” Commun. ACM, vol. 59, pp. 80–90, May 2016.
  • [38] R. Kannan and S. Vempala, “Randomized algorithms in numerical linear algebra,” Acta Numerica, vol. 26, pp. 95–135, 2017.
  • [39] D. P. Woodruff et al., “Sketching as a tool for numerical linear algebra,” Foundations and Trends® in Theoretical Computer Science, vol. 10, no. 1–2, pp. 1–157, 2014.
  • [40] N. B. Erichson, S. Voronin, S. L. Brunton, and J. N. Kutz, “Randomized matrix decompositions using R,” Journal of Statistical Software, vol. 89, no. 11, pp. 1–48, 2019.
  • [41] A. G. Nair, C.-A. Yeh, E. Kaiser, B. R. Noack, S. L. Brunton, and K. Taira, “Cluster-based feedback control of turbulent post-stall separated flows,” Journal of Fluid Mechanics, accepted, 2019.
  • [42] L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: Bringing order to the web.,” tech. rep., Stanford InfoLab, 1999.
  • [43] E. D. Kolaczyk and G. Csárdi, Statistical analysis of network data with R, vol. 65. Springer, 2014.
  • [44] A. Clauset, M. E. Newman, and C. Moore, “Finding community structure in very large networks,” Physical review E, vol. 70, no. 6, p. 066111, 2004.
  • [45] E. A. Leicht and M. E. Newman, “Community structure in directed networks,” Physical review letters, vol. 100, no. 11, p. 118703, 2008.
  • [46] L. N. Trefethen and D. Bau III, Numerical Linear Algebra, vol. 50. SIAM, 1997.
  • [47] K. Bryan and T. Leise, “The $25,000,000,000 eigenvector: The linear algebra behind Google,” SIAM Review, vol. 48, no. 3, pp. 569–581, 2006.
  • [48] M. W. Mahoney, “Randomized algorithms for matrices and data,” Foundations and Trends in Machine Learning, vol. 3, pp. 123–224, 2011.
  • [49] P. Drineas and M. W. Mahoney, “Randnla: Randomized numerical linear algebra,” Commun. ACM, vol. 59, pp. 80–90, May 2016.
  • [50] E. Liberty, F. Woolfe, P.-G. Martinsson, V. Rokhlin, and M. Tygert, “Randomized algorithms for the low-rank approximation of matrices,” Proceedings of the National Academy of Sciences, vol. 104, no. 51, pp. 20167–20172, 2007.
  • [51] T. Sarlos, “Improved approximation algorithms for large matrices via random projections,” in Foundations of Computer Science. 47th Annual IEEE Symposium on, pp. 143–152, 2006.
  • [52] P.-G. Martinsson, V. Rokhlin, and M. Tygert, “A randomized algorithm for the decomposition of matrices,” Applied and Computational Harmonic Analysis, vol. 30, pp. 47–68, 2011.
  • [53] N. B. Erichson, S. L. Brunton, and J. N. Kutz, “Compressed singular value decomposition for image and video processing,” in

    Proceedings of the IEEE International Conference on Computer Vision

    , pp. 1880–1888, 2017.
  • [54] F. Woolfe, E. Liberty, V. Rokhlin, and M. Tygert, “A fast randomized algorithm for the approximation of matrices,” Journal of Applied and Computational Harmonic Analysis, vol. 25, pp. 335–366, 2008.
  • [55] E. Liberty, “Simple and deterministic matrix sketching,” in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 581–588, ACM, 2013.
  • [56] N. B. Erichson and C. Donovan, “Randomized low-rank dynamic mode decomposition for motion detection,” Computer Vision and Image Understanding, vol. 146, pp. 40–50, 2016.
  • [57] N. B. Erichson, K. Manohar, S. L. Brunton, and J. N. Kutz, “Randomized cp tensor decomposition,” arXiv preprint arXiv:1703.09074, 2017.
  • [58] N. B. Erichson, A. Mendible, S. Wihlborn, and J. N. Kutz, “Randomized nonnegative matrix factorization,” Pattern Recognition Letters, vol. 104, pp. 1–7, 2018.
  • [59] N. B. Erichson, S. L. Brunton, and J. N. Kutz, “Compressed dynamic mode decomposition for background modeling,” Journal of Real-Time Image Processing, pp. 1–14, 2016.
  • [60] G. Shabat, Y. Shmueli, Y. Aizenbud, and A. Averbuch, “Randomized LU decomposition,” Applied and Computational Harmonic Analysis, vol. 44, no. 2, pp. 246–272, 2018.
  • [61]

    V. Rokhlin, A. Szlam, and M. Tygert, “A randomized algorithm for principal component analysis,”

    SIAM Journal on Matrix Analysis and Applications, vol. 31, pp. 1100–1124, 2009.
  • [62] A. Frieze, R. Kannan, and S. Vempala, “Fast Monte-Carlo algorithms for finding low-rank approximations,” Journal of the ACM, vol. 51, no. 6, pp. 1025–1041, 2004.
  • [63] P. Ma, M. W. Mahoney, and B. Yu, “A statistical perspective on algorithmic leveraging,” Journal of Machine Learning Research, vol. 16, no. 1, pp. 861–911, 2015.
  • [64] P. Drineas, M. W. Mahoney, and S. Muthukrishnan, “Sampling algorithms for l 2 regression and applications,” in Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pp. 1127–1136, Society for Industrial and Applied Mathematics, 2006.
  • [65] P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff, “Fast approximation of matrix coherence and statistical leverage,” Journal of Machine Learning Research, vol. 13, no. Dec, pp. 3475–3506, 2012.
  • [66] A. Talwalkar, S. Kumar, M. Mohri, and H. Rowley, “Large-scale SVD and manifold learning,” Journal of Machine Learning Research, vol. 14, no. 1, pp. 3129–3152, 2013.
  • [67] H. Niederreiter, Random number generation and quasi-Monte Carlo methods. SIAM, 1992.
  • [68] J. H. Halton, “Algorithm 247: Radical-inverse quasi-random point sequence,” Commun. ACM, vol. 7, pp. 701–702, Dec. 1964.
  • [69] C. K. Williams and M. Seeger, “Using the nyström method to speed up kernel machines,” in Advances in neural information processing systems, pp. 682–688, 2001.
  • [70] P. Drineas and M. W. Mahoney, “On the Nyström method for approximating a Gram matrix for improved kernel-based learning,” journal of machine learning research, vol. 6, no. Dec, pp. 2153–2175, 2005.
  • [71] K. Zhang, I. W. Tsang, and J. T. Kwok, “Improved Nyström low-rank approximation and error analysis,” in Proceedings of the 25th international conference on Machine learning, pp. 1232–1239, ACM, 2008.
  • [72] A. Gittens and M. W. Mahoney, “Revisiting the nyström method for improved large-scale machine learning,” The Journal of Machine Learning Research, vol. 17, no. 1, pp. 3977–4041, 2016.
  • [73] S. Kumar, M. Mohri, and A. Talwalkar, “Sampling methods for the Nyström method,” Journal of Machine Learning Research, vol. 13, no. Apr, pp. 981–1006, 2012.
  • [74] M. Gopalakrishnan Meena, K. Taira, and K. Asai, “Airfoil-wake modification with gurney flap at low reynolds number,” AIAA Journal, vol. 56, no. 4, pp. 1348–1359, 2017.
  • [75] C. H. Williamson and A. Roshko, “Vortex formation in the wake of an oscillating cylinder,” Journal of Fluids and Structures, vol. 2, no. 4, pp. 355–381, 1988.
  • [76] K. Taira and T. Colonius, “The immersed boundary method: a projection approach.,” Journal of Computational Physics, vol. 225, no. 2, pp. 2118–2137, 2007.
  • [77] T. Colonius and K. Taira, “A fast immersed boundary method using a nullspace approach and multi-domain far-field boundary conditions,” Computer Methods in Applied Mechanics and Engineering, vol. 197, pp. 2131–2146, 2008.
  • [78] G. Boffetta and R. E. Ecke, “Two-dimensional turbulence,” Annual Review of Fluid Mechanics, vol. 44, pp. 427–451, 2012.
  • [79] S. Kida, “Numerical simulation of two-dimensional turbulence with high-symmetry,” Journal of the Physical Society of Japan, vol. 54, no. 8, pp. 2840–2854, 1985.
  • [80] M. Dellnitz, G. Froyland, and O. Junge, “The algorithms behind gaio—set oriented numerical methods for dynamical systems,” in Ergodic theory, analysis, and efficient simulation of dynamical systems, pp. 145–174, Springer, 2001.
  • [81] M. Dellnitz and O. Junge, “Set oriented numerical methods for dynamical systems,” Handbook of dynamical systems, vol. 2, pp. 221–264, 2002.
  • [82] G. Froyland and K. Padberg, “Almost-invariant sets and invariant manifolds – connecting probabilistic and geometric descriptions of coherent structures in flows,” Physica D, vol. 238, pp. 1507–1523, 2009.
  • [83] G. Froyland, N. Santitissadeekorn, and A. Monahan, “Transport in time-dependent dynamical systems: Finite-time coherent sets,” Chaos, vol. 20, no. 4, pp. 043116–1–043116–16, 2010.
  • [84] P. Tallapragada and S. D. Ross, “A set oriented definition of finite-time Lyapunov exponents and coherent sets,” Communications in Nonlinear Science and Numerical Simulation, vol. 18, pp. 1106–1126, May 2013.
  • [85] E. Kaiser, B. R. Noack, L. Cordier, A. Spohn, M. Segond, M. Abel, G. Daviller, J. Osth, S. Krajnovic, and R. K. Niven, “Cluster-based reduced-order modelling of a mixing layer,” J. Fluid Mech., vol. 754, pp. 365–414, 2014.
  • [86] D. Amsallem, M. J. Zahr, and C. Farhat, “Nonlinear model order reduction based on local reduced-order bases,” International Journal for Numerical Methods in Engineering, vol. 92, no. 10, pp. 891–916, 2012.