Simulating Continuum Mechanics with Multi-Scale Graph Neural Networks

06/09/2021 ∙ by Mario Lino, et al. ∙ Imperial College London

Continuum mechanics simulators, numerically solving one or more partial differential equations, are essential tools in many areas of science and engineering, but their performance often limits application in practice. Recent machine learning approaches have demonstrated their ability to accelerate spatio-temporal predictions, although with only moderate accuracy in comparison. Here we introduce MultiScaleGNN, a novel multi-scale graph neural network model for learning to infer unsteady continuum mechanics. MultiScaleGNN represents the physical domain as an unstructured set of nodes, and it constructs one or more graphs, each of them encoding different scales of spatial resolution. Successive learnt message passing between these graphs improves the ability of GNNs to capture and forecast the system state in problems encompassing a range of length scales. Using graph representations, MultiScaleGNN can impose periodic boundary conditions as an inductive bias on the edges in the graphs, and achieve independence from the nodes' positions. We demonstrate this method on advection problems and incompressible fluid dynamics. Our results show that the proposed model can generalise from uniform advection fields to high-gradient fields on complex domains at test time and infer long-term Navier-Stokes solutions within a range of Reynolds numbers. Simulations obtained with MultiScaleGNN are between two and four orders of magnitude faster than the simulations on which it was trained.






1 Introduction

Forecasting the spatio-temporal mechanics of continuous systems is a common requirement in many areas of science and engineering. Continuum mechanics models often describe the underlying physical laws by one or more partial differential equations (PDEs), whose complexity may preclude their analytic solution. Numerical methods are well-established for approximately solving PDEs, discretising the governing equations to form an algebraic system which can be solved Karniadakis and Sherwin (2013); Cantwell et al. (2015). Such solvers can achieve high levels of accuracy, but they are computationally expensive. Deep learning techniques have recently been shown to accelerate physical simulations without significantly penalising accuracy. This has motivated an increasing interest in the use of deep neural networks (DNNs) to produce realistic real-time animations for computer graphics Tompson et al. (2017); Kim et al. (2019); Sanchez-Gonzalez et al. (2020) and to speed up simulations in engineering design and control Guo et al. (2016); Yilmaz and German (2017); Zhang et al. (2018); Thuerey et al. (2018).

Most of the recent work on deep learning to infer continuous physics has focused on convolutional neural networks (CNNs) Wiewel et al. (2019); Xiao et al. (2018); Lino et al. (2020). In part, the success of CNNs for these problems lies in their translation invariance and locality Goodfellow et al. (2016), which represent strong and desirable inductive biases for continuum-mechanics models. However, CNNs constrain input and output fields to be defined on rectangular domains represented by regular grids, which is not suitable for more complex domains. As for traditional numerical techniques, it is desirable to be able to adjust the resolution over space, devoting more effort where the physics are challenging to resolve, and less effort elsewhere. An alternative approach for applying deep learning to geometrically and topologically complex domains is graph neural networks (GNNs), which can also be designed to satisfy spatial invariance and locality Battaglia et al. (2018); Wu et al. (2020). In this paper we describe a novel approach to applying GNNs for accurately forecasting the evolution of physical systems in complex and irregular domains.

(a) Advection (b) Incompressible flow
Figure 1: MultiScaleGNN forecasts continuous dynamics. We used it to simulate (a) advection [video] and (b) the incompressible flow around a circular cylinder [video]. In (a) the upper-lower and left-right boundaries are periodic. In (b) the upper-lower boundaries are periodic. The upper figure shows the field predicted at a set of nodes obtained with adaptive coarsening/refinement; the bottom figure shows the interpolated field.


We propose MultiScaleGNN, a multi-scale GNN model, to forecast the spatio-temporal evolution of continuous systems discretised as unstructured sets of nodes. Each scale processes the information at a different resolution, enabling the network to capture complex physical systems, such as fluid dynamics, more accurately and efficiently with the same number of learnable parameters as a single-level GNN. We apply MultiScaleGNN to simulate two unsteady fluid dynamics problems: advection and incompressible fluid flow around a circular cylinder within the time-periodic laminar vortex-shedding regime. We evaluate the improvement in accuracy over long roll-outs as the number of scales in the model is increased. Importantly, we show that the proposed model is independent of the spatial discretisation and that the mean absolute error (MAE) decreases linearly as the distance between nodes is reduced. We exploit this to apply adaptive graph coarsening and refinement. We also demonstrate a technique to impose periodic boundary conditions as an inductive bias on the graph connectivity. MultiScaleGNN simulations are between two and four orders of magnitude faster than the numerical solver used for generating the ground-truth datasets, making it a potential surrogate model for fast predictions.

2 Background and related work

Deep learning on graphs

A directed graph $G = (V, E)$ is defined by a set of nodes $V$ and a set of directed edges $E$, each of them connecting a sender node to a receiver node. Each node $i$ can be assigned a vector of node attributes $\mathbf{v}_i$, and each edge can be assigned a vector of edge attributes $\mathbf{e}_{ij}$. Gilmer et al. (2017) introduced neural message passing (MP), a layer that updates the node attributes given the current node and edge attributes, as well as a set of learnable parameters. Battaglia et al. (2018) further generalised MP layers by introducing an edge-update function. Their MP layer performs three steps: edge update, aggregation and node update (equations (1), (2) and (3), respectively).


In a standard form these three steps read $\mathbf{e}'_{ij} = f^e(\mathbf{e}_{ij}, \mathbf{v}_i, \mathbf{v}_j)$ (1), $\bar{\mathbf{e}}'_j = \frac{1}{|\mathcal{E}'_j|} \sum_{\mathbf{e}'_{ij} \in \mathcal{E}'_j} \mathbf{e}'_{ij}$ (2), and $\mathbf{v}'_j = f^v(\bar{\mathbf{e}}'_j, \mathbf{v}_j)$ (3). Functions $f^e$ and $f^v$ are the edge and node update functions, typically multi-layer perceptrons (MLPs). The aggregation in equation (2) is applied to each subset $\mathcal{E}'_j$ of updated edges going to node $j$, and it must be permutation invariant. This MP layer has already proved successful in a range of data-driven physics problems Sanchez-Gonzalez et al. (2020); Pfaff et al. (2021); Battaglia et al. (2016).
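As an illustration, one such MP layer with mean aggregation can be sketched in plain NumPy; `mp_layer`, `f_edge` and `f_node` are hypothetical names, with the two functions standing in for learnt MLPs:

```python
import numpy as np

def mp_layer(v, e, senders, receivers, f_edge, f_node):
    """One message-passing step in the style of equations (1)-(3):
    edge update, permutation-invariant (mean) aggregation, node update.
    v: (N, dv) node attributes; e: (M, de) edge attributes;
    senders/receivers: (M,) node indices for each directed edge."""
    # (1) Edge update: each edge sees its own attributes and both endpoints.
    e_new = f_edge(np.concatenate([e, v[senders], v[receivers]], axis=1))
    # (2) Mean-aggregate the updated edges arriving at each receiver node.
    agg = np.zeros((v.shape[0], e_new.shape[1]))
    counts = np.zeros(v.shape[0])
    np.add.at(agg, receivers, e_new)
    np.add.at(counts, receivers, 1)
    agg /= np.maximum(counts, 1)[:, None]
    # (3) Node update: each node sees its attributes plus its aggregated messages.
    v_new = f_node(np.concatenate([v, agg], axis=1))
    return v_new, e_new
```

With identity functions in place of the MLPs, a node with two incoming edges simply receives the mean of the two concatenated edge messages.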

CNNs for simulating continuous mechanics.

During the last five years, most of the DNNs used for predicting continuous physics have included convolutional layers. For instance, CNNs have been used to solve Poisson's equation (Tang et al., 2018; Özbay et al., 2019), to solve the steady N-S equations (Guo et al., 2016; Yilmaz and German, 2017; Zhang et al., 2018; Thuerey et al., 2018; Miyanawala and Jaiman, 2018), and to simulate unsteady fluid flows (Kim et al., 2019; Wiewel et al., 2019; Lino et al., 2020; Lee and You, 2019; Sorteberg et al., 2018; Fotiadis et al., 2020). These CNN-based solvers are between one and four orders of magnitude faster than numerical solvers (Guo et al., 2016; Lino et al., 2020), and some of them have shown good extrapolation to unseen domain geometries and initial conditions (Thuerey et al., 2018; Lino et al., 2020).

Simulating physics with GNNs.

Recently, GNNs have been used to simulate the motion of discrete systems of solid particles Battaglia et al. (2016); Chang et al. (2017); Sanchez-Gonzalez et al. (2018) and deformable solids and fluids discretised into Lagrangian (or free) particles Sanchez-Gonzalez et al. (2020); Li et al. (2019a); Mrowca et al. (2018a); Li et al. (2019b). Battaglia et al. (2016) and Chang et al. (2017) represented systems of particles as graphs, whose nodes encode the state of the particles and whose edges encode their pairwise interactions; nodes and edges are individually updated through a single MP layer to determine the evolution of the system. Of note is the work of Sanchez-Gonzalez et al. (2020), who discretised fluids into a set of free particles in order to apply the previous approach and infer fluid motion. Further research in this area introduced more general MP layers Sanchez-Gonzalez et al. (2018); Li et al. (2019b); Mrowca et al. (2018b), high-order time integration Sanchez-Gonzalez et al. (2019) and hierarchical models (Li et al., 2019a; Mrowca et al., 2018a). To the best of our knowledge, Alet et al. (2019) were the first to explore the use of GNNs to infer Eulerian mechanics by solving the Poisson PDE. However, their domains remained simple, used coarse spatial discretisations and did not explore the generalisation of their model. Later, Belbute-Peres et al. (2020) introduced a hybrid model consisting of a numerical solver providing low-resolution solutions to the N-S equations and several MP layers performing super-resolution on them. More closely related to our work, Pfaff et al. (2021) proposed a mesh-based GNN to simulate continuum mechanics, although they did not consider the use of MP at multiple scales of resolution.

3 Model

3.1 Model definition

MultiScaleGNN infers the temporal evolution of an $n$-dimensional field, $\mathbf{u}(t)$, in a spatial domain $\Omega$ (model repository to be revealed on publication). It requires the discretisation of $\Omega$ into a finite set of nodes, with given coordinates. Given an input field $\mathbf{u}(t_0)$ at time $t_0$ and at the nodes, a single evaluation of MultiScaleGNN returns $\mathbf{u}(t_0 + dt)$, where $dt$ is a fixed time-step size. Each time-step is performed by applying MP layers in $L$ graphs and between them, as illustrated in Figure 2.

Figure 2: MultiScaleGNN architecture. $G_1$ is the original high-resolution graph, $G_2$ is a lower-resolution graph obtained from $G_1$, and $G_L$ is the lowest-resolution graph. MP layers are applied in each graph, and DownMP and UpMP layers between them.

The high-resolution graph $G_1$ consists of the set of nodes and a set of directed edges connecting these nodes. In a complete graph of $N$ nodes there would exist $N(N-1)$ edges, for which MP would be extremely expensive. Instead, MultiScaleGNN connects each node (receiver node) to its $k$ closest nodes ($k$ sender nodes) using a k-nearest-neighbours (k-NN) algorithm. The node attributes of each node are the concatenation of the input field at that node, a set of physical parameters at that node (such as the Reynolds number in fluid dynamics), and an encoding of whether the node lies on a Dirichlet boundary. Each edge attribute is assigned the relative position between sender node and receiver node. Node attributes and edge attributes are then encoded through two independent MLPs.

An MP layer applied to $G_1$ propagates the nodal and edge information only locally, between connected (or neighbouring) nodes. Nevertheless, most continuous physical systems require this propagation at larger scales or even globally; for instance, in incompressible flows, pressure changes propagate at infinite speed throughout the entire domain. Propagating the nodal and edge attributes between nodes separated by hundreds of hops would necessitate an excessive number of MP layers, which is neither efficient nor scalable. Instead, MultiScaleGNN processes the information at $L$ scales, creating a graph $G_\ell$ for each level and propagating information between them in each pass. The lower-resolution graphs ($G_\ell$, with $\ell = 2, \dots, L$) possess fewer nodes and edges, and hence a single MP layer can propagate attributes over longer distances more efficiently. These graphs are described in Section 3.2. As depicted in Figure 2, the information is first diffused and processed in the high-resolution graph $G_1$ through MP layers. It is then passed to $G_2$ through a downward MP (DownMP) layer. In $G_2$ the attributes are again processed through MP layers. This process is repeated until the lowest-resolution graph is reached. The lowest-resolution attributes (stored in $G_L$) are then passed back to the scale immediately above through an upward message-passing (UpMP) layer. Attributes are successively processed through MP layers at scale $\ell$ and an UpMP layer from scale $\ell$ to scale $\ell - 1$ until the information is ultimately processed in $G_1$. Finally, an MLP decodes the nodal information to return the predicted field at time $t_0 + dt$ at the nodes. To apply MP in the graphs, MultiScaleGNN uses the MP layer summarised in equations (1) to (3), with the mean as the aggregation function.

3.2 Multi-scale graphs

Each graph $G_{\ell+1}$, with $\ell = 1, \dots, L-1$, is obtained from graph $G_\ell$ by first dividing $\Omega$ into a regular grid with a given cell size (Figure 3a exemplifies this for $\ell = 1$). For each cell, provided that there is at least one node from $G_\ell$ on it, a node is added to $G_{\ell+1}$. The nodes from $G_\ell$ on a given cell and the node from $G_{\ell+1}$ on the same cell are denoted as child nodes and parent node, respectively. The coordinates of each parent node are the mean position of its child nodes. Each edge in $G_{\ell+1}$ connects a sender parent node to a receiver parent node, provided that there exists at least one edge between their respective child nodes in $G_\ell$. This edge is assigned the mean edge attribute of the set of edges going from the sender's children to the receiver's children.
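The grid-based coarsening can be sketched in a few lines of NumPy; `coarsen` is a hypothetical name, and the function only covers node creation (parent positions and child-to-parent assignment), not the coarse edge set:

```python
import numpy as np

def coarsen(pos, cell_size):
    """Partition the domain with a regular grid of the given cell size and
    place one parent node at the mean position of the child nodes in each
    occupied cell. Returns the parent coordinates and, for each child node,
    the index of its parent."""
    cells = np.floor(pos / cell_size).astype(int)   # grid-cell index per node
    uniq, parent_of = np.unique(cells, axis=0, return_inverse=True)
    parent_of = parent_of.ravel()
    counts = np.bincount(parent_of, minlength=len(uniq)).astype(float)
    parent_pos = np.zeros((len(uniq), pos.shape[1]))
    for d in range(pos.shape[1]):
        # Mean position of the children assigned to each parent.
        parent_pos[:, d] = np.bincount(parent_of, weights=pos[:, d]) / counts
    return parent_pos, parent_of
```

Cells containing no nodes simply produce no parent, matching the "at least one node per cell" rule above.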

(a) Building $G_2$ from $G_1$ (b) DownMP and UpMP layers
Figure 3: (a) $G_2$ is obtained from $G_1$ by partitioning $\Omega$ using a grid with a given cell size and assigning a parent node at the mean position of the child nodes in each cell. (b) DownMP and UpMP diagram.

Downward message-passing (DownMP).

To perform MP from $G_\ell$ to $G_{\ell+1}$ (see Figure 3b), a set of directed edges is created. Each edge connects a child node to its parent node, with edge attributes assigned as the relative position between child and parent nodes. A DownMP layer applies a common edge-update function to each of these edges and its child node. It then assigns to each parent node attribute the mean updated attribute of all the edges arriving at that parent node.


Upward message-passing (UpMP).

To pass and process the node attributes from $G_{\ell+1}$ back to $G_\ell$, MultiScaleGNN defines a set of directed edges. These edges are the same as those used by the DownMP layer, but with opposite direction. An UpMP layer applies a common edge-update function to each edge and both its sender (at scale $\ell+1$) and receiver (at scale $\ell$) nodes, directly updating the node attributes in $G_\ell$. UpMP layers leave the edge attributes of $G_\ell$ unaltered. To model the DownMP and UpMP edge-update functions we use MLPs.
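Under the definitions above, DownMP reduces to a shared edge function followed by a mean aggregation onto parents, and UpMP to a shared edge function applied per child. A NumPy sketch, with `f_edge` standing in for the learnt MLPs and all names hypothetical:

```python
import numpy as np

def down_mp(v_child, rel_pos, parent_of, n_parents, f_edge):
    """DownMP sketch: apply a shared edge function to each (child attributes,
    child-to-parent relative position) pair, then mean-aggregate the results
    onto the parent nodes."""
    msgs = f_edge(np.concatenate([v_child, rel_pos], axis=1))
    out = np.zeros((n_parents, msgs.shape[1]))
    counts = np.bincount(parent_of, minlength=n_parents).astype(float)
    np.add.at(out, parent_of, msgs)
    return out / np.maximum(counts, 1)[:, None]

def up_mp(v_child, v_parent, rel_pos, parent_of, f_edge):
    """UpMP sketch: each child node is updated directly from its own
    attributes, its parent's attributes and their relative position."""
    return f_edge(np.concatenate(
        [v_child, v_parent[parent_of], rel_pos], axis=1))
```

Since every child has exactly one parent, UpMP needs no aggregation, which is why it can update the finer-scale node attributes directly.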

3.3 Periodic boundaries

Figure 4: A periodic direction is transformed into a circumference with unitary radius.

MultiScaleGNN imposes periodicity along the $x$ and $y$ directions as an inductive bias on the edges of $G_1$. For this purpose, each periodic coordinate $x$ is transformed through the mapping $x \mapsto (\cos(2\pi x / \lambda), \sin(2\pi x / \lambda))$ onto a circumference with unitary radius, where $\lambda$ is the period along the spatial coordinate under consideration ($x$ or $y$ direction). The k-NN algorithm that creates the edges of $G_1$ (see Figure 4) is then applied to the transformed coordinates. Even though these edges are created based on nodal proximity in the transformed space, the edge attributes are still assigned the relative positions between nodes in the original coordinate system, except at the periodic boundaries, where $\lambda$ is added or subtracted to ensure a consistent representation.
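The transformation of a periodic coordinate onto a unit circle can be sketched as follows, assuming the mapping x ↦ (cos 2πx/λ, sin 2πx/λ) described above (`periodic_coords` is a hypothetical name):

```python
import numpy as np

def periodic_coords(x, period):
    """Map a periodic coordinate onto a unit circle, so that points near
    opposite ends of the period become close in the transformed space.
    In particular, x = 0 and x = period map to the same point."""
    theta = 2 * np.pi * x / period
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)
```

Running k-NN on these transformed coordinates connects nodes across the periodic boundary, which is how the periodicity becomes an inductive bias on the graph connectivity.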

3.4 Model training

MultiScaleGNN was trained by supervising, at each time-point, the loss between the predicted field and the ground-truth field obtained with a high-order finite-element solver. The loss includes the mean squared error (MSE) at the nodes, the MAE at the nodes on Dirichlet boundaries, and the MSE of the rate of change of the field along the edges.


The hyper-parameters weighting the Dirichlet and edge terms are selected based on the physics to be learnt. Training details for our experiments can be found in Appendix B.

4 Training datasets

We generated datasets to train MultiScaleGNN to simulate advection and incompressible fluid dynamics (datasets to be revealed on publication).


Advection. Datasets AdvBox and AdvInBox contain simulations of a scalar field advected under a uniform velocity field (with magnitude in the range 0 to 1) on a square domain and a rectangular domain, respectively. AdvBox domains have periodic conditions on all four boundaries, whereas AdvInBox domains have upper and lower periodic boundaries, a prescribed Dirichlet condition on the left boundary, and a zero-Neumann condition on the right boundary. The initial states are derived from two-dimensional truncated Fourier series with random coefficients and a random number of terms. For each dataset, a new set of nodes is generated at the beginning of every training epoch. For advection models, the predicted field is the advected scalar and the physical parameters are the two components of the velocity field.

Incompressible fluid dynamics.

Dataset NS contains simulations of the periodic vortex shedding around a circular cylinder at Reynolds numbers between 500 and 1000. The upper and lower boundaries are periodic; hence, the physical problem is equivalent to an infinite vertical array of cylinders. The distance between cylinders is randomly sampled. Each domain is discretised into approximately 7000 nodes. For flow models, the predicted field contains the velocity and pressure fields and the physical parameter is the Reynolds number. Further details of the training and testing datasets are included in Appendix A.

5 Results and discussion

We analyse the generalisation to unseen physical parameters, the effect of the number of scales, and the independence from the spatial discretisation. All the results, except Tables 1 and 2, were obtained with 3-scale models. Model details and hyper-parameters are included in Appendix B. Videos comparing the ground truth and the predictions are linked throughout the text.


We first consider the MultiScaleGNN model trained to infer advection. Despite MultiScaleGNN being trained on square and rectangular domains and uniform velocity fields, it generalises to complex domains and non-uniform velocity fields (obtained using the N-S equations). As an example of a closed domain, we consider the Taylor-Couette flow, in which the inner and outer boundaries rotate at different speeds. Figure 5a shows the ground truth and predictions for a simulation where the inner wall is moving faster than the outer wall in an anti-clockwise direction. In the Taylor-Couette flow, the structures present in the advected field increase in spatial frequency with time due to the shear flow. This challenges both the network's ability to capture accurate solutions and its ability to generalise. After 49 time-steps, the model maintains high accuracy in transporting both the lower and the higher frequencies. We attribute this to the model's ability to process both high and low resolutions, and to the frequency range included in the training datasets: training datasets missing high-frequency structures resulted in considerably higher diffusion.

(a) Advection in a Taylor-Couette flow (b) Advection around a body made of splines
Figure 5: Ground truth and MultiScaleGNN predictions for (a) the advection of a scalar field in a Taylor-Couette flow [video], and (b) the advection around a body made of splines with periodicity along the $x$ and $y$ directions [video].

We also evaluate the predictions of MultiScaleGNN on open domains containing obstacles of different shapes (circles, squares, ellipses and closed spline curves). Figure 5b shows the ground truth and predictions for a field advected around a body made of arbitrary splines, with upper-lower and left-right periodic boundaries. The predictions show a strong resemblance to the ground truth, although after 49 time-steps we note that the predicted fields begin to show a higher degree of diffusion in the wake behind the obstacle.

(a) (b)
Figure 6: Ground truth and MultiScaleGNN predictions for the advected field after 99 time-steps past (a) square and circular obstacles [video], (b) circular obstacles [video]. Upper-lower and left-right boundaries are periodic.

We also assess the quality of long-term inference. Figures 6a and 6b show the ground truth and predictions after 99 time-steps for two obstacle configurations with periodicity in the $x$ and $y$ directions. It can be seen that MultiScaleGNN successfully generalises to infer advection around multiple obstacles. However, the MAE increased approximately linearly with every time-step. This limitation could be explained by the simplicity of the training datasets, and it could be mitigated by training on non-uniform velocity fields.

Figure 7: Ground truth and MultiScaleGNN predictions for the horizontal velocity field, vertical velocity field and pressure field after 99 time-steps [video].
(a) Reynolds vs MAE (b) Lift coefficient
Figure 8: (a) MAE over 99 time-points for different Reynolds numbers. The grey region indicates the interpolation region. (b) Lift coefficient. The continuous lines are the ground truth, whereas the dotted lines are the predictions.

MultiScaleGNN was also trained to simulate unsteady incompressible fluid dynamics in a range of Reynolds numbers between 500 and 1000. It shows very good interpolation to unseen Reynolds numbers within this range. For instance, Figure 7 compares the ground truth and predicted fields after 99 time-steps for an unseen Reynolds number within this range. The MAE increases for Reynolds numbers lower than 400 and higher than 1200, as illustrated in Figure 8a, which represents the MAE for a range of Reynolds numbers between 100 and 1500. Figure 8b shows the temporal evolution of the lift coefficient (the vertical component of the integral of the pressure forces on the cylinder walls) for simulations at Reynolds numbers at and beyond the edges of the training range. For the Reynolds numbers at the edges of our training range, the ground truth (continuous lines) and the predictions (dotted lines) match very well over the entire simulation time. Beyond these, the lift predictions are qualitatively correct but begin to differ slightly in amplitude and phase. The reason for the limited extrapolation may be the complexity of the N-S equations, which results in shorter wakes and higher frequencies as the Reynolds number is increased.

Multiple scales.

We evaluate the accuracy of MultiScaleGNN with one, two, three and four scales; the architectural details of each model are included in Appendix B. Tables 1 and 2 collect the MAE for the last time-point and the mean over all time-points on the testing datasets. Incompressible fluids have a global behaviour, since pressure waves travel at infinite speed. The addition of coarser levels helps the network learn this characteristic and achieve significantly lower errors. For dataset NSMidRe, within the training regime of Reynolds numbers, and for NSHighRe, there is a clear benefit from using four scales. Interestingly, dataset NSLowRe shows a clear gain only when using two levels. This may be in part due to this dataset being in the extrapolation regime of Reynolds numbers.

In contrast, for the advection datasets, the MAE does not substantially decrease when using an increasing number of levels. In advection, information is propagated only locally and at a finite speed. However, an accurate advection model must guarantee that it propagates the nodal information at least as fast as the velocity field driving the advection, similar to the Courant–Friedrichs–Lewy (CFL) condition used in numerical analysis. In MultiScaleGNN, the coarser graphs help to satisfy this condition, while the original graph maintains a detailed representation of structures in the advected fields. Thus, GNNs for simulating both global and local continuum physics can benefit from learning information at multiple scales. As a comparison with Pfaff et al. (2021), a GNN with 16 sequential MP layers (GN blocks) results in a higher MAE on our NSMidRe dataset than MultiScaleGNN with the same number and type of MP layers, but distributed among 3 scales.

               L = 1            L = 2            L = 3            L = 4
               Step 49  All     Step 49  All     Step 49  All     Step 49  All
AdvTaylor       7.794  3.661     7.055  3.157     7.424  3.355     7.626  3.441
AdvCircle       3.508  1.758     3.119  1.547     2.986  1.475     3.486  1.629
AdvCircleAng    3.880  1.923     3.618  1.762     3.464  1.689     3.493  1.690
AdvSquare       3.765  1.853     3.493  1.689     3.279  1.571     3.219  1.556
AdvEllipseH     3.508  1.758     3.119  1.547     2.986  1.475     3.486  1.629
AdvEllipseV     3.770  1.867     3.748  1.853     3.450  1.708     3.334  1.653
AdvSplines      4.021  1.957     3.960  1.876     3.720  1.758     3.599  1.705
AdvInCir        9.886  6.008    10.755  6.298    14.948  8.491     9.670  5.808
Table 1: MAE on the advection testing sets for MultiScaleGNN models with one to four scales.
            L = 1             L = 2            L = 3            L = 4
            Step 99  All      Step 99  All     Step 99  All     Step 99  All
NSMidRe     13.028   8.033     4.731  3.666     3.950  3.201     3.693  3.032
NSLowRe     12.056   8.193     9.756  6.861    11.058  7.466    11.556  7.603
NSHighRe    16.176  11.420     9.949  7.385     8.936  7.170     9.085  6.822
Table 2: MAE on the N-S testing sets for MultiScaleGNN models with one to four scales.

Discretisation independence.

MultiScaleGNN processes the relative positions between neighbouring nodes, but it does not consider the absolute positions of nodes. It was also trained with random node distributions for each training example. This ensures MultiScaleGNN is independent of the set of nodes chosen to discretise the physical domain. To demonstrate this, we consider a simulation from the AdvCircleAng dataset (see Figure 1a). Figure 9a shows the MAE for a number of node sets, with nodes evenly distributed on the domain. It can be seen that the MAE decreases linearly as the mean distance-to-neighbour is reduced, at least within the interval shown. Figure 9a also shows how the inference time per time-step increases exponentially as the spatial resolution is increased. For this reason, the node sets we feed to MultiScaleGNN include a higher node count around solid walls. As an example, the nodes used for the simulation depicted in Figure 1a are represented in Figure 9b.

(a) (b)
Figure 9: (a) For the sample in Figure 1a, the MAE over 49 time-points and the inference time per time-step for a number of quasi-uniform discretisations. (b) Nodes used for the simulation shown in Figure 1a.

Adaptive remeshing is a technique used by numerical solvers to ensure sufficient resolution is present where needed in the computational domain. It dynamically modifies the spatial discretisation at every time-step. Likewise, we implemented an algorithm to locally increase or decrease the spatial resolution of the discretisation before every model evaluation. Our algorithm increases the node count where the gradients of the predicted field are higher and decreases it where they are lower. Figure 1b shows the field predicted by MultiScaleGNN combined with this technique. It can be seen that the resolution is reduced far from the wake and increased close to the cylinder walls and on the vortices travelling downstream. Adaptive coarsening/refinement allowed us to achieve an 11% reduction of the MAE at the 99th time-point on the NSMidRe dataset. More details are included in Appendix C.
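The selection step of such an adaptive scheme can be sketched as follows. This is an illustrative criterion only: the function name, the quantile thresholds and the edge-based gradient estimate are our own assumptions, not the paper's exact algorithm (see Appendix C for that):

```python
import numpy as np

def refine_mask(field, pos, senders, receivers, hi=0.75, lo=0.25):
    """Estimate a per-node gradient magnitude from finite differences along
    the graph edges, then flag nodes above the `hi` quantile for refinement
    (adding nodes nearby) and below the `lo` quantile for coarsening."""
    df = np.abs(field[receivers] - field[senders])            # field jump per edge
    dx = np.linalg.norm(pos[receivers] - pos[senders], axis=1)
    slope = df / np.maximum(dx, 1e-12)                        # edge-wise slope
    grad = np.zeros(len(field))
    counts = np.bincount(receivers, minlength=len(field)).astype(float)
    np.add.at(grad, receivers, slope)
    grad /= np.maximum(counts, 1)                             # mean slope per node
    return grad > np.quantile(grad, hi), grad < np.quantile(grad, lo)
```

Nodes flagged for refinement would receive extra neighbours before the next model evaluation, and nodes flagged for coarsening could be merged or removed.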

Computational efficiency.

We compare MultiScaleGNN and the spectral/hp element numerical solver Cantwell et al. (2015) used for generating the training and testing datasets. For advection, MultiScaleGNN simulations are one order of magnitude faster when run on a CPU, and almost three orders of magnitude faster on a GPU. The inference time per time-step for AdvTaylor is around 30 ms, and around 90 ms for the remaining testing datasets on a Tesla T4 GPU. For the N-S equations, MultiScaleGNN achieves a speed-up of three orders of magnitude running on a CPU, and four orders of magnitude on a Tesla T4 GPU, with an inference time per time-step of 50 ms.

6 Conclusion

MultiScaleGNN is a novel multi-scale GNN model for inferring mechanics on continuous systems discretised into unstructured sets of nodes. Unstructured discretisations allow complex domains to be accurately represented and the node count to be adjusted over space. Multiple coarser levels allow high and low-resolution mechanics to be efficiently captured. For spatially local problems, such as advection, MP layers in coarse graphs guarantee that the information is diffused at the correct speed while MP layers in finer graphs provide sharp resolution of the fields. In global and local problems, such as incompressible fluid dynamics, the coarser graphs are particularly advantageous, since they enable global characteristics such as pressure waves to be effectively learnt. MultiScaleGNN interpolates to unseen spatial discretisations of the physical domains, allowing it to adopt efficient discretisations and to dynamically and locally modify them to further improve the accuracy. MultiScaleGNN also generalises to advection on complex domains and velocity fields and it interpolates to unseen Reynolds numbers in fluid dynamics. Inference is between two and four orders of magnitude faster than with the high-order solver used for generating the training datasets. This work is a significant advancement in the design of flexible, accurate and efficient neural simulators.


  • [1] F. Alet, A. K. Jeewajee, M. Bauza, A. Rodriguez, T. Lozano-Perez, and L. P. Kaelbling (2019) Graph element networks: adaptive, structured computation and memory. arXiv preprint arXiv:1904.09019. Cited by: §2.
  • [2] J. L. Ba, J. R. Kiros, and G. E. Hinton (2016) Layer normalization. arXiv preprint arXiv:1607.06450. Cited by: Appendix B.
  • [3] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, et al. (2018) Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261. Cited by: §1, §2.
  • [4] P. W. Battaglia, R. Pascanu, M. Lai, D. Rezende, and K. Kavukcuoglu (2016) Interaction networks for learning about objects, relations and physics. In 38th Neural Information Processing Systems, NeurIPS 2016, Cited by: §2, §2.
  • [5] F. d. A. Belbute-Peres, T. Economon, and Z. Kolter (2020) Combining differentiable pde solvers and graph neural networks for fluid flow prediction. In International Conference on Machine Learning, pp. 2402–2411. Cited by: §2.
  • [6] C. D. Cantwell, D. Moxey, A. Comerford, A. Bolis, G. Rocco, G. Mengaldo, D. De Grazia, S. Yakovlev, J. Lombard, D. Ekelschot, et al. (2015) Nektar++: an open-source spectral/hp element framework. Computer Physics Communications 192, pp. 205–219. Cited by: §A.1, §1, §5.
  • [7] M. B. Chang, T. Ullman, A. Torralba, and J. B. Tenenbaum (2017) A compositional object-based approach to learning physical dynamics. In 5th International Conference on Learning Representations, ICLR 2017, Cited by: §2.
  • [8] S. Fotiadis, E. Pignatelli, M. Lino, C. Cantwell, A. Storkey, and A. A. Bharath (2020) Comparing recurrent and convolutional neural networks for predicting wave propagation. In ICLR Workshop on Deep Neural Models and Differential Equations, External Links: Link Cited by: §2.
  • [9] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl (2017) Neural message passing for quantum chemistry. In 34th International Conference on Machine Learning, ICML 2017, Cited by: §2.
  • [10] I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio (2016) Deep learning. MIT press Cambridge. Cited by: §1.
  • [11] H. Guillard (1993) Node-nested multi-grid method with delaunay coarsening. Technical report INRIA. Cited by: Appendix C.
  • [12] X. Guo, W. Li, and F. Iorio (2016) Convolutional neural networks for steady flow approximation. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 481–490. Cited by: §1, §2.
  • [13] H. Jiang and L. Cheng (2021) Large-eddy simulation of flow past a circular cylinder for reynolds numbers 400 to 3900. Physics of Fluids 33 (3), pp. 034119. Cited by: §A.2.
  • [14] G. Karniadakis and S. Sherwin (2013) Spectral/hp element methods for computational fluid dynamics. Oxford University Press. Cited by: §A.1, §1.
  • [15] B. Kim, V. C. Azevedo, N. Thuerey, T. Kim, M. Gross, and B. Solenthaler (2019) Deep fluids: a generative network for parameterized fluid simulations. In Computer Graphics Forum, Cited by: §1, §2.
  • [16] D. P. Kingma and J. Ba (2015) Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Y. Bengio and Y. LeCun (Eds.), External Links: Link Cited by: Appendix B.
  • [17] G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter (2017) Self-normalizing neural networks. arXiv preprint arXiv:1706.02515. Cited by: Appendix B.
  • [18] S. Lee and D. You (2019) Data-driven prediction of unsteady flow over a circular cylinder using deep learning. Journal of Fluid Mechanics 879, pp. 217–254. Cited by: §2.
  • [19] Y. Li, J. Wu, R. Tedrake, J. B. Tenenbaum, and A. Torralba (2019) Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. In 7th International Conference on Learning Representations, ICLR 2019, Cited by: §2.
  • [20] Y. Li, J. Wu, J. Y. Zhu, J. B. Tenenbaum, A. Torralba, and R. Tedrake (2019) Propagation networks for model-based control under partial observation. Proceedings - IEEE International Conference on Robotics and Automation 2019-May, pp. 1205–1211. External Links: Document, 1809.11169, ISBN 9781538660263, ISSN 10504729 Cited by: §2.
  • [21] M. Lino, C. Cantwell, S. Fotiadis, E. Pignatelli, and A. A. Bharath (2020) Simulating surface wave dynamics with convolutional networks. In NeurIPS Workshop on Interpretable Inductive Biases and Physically Structured Learning, Cited by: §1, §2.
  • [22] T. P. Miyanawala and R. K. Jaiman (2018-06) A novel deep learning method for the predictions of current forces on bluff bodies. In Proceedings of the ASME 2018 37th International Conference on Ocean, Offshore and Arctic Engineering, Cited by: §2.
  • [23] D. Mrowca, C. Zhuang, E. Wang, N. Haber, L. F. Fei-Fei, J. Tenenbaum, and D. L. Yamins (2018) Flexible neural representation for physics prediction. In Advances in neural information processing systems, pp. 8799–8810. Cited by: §2.
  • [25] A. G. Özbay, S. Laizet, P. Tzirakis, G. Rizos, and B. Schuller (2019) Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions. External Links: 1910.08613, Link Cited by: §2.
  • [26] T. Pfaff, M. Fortunato, A. Sanchez-Gonzalez, and P. W. Battaglia (2021) Learning mesh-based simulation with graph networks. In 38th International Conference on Machine Learning, ICML 2021, Cited by: §2, §5.
  • [27] A. Sanchez-Gonzalez, V. Bapst, K. Cranmer, and P. Battaglia (2019) Hamiltonian graph networks with ode integrators. arXiv preprint arXiv:1909.12790. Cited by: §2.
  • [28] A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, and P. Battaglia (2020) Learning to simulate complex physics with graph networks. In International Conference on Machine Learning, pp. 8459–8468. Cited by: §1, §2, §2.
  • [29] A. Sanchez-Gonzalez, N. Heess, J. T. Springenberg, J. Merel, M. Riedmiller, R. Hadsell, and P. Battaglia (2018) Graph networks as learnable physics engines for inference and control. 35th International Conference on Machine Learning, ICML 2018 10, pp. 7097–7117. External Links: 1806.01242, ISBN 9781510867963 Cited by: §2.
  • [30] W. Sorteberg, S. Garasto, A. Pouplin, C. Cantwell, and A. A. Bharath (2018-12) Approximating the Solution to Wave Propagation using Deep Neural Networks. In NeurIPS Workshop on Modeling the Physical World: Perception, Learning, and Control, External Links: Link Cited by: §2.
  • [31] W. Tang, T. Shan, X. Dang, M. Li, F. Yang, S. Xu, and J. Wu (2018) Study on a Poisson's equation solver based on deep learning technique. In 2017 IEEE Electrical Design of Advanced Packaging and Systems Symposium, EDAPS 2017, pp. 1–3. External Links: Document, 1712.05559, ISBN 9781538612385 Cited by: §2.
  • [32] N. Thuerey, K. Weissenow, L. Prantl, and X. Hu (2018-10) Deep Learning Methods for Reynolds-Averaged Navier-Stokes Simulations of Airfoil Flows. AIAA Journal 58 (1), pp. 15–26. External Links: Document, 1810.08217, ISSN 0001-1452, Link Cited by: §1, §2.
  • [33] J. Tompson, K. Schlachter, P. Sprechmann, and K. Perlin (2017) Accelerating eulerian fluid simulation with convolutional networks. In International Conference on Machine Learning, pp. 3424–3433. Cited by: §1.
  • [34] S. Wiewel, M. Becher, and N. Thuerey (2019) Latent space physics: towards learning the temporal evolution of fluid flow. In Computer Graphics Forum, Cited by: §1, §2.
  • [35] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and S. Y. Philip (2020) A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems. Cited by: §1.
  • [36] X. Xiao, Y. Zhou, H. Wang, and X. Yang (2018) A novel cnn-based poisson solver for fluid simulation. IEEE transactions on visualization and computer graphics 26 (3), pp. 1454–1465. Cited by: §1.
  • [37] E. Yilmaz and B. German (2017) A convolutional neural network approach to training predictors for airfoil performance. In 18th AIAA/ISSMO multidisciplinary analysis and optimization conference, pp. 3660. Cited by: §1, §2.
  • [38] Y. Zhang, W. J. Sung, and D. N. Mavris (2018) Application of convolutional neural network to predict airfoil lift coefficient. In 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, pp. 1903. Cited by: §1, §2.

Appendix A Datasets details

a.1 Advection datasets

We solved the two-dimensional advection equation using Nektar++, a spectral/hp element solver [14, 6]. The advection equation reads

\[ \frac{\partial u}{\partial t} + c_x \frac{\partial u}{\partial x} + c_y \frac{\partial u}{\partial y} = 0, \]

where $u$ is the advected field, and $c_x$ and $c_y$ are the horizontal and vertical components of the velocity field, respectively. As the initial condition, $u_0$, we take a scalar field derived from a two-dimensional Fourier series with random coefficients, specifically

\[ u_0(x, y) = \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} a_{ij} \, \sin\big(i \pi (x - x_c)\big) \, \sin\big(j \pi (y - y_c)\big), \tag{11} \]

where the coefficients $a_{ij}$ are sampled from a uniform distribution between 0 and 1, and the integers $N_x$ and $N_y$ are randomly selected between 3 and 8. In equation (11), $x_c$ and $y_c$ are the coordinates of the centre of the domain. The initial field is scaled to a fixed maximum and minimum. Figure 10a shows an example of $u_0$ on a square domain from the AdvBox dataset. We created training and testing datasets containing advection simulations with 50 time-points each, equispaced by a fixed time-step size. A summary of these datasets can be found in Table 3.

Figure 10: (a) Example initial condition $u_0$ from the AdvBox dataset. (b) One of the sets of nodes used during training.
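As an illustration, such a random Fourier-series initial condition could be generated as in the following sketch; the exact series, mode counts, and the rescaling to [-1, 1] are assumptions for illustration, not the paper's exact definition:

```python
import numpy as np

def random_initial_field(nx=64, ny=64, kmin=3, kmax=8, seed=0):
    """Random initial condition from a 2-D Fourier series with uniform
    random coefficients, rescaled here to [-1, 1] for illustration."""
    rng = np.random.default_rng(seed)
    x, y = np.meshgrid(np.linspace(0.0, 1.0, nx), np.linspace(0.0, 1.0, ny))
    xc, yc = 0.5, 0.5                    # centre of the (unit) domain
    m = rng.integers(kmin, kmax + 1)     # number of modes in x
    n = rng.integers(kmin, kmax + 1)     # number of modes in y
    u0 = np.zeros_like(x)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a = rng.uniform(0.0, 1.0)    # random coefficient in [0, 1]
            u0 += a * np.sin(2 * np.pi * i * (x - xc)) \
                    * np.sin(2 * np.pi * j * (y - yc))
    # rescale so the field spans a fixed range
    return 2.0 * (u0 - u0.min()) / (u0.max() - u0.min()) - 1.0
```

Sampling a fresh field per simulation only requires changing the seed, which mirrors how each training simulation starts from a different random initial condition.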

Training datasets.

We generated two training datasets: AdvBox, with 1500 simulations, and AdvInBox, with 3000 simulations. In these datasets we impose a uniform velocity field with random values for its horizontal and vertical components, subject to a magnitude constraint. In the AdvBox dataset the domain is a square with periodicity in $x$ and $y$. In the AdvInBox dataset the domain is a rectangle with periodicity in $y$, a Dirichlet condition on the left boundary, and a homogeneous Neumann condition on the right boundary; an additional constraint is imposed on the velocity components. During training, a new set of nodes is selected at the beginning of every epoch. The node spacing varies smoothly across the different regions of the domains, as illustrated in Figure 10b. The sets of nodes were created with Gmsh, a finite-element mesher. The element-size parameter was set to 0.012 at the corners and the centre of the training domains, and to random multiples of that value at one random control point on each boundary. The mean numbers of nodes in the AdvBox and AdvInBox datasets are 9802 and 5009, respectively.

Testing datasets.

We generated eight testing datasets, each containing 200 simulations. These datasets consider advection on more complex open and closed domains with non-uniform velocity fields. The domains employed are shown in Figure 11, and the testing datasets are listed in Table 3. The velocity fields were obtained by solving the steady incompressible Navier-Stokes equations at a fixed Reynolds number. In the AdvTaylor dataset the inner and outer walls spin at randomly sampled velocities. In the AdvCircle, AdvSquare, AdvEllipseH, AdvEllipseV and AdvSplines datasets there is periodicity along $x$ and $y$, and a random horizontal flow rate is imposed. The obstacles inside the domains of the AdvSplines dataset are closed spline curves defined from six random points. The AdvCircleAng dataset is similar to AdvCircle, but the flow rate forms a random angle with the $x$ axis. The domain in the AdvInCir dataset has periodicity along $y$, a Dirichlet condition on the left boundary, and a homogeneous Neumann condition on the right boundary. The sets of nodes were generated using Gmsh, with an element size of 0.005 on the walls of the obstacles and 0.01 on the remaining boundaries.


Figure 11: Physical domains (black areas) in our testing datasets.
Dataset       Flow type                     Domain       #Nodes        Train/Test
AdvBox        Open, periodic in x and y     Square       9601-10003    Training
AdvInBox      Open, periodic in y           Rectangle    4894-5135     Training
AdvTaylor     Closed, Taylor-Couette flow   Figure 11a   7207          Testing
AdvCircle     Open, periodic in x and y     Figure 11b   19862         Testing
AdvCircleAng  Open, periodic in x and y     Figure 11b   19862         Testing
AdvSquare     Open, periodic in x and y     Figure 11c   19956         Testing
AdvEllipseH   Open, periodic in x and y     Figure 11d   20210         Testing
AdvEllipseV   Open, periodic in x and y     Figure 11e   20221         Testing
AdvSplines    Open, periodic in x and y     Figure 11f   19316-20389   Testing
AdvInCir      Open, periodic in y           Figure 11b   19862         Testing
Table 3: Advection training and testing datasets

a.2 Incompressible fluid dynamics datasets

We solved the two-dimensional incompressible Navier-Stokes equations using the high-order solver Nektar++. The Navier-Stokes equations read

\[ \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\nabla p + \frac{1}{Re} \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0, \]

where $\mathbf{u} = (u, v)$ collects the horizontal and vertical components of the velocity field, $p$ is the pressure field, and $Re$ is the Reynolds number. We consider the flow around an infinite vertical array of circular cylinders with diameter $d$, equispaced by a distance randomly sampled from a fixed interval. The width of the domain and the distance from the left boundary to the cylinders' axes are fixed. The left boundary is an inlet with a prescribed velocity, the right boundary is an outlet, and the cylinder walls have a no-slip condition. The numerical solutions obtained for this flow at different Reynolds numbers are well known [13]. In our simulations we select Reynolds numbers that yield solutions in the laminar vortex-shedding regime, and we only include the periodic stage. The sets of nodes employed for each simulation were created using Gmsh, placing more nodes around the cylinder walls (see Figure 12a). The mean number of nodes in these sets is 7143. Each simulation contains time-points equispaced by a fixed time-step size. The training and testing datasets are listed in Table 4.

Dataset    Re          #Simulations   Train/Test
NS         500-1000    1000           Training
NSMidRe    500-1000    250            Testing
NSLowRe    100-500     250            Testing
NSHighRe   1000-1500   250            Testing
Table 4: Incompressible flow training and testing datasets

Appendix B Model details

Hyper-parameter choice.

MultiScaleGNN's hyper-parameters take different values for the advection and fluid dynamics models; Table 5 collects the values we employed. All MLPs use SELU activation functions [17] and batch normalisation [2]. Our results (except those in Tables 1 and 2) were obtained with fixed numbers of scales and MP layers per scale for the advection and fluid dynamics models.

Hyper-parameter             Advection   Fluid dynamics
—                           4           8
—                           0.02        0.15
—                           0.04        0.30
—                           0.08        0.60
#Layers in edge encoder     2           2
#Layers in node encoder     2           2
#Layers in node decoder     2           2
#Layers in edge MLPs        2           3
#Layers in node MLPs        2           2
#Layers in DownMP MLPs      2           3
#Layers in UpMP MLPs        2           3
#Neurons per layer          128         128
Table 5: MultiScaleGNN details

Multi-scale experiments.

The numbers of MP layers at each scale used for obtaining the results in Tables 1 and 2 are listed in Table 6. Note that the total number of MP layers at each scale is the sum of those applied in the downward and upward passes.

Advection   Fluid dynamics
Table 6: Number of MP layers at each scale in the multi-scale experiments

Training details.

We trained the MultiScaleGNN models on an internal cluster using 4 CPUs, 86 GB of memory, and an RTX 6000 GPU with 24 GB. We fed 8 graphs per batch. Each training iteration initially predicted a single time-point, and every time the training loss decreased below a threshold (0.01 for advection and 0.005 for fluid dynamics) we increased the number of iterative time-steps by one, up to a limit of 10. We used the loss function given by equation (8), with different weighting coefficients for advection and for fluid dynamics. The initial time-point was randomly selected for each prediction, and we added noise sampled uniformly between -0.01 and 0.01 to the initial field. After each time-step, the models' weights were updated using the Adam optimiser with its standard parameters [16]. The learning rate was initialised to a fixed value and multiplied by 0.5 when the training loss did not decrease over five consecutive epochs; we also applied gradient clipping to keep the Frobenius norm of the weight gradients at or below one.
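The gradient clipping and the incremental-rollout schedule described above can be sketched as follows. This is a simplified illustration: `RolloutCurriculum` and `clip_grad_global_norm` are our own names, and the real training loop updates an actual GNN rather than these stand-ins.

```python
import numpy as np

def clip_grad_global_norm(grads, max_norm=1.0):
    """Scale a list of gradient arrays so that their joint Frobenius
    norm is at most max_norm, as in the clipping described above."""
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads

class RolloutCurriculum:
    """Start by predicting one time-point; add one iterative step each
    time the loss drops below `threshold`, up to `max_steps`."""
    def __init__(self, threshold, max_steps=10):
        self.threshold = threshold
        self.max_steps = max_steps
        self.steps = 1  # number of iterative time-steps per prediction

    def update(self, loss):
        if loss < self.threshold and self.steps < self.max_steps:
            self.steps += 1
        return self.steps
```

For example, with the advection threshold of 0.01, a loss of 0.005 at the end of an iteration would extend the rollout from one to two time-steps, while the clipping rescales gradients whose global norm exceeds one before the Adam update.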

Appendix C Adaptive coarsening/refinement

Figure 12: (a) Original set of nodes and (b) set of nodes after applying our coarsening/refinement algorithm. (c) MAE on the NSMidRe dataset for MultiScaleGNN without adaptive coarsening/refinement and for MultiScaleGNN combined with our coarsening/refinement algorithm.

We implemented an algorithm to locally coarsen and refine the sets of nodes input to MultiScaleGNN, based on the gradients of the fields at these nodes. The input to the coarsening/refinement algorithm is a set of nodes, and the output is the same set with some of its nodes removed and some new nodes added. The coarsening happens in regions where the number of nodes is more than enough to capture the spatial variations of the input field, whereas the refinement happens in regions where more resolution is required. The algorithm begins by connecting each node to its $k$ closest nodes; we denote the resulting set of edges by $\mathcal{E}$. Then, a gradient measure $g_i$ is computed at each node. We define $g_i$ as

\[ g_i = \frac{1}{|\mathcal{N}_i|} \sum_{j \in \mathcal{N}_i} \frac{|u_j - u_i|}{\lVert \mathbf{x}_j - \mathbf{x}_i \rVert}, \]

where $\mathcal{N}_i$ is the set of neighbours of node $i$ in $\mathcal{E}$; it summarises the mean magnitude of the gradients of the field at each node.

A coarsening algorithm is applied to the set of nodes whose gradient measure lies in the lowest fraction $\alpha$ of values, whereas a refinement algorithm is applied to the set of nodes whose gradient measure lies in the highest fraction $\beta$. The values of $\alpha$ and $\beta$ can be chosen depending on the performance requirements. The coarsening is based on Guillard's coarsening [11]: the closest nodes to each low-gradient node $i$ are removed, provided that node $i$ has not itself been removed yet. The refinement algorithm we implemented can be interpreted as an inverse of Guillard's coarsening: for each high-gradient node, triangles with that node and its closest nodes at their vertices are created, and new nodes are added at the centroids of those triangles. Figure 12a shows an example of a set of nodes provided to our coarsening/refinement algorithm, and Figure 12b shows the resulting set of nodes. Note that, in line with Guillard's coarsening, the nodes on the boundaries of the domain are not considered by our algorithm. We trained MultiScaleGNN on the NS dataset with sets of nodes obtained for $\alpha$ between 0.3 and 0.5 and $\beta$ between 0.05 and 0.15. Then, at inference time, the sets of nodes fed to MultiScaleGNN are coarsened/refined before every time-step. Figure 12c compares the MAE on the NSMidRe dataset for MultiScaleGNN without adaptive coarsening/refinement and for MultiScaleGNN combined with our coarsening/refinement algorithm. For a fair comparison, the adaptive-MultiScaleGNN predictions are interpolated back to the original set of nodes, and the MAEs for both models are computed on the same set of nodes. The figure shows that the ability to dynamically adapt the resolution can further improve long-term accuracy.
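Under our reading of the procedure above, a minimal NumPy sketch might look as follows. The names, the brute-force k-nearest-neighbour construction, the gradient proxy, and the fraction parameters `alpha`/`beta` are all our own simplifications, not the paper's implementation, and boundary-node handling is omitted:

```python
import numpy as np

def knn_indices(pos, k):
    """Indices of the k closest nodes to each node (brute force)."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-distances
    return np.argsort(d, axis=1)[:, :k]

def coarsen_refine(pos, field, k=4, alpha=0.3, beta=0.1):
    """Gradient-driven coarsening/refinement sketch: remove the
    neighbours of low-gradient nodes (Guillard-style coarsening) and
    add triangle centroids around high-gradient nodes (refinement)."""
    n = len(pos)
    nbrs = knn_indices(pos, k)
    # mean absolute field difference to neighbours as a gradient proxy
    g = np.abs(field[nbrs] - field[:, None]).mean(axis=1)
    order = np.argsort(g)
    # --- coarsening: drop the neighbours of the alpha-lowest-gradient nodes
    removed = set()
    for i in order[: int(alpha * n)]:
        if i not in removed:  # keep node i itself, drop its neighbours
            removed.update(int(j) for j in nbrs[i])
    keep = [i for i in range(n) if i not in removed]
    new_pos, new_field = [pos[keep]], [field[keep]]
    # --- refinement: add centroids around the beta-highest-gradient nodes
    for i in order[n - int(beta * n):]:
        if i in removed:
            continue
        j, l = nbrs[i][0], nbrs[i][1]
        centroid = (pos[i] + pos[j] + pos[l]) / 3.0
        new_pos.append(centroid[None, :])
        new_field.append(np.array([(field[i] + field[j] + field[l]) / 3.0]))
    return np.concatenate(new_pos), np.concatenate(new_field)
```

Running this once per time-step, as described above, coarsens smooth regions and refines high-gradient ones; field values at the new centroid nodes are here simply averaged, standing in for a proper interpolation.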