1 Introduction
Forecasting the spatiotemporal mechanics of continuous systems is a common requirement in many areas of science and engineering. Continuum mechanics models often describe the underlying physical laws by one or more partial differential equations (PDEs), whose complexity may preclude their analytic solution. Numerical methods are wellestablished for approximately solving PDEs by discretising the the governing PDEs to form an algebraic system which can be solved Karniadakis and Sherwin (2013); Cantwell et al. (2015)
. Such solvers can achieve high levels of accuracy, but they are computationally expensive. Deep learning techniques have recently been shown to accelerate physical simulations without significantly penalising accuracy. This has motivated an increasing interest in the use of deep neural networks (DNNs) to produce realistic realtime animations for computer graphics
Tompson et al. (2017); Kim et al. (2019); SanchezGonzalez et al. (2020) and to speedup simulations in engineering design and control Guo et al. (2016); Yilmaz and German (2017); Zhang et al. (2018); Thuerey et al. (2018).Most of the recent work on deep learning to infer continuous physics has focused on convolutional neural networks (CNNs)
Wiewel et al. (2019); Xiao et al. (2018); Lino et al. (2020). In part, the success of CNNs for these problems lies in their translation invariance and locality Goodfellow et al. (2016), which represent strong and desirable inductive biases for continuummechanics models. However, CNNs constrain input and output fields to be defined on rectangular domains representated by regular grids, which is not suitable for more complex domains. As for traditional numerical techniques, it is desirable to be able to adjust the resolution over space, devoting more effort where the physics are challenging to resolve, and less effort elsewhere. An alternative approach to applying deep learning to geometrically and topologically complex domains are graph neural networks (GNNs), which can also be designed to satisfy spatial invariance and locality Battaglia et al. (2018); Wu et al. (2020). In this paper we describe a novel approach to applying GNNs for accurately forecasting the evolution of physical systems in complex and irregular domains.(a) Advection  (b) Incompressible flow 
field predicted at a set of nodes obtained with adaptive coarsening/refinement, the bottom figure shows the interpolated field.
Contribution.
We propose MultiScaleGNN, a multiscale GNN model, to forecast the spatiotemporal evolution of continuous systems discretised as unstructured sets of nodes. Each scale processes the information at different resolutions, enabling the network to more accurately and efficiently capture complex physical systems, such as fluid dynamics, with the same number of learnable parameters as a single level GNN. We apply MultiScaleGNN to simulate two unsteady fluid dynamics problems: advection and incompressible fluid flow around a circular cylinder within the timeperiodic laminar vortexshedding regime. We evaluate the improvement in accuracy over long rollouts as the number of scales in the model are increased. Importantly, we show that the proposed model is independent of the spatial discretisation and the mean absolute error (MAE) decreases linearly as the distance between nodes is reduced. We exploit this to apply adaptive graph coarsening and refinement. We also demonstrate a technique to impose periodic boundary conditions as an inductive bias on the graph network connectivity. MultiScaleGNN simulations are between two and four orders of magnitude faster than the numerical solver used for generating the ground truth datasets, becoming a potential surrogate model for fast predictions.
2 Background and related work
Deep learning on graphs
A directed graph, , is defined as , where is a set of nodes and is a set of directed edges, each of them connecting a sender node to a receiver node . Each node
can be assigned a vector of node attributes
, and each edge can be assigned a vector of edge attributes . Gilmer, et al. (2017) Gilmer et al. (2017) introduced neural message passing (MP), a layer that updates the node attributes given the current node and edge attributes, as well as a set of learnable parameters. Battaglia et al. (2018) Battaglia et al. (2018) further generalised MP layers by introducing an edgeupdate function. Their MP layer performs three steps: edge update, aggregation and node update —equations (1), (2) and (3) respectively.(1)  
(2)  
(3) 
Functions and
are the edge and node update functions —typically multilayer perceptrons (MLPs). The aggregation function
is applied to each subset of updated edges going to node , and it must be permutation invariant. This MP layer has already proved successful in a range of datadriven physics problems SanchezGonzalez et al. (2020); Pfaff et al. (2021); Battaglia et al. (2016).CNNs for simulating continuous mechanics.
During the last five years, most of the DNNs used for predicting continuous physics have included convolutional layers. For instance, CNNs have been used to solve the Poisson’s equation (Tang et al., 2018; Özbay et al., 2019), to solve the steady NS equations (Guo et al., 2016; Yilmaz and German, 2017; Zhang et al., 2018; Thuerey et al., 2018; Miyanawala and Jaiman, 2018), and to simulate unsteady fluid flows (Kim et al., 2019; Wiewel et al., 2019; Lino et al., 2020; Lee and You, 2019; Sorteberg et al., 2018; Fotiadis et al., 2020). These CNNbased solvers are between one and four orders of magnitude faster than numerical solvers (Guo et al., 2016; Lino et al., 2020), and some of them have shown good extrapolation to unseen domain geometries and initial conditions (Thuerey et al., 2018; Lino et al., 2020).
Simulating physics with GNNs.
Recently, GNNs have been used to simulate the motion of discrete systems of solid particles Battaglia et al. (2016); Chang et al. (2017); SanchezGonzalez et al. (2018) and deformable solids and fluids discretised into lagrangian (or free) particles SanchezGonzalez et al. (2020); Li et al. (2019a); Mrowca et al. (2018a); Li et al. (2019b). Battaglia, et al. (2016) Battaglia et al. (2016) and Chang, et al. (2016) Chang et al. (2017) represented systems of particles as graphs, whose nodes encode the state of the particles and whose edges encode their pairwise interactions; nodes and edges are individually updated through a single MP layer to determine the evolution of the system. Of note is the work of Sanchez, et al. (2020), who discretised fluids into a set of free particles in order to apply the previous approach and infer fluid motion. Further research in this area introduced more general MP layers SanchezGonzalez et al. (2018); Li et al. (2019b); Mrowca et al. (2018b), highorder timeintegration SanchezGonzalez et al. (2019) and hierarchical models (Li et al., 2019a; Mrowca et al., 2018a). To the best of our knowledge Alet, et al. (2019) Alet et al. (2019) were the first to explore the use of GNNs to infer Eulerian mechanics by solving the Poisson PDE. However, their domains remained simple, used coarse spatial discretisations and did not explore the generalisation of their model. Later, BelbutePeres, et al. (2020) BelbutePeres et al. (2020)
introduced an hybrid model consisting of a numerical solver providing lowresolution solutions to the NS equations and several MP layers performing superresolution on them. More closely related to our work, Pfaff, et al. (2020)
Pfaff et al. (2021) proposed a meshbased GNN to simulate continuum mechanics, although they did not consider the use of MP at multiple scales of resolution.3 Model
3.1 Model definition
MultiScaleGNN infers the temporal evolution of an dimensional field, , in a spatial domain .^{1}^{1}1Model repository available on revealed on publication. It requires the discretisation of into a finite set of nodes , with coordinates . Given an input field , at time and at the nodes, a single evaluation of MultiScaleGNN returns , where is a fixed timestep size. Each timestep is performed by applying MP layers in graphs and between them, as illustrated in Figure 2.
The highresolution graph consists of the set of nodes and a set of directed edges connecting these nodes. In a complete graph there exist edges, however, MP would be extremely expensive. Instead, MultiScaleGNN connects each node in (receiver node) to its closest nodes (sender nodes) using a knearest neighbors (kNN) algorithm. The node attributes of each node are the concatenation of , and ; where is a set of physical parameters at (such as the Reynolds number in fluid dynamics) and
(4) 
with representing the Dirichlet boundaries. Each edge attribute is assigned the relative position between sender node and receiver node . Node attributes and edge attributes are then encoded through two independent MLPs.
A MP layer applied to propagates the nodal and edge information only locally between connected (or neighbouring) nodes. Nevertheless, most continuous physical systems require this propagation at larger scales or even globally, for instance, in incompressible flows pressure changes propagate at infinite speed throughout the entire domain. Propagating the nodal and edge attributes between nodes separated by hundreds of hops would necessitate an excessive number of MP layers, which is neither efficient nor scalable. Instead, MultiScaleGNN processes the information at scales, creating a graph for each level and propagating information between them in each pass. The lowerresolution graphs (; with ) possess fewer nodes and edges, and hence, a single MP layer can propagate attributes over longer distances more efficiently. These graphs are described in section 3.2. As depicted in Figure 2, the information is first diffused and processed in the highresolution graph through MP layers. It is then passed to through a downward MP (DownMP) layer. In the attributes are again processed through MP layers. This process is repeated times. The lowest resolution attributes (stored in ) are then passed back to the scale immediately above through an upward message passing (UpMP) layer. Attributes are successively passed through MP layers at scale and an UpMP layer from scale to scale until the information is ultimately processed in . Finally, a MLP decodes the nodal information to return the predicted field at time at the nodes. To apply MP in the graphs, MultiScaleGNN uses the MP layer summarised in equations (1) to (3), with the mean as the aggregation function.
3.2 Multiscale graphs
Each graph with ; is obtained from graph by first dividing into a regular grid with cell size (Figure 3a exemplifies this for ). For each cell, provided that there is at least one node from on it, a node is added to . The nodes from on a given cell and the node from on the same cell are denotes as child nodes, , and parent node, , respectively. The coordinates of each parent node is the mean position of its children nodes. Each edge connects sender node to receiver node , provided that there exists at least one edge from to in . Edge is assigned the mean edge attribute of the set of edges going from to .
(a) Building from  (b) DownMP and UpMP layers 
Downward messagepassing (DownMP).
To perform MP from to (see Figure 3b), a set of directed edges, , is created. Each edge connects node to its parent node , with edge attributes assigned as the relative position between child and parent nodes. A DownMP layer applies a common edgeupdate function, , to edge and node . It then assigns to each node attribute the mean updated attribute of all the edges in arriving to node , i.e.,
(5) 
Upward messagepassing (UpMP).
To pass and process the node attributes from to , MultiScaleGNN defines a set of directed edges, . These edges are the same as in , but with opposite direction. An UpMP layer applies a common edgeupdate function, , to each edge and both its sender (at scale ) and receiver (at scale ) nodes, directly updating the node attributes in , i.e.,
(6) 
UpMP layers leave the edge attributes of unaltered. To model functions and we use MLPs.
3.3 Periodic boundaries
MultiLevelGNN imposes periodicity along and directions as an inductive bias on . For this purpose, each periodic direction is transformed through a function into a circumference with unitary radius,
(7) 
where is the period along the spatial coordinate under consideration ( or direction). The kNN algorithm that creates (see Figure 4) is then applied to the transformed coordinates. Even though is created based on nodal proximity in the transformed space, the edge attributes are still assigned the relative positions between nodes in the original coordinate system, except at the periodic boundaries where is added or subtracted to ensure a consistent representation.
3.4 Model training
MultiScaleGNN was trained by supervising at each timepoint the loss between the predicted field and the groundtruth field obtained with a highorder finite element solver. includes the meansquared error (MSE) at the nodes, the MAE at the nodes on Dirichletboundaries, and the MSE for the rate of change along the edges. can be expressed as
(8) 
The hyperparameters and are selected based on the physics to be learnt. Training details for our experiments can be found in Appendix B.
4 Training datasets
We generated datasets to train MultiScaleGNN to simulate advection and incompressible fluid dynamics.^{2}^{2}2Datasets available on revealed on publication.
Advection.
Datasets AdvBox and AdvInBox contain simulations of a scalar field advected under a uniform velocity field (with magnitude in the range 0 to 1) on a square domain () and a rectangular domain () respectively. AdvBox domains have periodic conditions on all four boundaries, whereas AdvInBox domains have upper and lower periodic boundaries, a prescribed Dirichlet condition on the left boundary, and a zeroNeumann condition on the right boundary. The initial states at are derived from twodimensional truncated Fourier series with random coefficients and a random number of terms. For each dataset, a new set of nodes
is generated at the beginning of every training epoch. For advection models,
is the advected field and are the two components the velocity field.Incompressible fluid dynamics.
Dataset NS contains simulations of the periodic vortex shedding around a circular cylinder at Reynolds number . The upper and lower boundaries are periodic, hence, the physical problem is equivalent to an infinite vertical array of cylinders. The distance between cylinders, , is randomly sampled between and . Each domain is discretised into approximately 7000 nodes. For flow models, contains the velocity and pressure fields and is the Reynolds number. Further details of the training and testing datasets are included in Appendix A.
5 Results and discussion
We analyse the generalisation to unseen physical parameters, the effect of the number of scales and the independence to the spatial discretisation.
All the results, except Tables 1 and 2, were obtained with 3scale models.
Model details and hyperparameters are included in Appendix B.^{3}^{3}3Videos comparing the ground truth and the predictions can be found on
https://imperialcollegelondon.box.com/s/f6eqb25rt14mhacaysqn436g7bup3onz.
Generalisation.
We first consider the MultiScaleGNN model trained to infer advection. Despite MultiScaleGNN being trained on square and rectangular domains and uniform velocity fields, it generalises to complex domains and nonuniform velocity fields (obtained using the NS equations with ). As an example of a closed domain, we consider the TaylorCouette flow in which the inner and outer boundaries rotate at different speeds. Figure 5a shows the ground truth and predictions for a simulation where the inner wall is moving faster than the outer wall in an anticlockwise direction. In the TaylorCouette flow, the structures present in the advected field increase in spatial frequency with time due to the shear flow. This challenges both the networks ability to capture accurate solutions as well as its ability to generalise. After 49 timesteps, the model maintains high accuracy in transporting both the lower and the higher frequencies. We attribute this to the model’s ability to process both high and low resolutions and the frequency range included in the training datasets —training datasets missing highfrequency structures resulted in a considerably higher diffusion.
(a) Advection in a TaylorCouette flow  (b) Advection around a body made of splines 
We also evaluate the predictions of MultiScaleGNN on open domains containing obstacles of different shapes (circles, squares, ellipses and closed spline curves). Figure 5b shows the ground truth and predictions for a field advected around a body made of arbitrary splines, with upperlower and leftright periodic boundaries. The predictions show a strong resemblance to the ground truth, although after 49 timesteps we note that the predicted fields begin to show a higher degree of diffusion in the wake behind the obstacle.
(a)  (b) 
We also assess the quality of longterm inference. Figures 6a and 6b show the ground truth and predictions after 99 timesteps for two obstacles with periodicity in the and directions. It can be seen that MultiScaleGNN successfully generalises to infer advection around multiple obstacles. However, the MAE increased approximately linearly every timestep. This limitation could be explained by the simplicity of the training datasets, and it could be mitigated by training on nonuniform velocity fields.
(a) Reynolds vs MAE  (b) Lift coefficient 
MultiScaleGNN was also trained to simulate unsteady incompressible fluid dynamics in a range of Reynolds numbers between 500 and 1000. It shows very good interpolation to unseen Reynolds numbers within this range. For instance, Figure 7 compares the ground truth and predicted fields after 99 timesteps for . The MAE increases for Reynolds numbers lower than 400 and higher than 1200, as illustrated in Figure 8a, which represents the MAE for a range of Reynolds numbers between 100 and 1500. Figure 8b shows the temporal evolution of the lift coefficient (vertical component of the integral of the pressure forces on the cylinder walls) for simulations with and (with ). It can be noticed that for and , both at the edge of our training range, the ground truth (continuous lines) and the predictions (dotted lines) match very well over the entire simulation time. For and the lift predictions are qualitatively correct but begin to differ slightly in amplitude and phase. The reason for the limited extrapolation may be the complexity of the NS equations, which result into shorter wakes and higher frequencies as the Reynolds number is increased.
Multiple scales.
We evaluate the accuracy of MultiScaleGNN with and ; the architectural details of each model are included in Appendix B. Tables 1 and 2 collect the MAE for the last timepoint and the mean of all the timepoints on the testing datasets. Incompressible fluids have a global behaviour since pressure waves travel at an infinite speed. The addition of coarser layers helps the network learn this characteristic and achieve significantly lower errors. For dataset NSMidRe, within the training regime of , and NSHighRe there is a clear benefit from using four scales. Interestingly, dataset NSLowRe shows a clear gain only when using two levels. This may be in part due to this dataset being in the extrapolation regime of .
In contrast, for the advection datasets, the MAE does not substantially decrease when using an increasing number of levels. In advection, information is propagated only locally and at a finite speed. However, an accurate advection model must guarantee that it propagates the nodal information at least as fast as the velocity field driving the advection, similar to the Courant–Friedrichs–Lewy (CFL) condition used in numerical analysis. In MultiScaleGNN, the coarser graphs help to satisfy this condition, while the original graph maintains a detailed representation of structures in the advected fields. Thus, GNNs for simulating both global and local continuum physics can benefit from learning information at multiple scales. As a comparison to Pfaff, et al. (2020) Pfaff et al. (2021), a GNN with 16 sequential MPlayers (GNBlocks) results in a MAE of on our NSMidRe dataset; whereas MultiScaleGNN with the same number and type of MP layers, but distributed among 3 scales, results in a lower MAE of .
Datasets  

Step 49  All  Step 49  All  Step 49  All  Step 49  All  
AdvTaylor  7.794  3.661  7.055  3.157  7.424  3.355  7.626  3.441 
AdvCircle  3.508  1.758  3.119  1.547  2.986  1.475  3.486  1.629 
AdvCircleAng  3.880  1.923  3.618  1.762  3.464  1.689  3.493  1.690 
AdvSquare  3.765  1.853  3.493  1.689  3.279  1.571  3.219  1.556 
AdvEllipseH  3.508  1.758  3.119  1.547  2.986  1.475  3.486  1.629 
AdvEllipseV  3.770  1.867  3.748  1.853  3.450  1.708  3.334  1.653 
AdvSplines  4.021  1.957  3.960  1.876  3.720  1.758  3.599  1.705 
AdvInCir  9.886  6.008  10.755  6.298  14.948  8.491  9.670  5.808 
Datasets  

Step 99  All  Step 99  All  Step 99  All  Step 99  All  
NSMidRe  13.028  8.033  4.731  3.666  3.95  3.201  3.693  3.032 
NSLowRe  12.056  8.193  9.756  6.861  11.058  7.466  11.556  7.603 
NSHighRe  16.176  11.42  9.949  7.385  8.936  7.17  9.085  6.822 
Discretisation independence.
MultiScaleGNN processes the relative position between neighbouring nodes, but it does not consider the absolute position of nodes. It was also trained with random node distributions for each training example. This ensures MultiScaleGNN is independent to the set of nodes chosen to discretise the physical domain. To demonstrate this, we consider a simulation from the AdvCircleAng dataset (see Figure 1a). Figure 9a shows the MAE for a number of sets, with nodes evenly distributed on the domain. It can be seen that the MAE decreases linearly as the mean distancetoneighbour is reduced – at least within the interval shown. Figure 9a also shows how the inference time per timestep increases exponentially as the spatial resolution is increased. For this reason, the sets we feed to MultiScaleGNN include a higher node count around solid walls. As an example, the nodes used for the simulation depicted in Figure 1a are represented in Figure 9b.
(a)  (b) 
Adaptive remeshing is a technique used by numerical solvers to ensure sufficient resolution is present where needed in the computational domain. It dynamically modifies the spatial discretisation at every timestep. Likewise, we implemented an algorithm to locally increase or decrease the spatial resolution of before every model evaluation. Our algorithm increases the node count where the gradients of are higher and decreases it where these are lower. Figure 1b shows the field predicted by MultiScaleGNN combined with this technique. It can be seen that the resolution is reduced far from the wake and increased close to the cylinder walls and on the vortices travelling downstream. Adaptive coarsening/refinement allowed us to achieved an 11% reduction of the MAE at the 99 timepoint on the NSMidRe dataset. More details are included in Appendix C.
Computational efficiency.
We compare MultiScaleGNN and the spectral/ element numerical solver Cantwell et al. (2015) used for generating the training and testing datasets. For advection, MultiScaleGNN simulations are one order of magnitude faster when they run on a CPU, and almost three orders of magnitude faster on a GPU. The inference time per timestep for AdvTaylor is around 30 ms, and around 90 ms for the remaining testing datasets on a Tesla T4 GPU. For the NS equations, MultiScaleGNN achieves an speedup of three orders of magnitude running on a CPU, and four orders of magnitude on a Tesla T4 GPU; with an inference time per timestep of 50 ms on a Tesla T4 GPU.
6 Conclusion
MultiScaleGNN is a novel multiscale GNN model for inferring mechanics on continuous systems discretised into unstructured sets of nodes. Unstructured discretisations allow complex domains to be accurately represented and the node count to be adjusted over space. Multiple coarser levels allow high and lowresolution mechanics to be efficiently captured. For spatially local problems, such as advection, MP layers in coarse graphs guarantee that the information is diffused at the correct speed while MP layers in finer graphs provide sharp resolution of the fields. In global and local problems, such as incompressible fluid dynamics, the coarser graphs are particularly advantageous, since they enable global characteristics such as pressure waves to be effectively learnt. MultiScaleGNN interpolates to unseen spatial discretisations of the physical domains, allowing it to adopt efficient discretisations and to dynamically and locally modify them to further improve the accuracy. MultiScaleGNN also generalises to advection on complex domains and velocity fields and it interpolates to unseen Reynolds numbers in fluid dynamics. Inference is between two and four orders of magnitude faster than with the highorder solver used for generating the training datasets. This work is a significant advancement in the design of flexible, accurate and efficient neural simulators.
References
 [1] (2019) Graph element networks: adaptive, structured computation and memory. arXiv preprint arXiv:1904.09019. Cited by: §2.
 [2] (2016) Layer normalization. arXiv preprint arXiv:1607.06450. Cited by: Appendix B.
 [3] (2018) Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261. Cited by: §1, §2.
 [4] (2016) Interaction networks for learning about objects, relations and physics. In 38th Neural Information Processing Systems, NeurIPS 2016, Cited by: §2, §2.
 [5] (2020) Combining differentiable pde solvers and graph neural networks for fluid flow prediction. In International Conference on Machine Learning, pp. 2402–2411. Cited by: §2.

[6]
(2015)
Nektar++: an opensource spectral/hp element framework
. Computer physics communications 192, pp. 205–219. Cited by: §A.1, §1, §5.  [7] (2017) A compositional objectbased approach to learning physical dynamics. In 5th International Conference on Learning Representations, ICLR 2017, Cited by: §2.
 [8] (2020) Comparing recurrent and convolutional neural networks for predicting wave propagation. In ICLR Workshop on Deep Neural Models and Differential Equations, External Links: Link Cited by: §2.
 [9] (2017) Neural message passing for quantum chemistry. In 34th International Conference on Machine Learning, ICML 2017, Cited by: §2.
 [10] (2016) Deep learning. MIT press Cambridge. Cited by: §1.
 [11] (1993) Nodenested multigrid method with delaunay coarsening. Technical report INRIA. Cited by: Appendix C.
 [12] (2016) Convolutional neural networks for steady flow approximation. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 481–490. Cited by: §1, §2.
 [13] (2021) Largeeddy simulation of flow past a circular cylinder for reynolds numbers 400 to 3900. Physics of Fluids 33 (3), pp. 034119. Cited by: §A.2.
 [14] (2013) Spectral/hp element methods for computational fluid dynamics. Oxford University Press. Cited by: §A.1, §1.
 [15] (2019) Deep fluids: a generative network for parameterized fluid simulations. In Computer Graphics Forum, Cited by: §1, §2.
 [16] (2015) Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 79, 2015, Conference Track Proceedings, Y. Bengio and Y. LeCun (Eds.), External Links: Link Cited by: Appendix B.
 [17] (2017) Selfnormalizing neural networks. arXiv preprint arXiv:1706.02515. Cited by: Appendix B.
 [18] (2019) Datadriven prediction of unsteady flow over a circular cylinder using deep learning. Journal of Fluid Mechanics 879, pp. 217–254. Cited by: §2.
 [19] (2019) Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. In 7th International Conference on Learning Representations, ICLR 2019, Cited by: §2.
 [20] (2019) Propagation networks for modelbased control under partial observation. Proceedings  IEEE International Conference on Robotics and Automation 2019May, pp. 1205–1211. External Links: Document, 1809.11169, ISBN 9781538660263, ISSN 10504729 Cited by: §2.
 [21] (2020) Simulating surface wave dynamics with convolutional networks. In NeurIPS Workshop on Interpretable Inductive Biases and Physically Structured Learning, Cited by: §1, §2.
 [22] (201806) A novel deep learning method for the predictions of current forces on bluff bodies. In Proceedings of the ASME 2018 37th International Conference on Ocean, Offshore and Arctic Engineering, Cited by: §2.
 [23] (2018) Flexible neural representation for physics prediction. In Advances in neural information processing systems, pp. 8799–8810. Cited by: §2.
 [24] (2018) Flexible neural representation for physics prediction. Advances in Neural Information Processing Systems 2018Decem, pp. 8799–8810. External Links: 1806.08047, ISSN 10495258 Cited by: §2.
 [25] (2019) Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions. External Links: 1910.08613, Link Cited by: §2.
 [26] (2021) Learning meshbased simulation with graph networks. In (accepted) 38th International Conference on Machine Learning, ICML 2021, Cited by: §2, §2, §5.
 [27] (2019) Hamiltonian graph networks with ode integrators. arXiv preprint arXiv:1909.12790. Cited by: §2.
 [28] (2020) Learning to simulate complex physics with graph networks. In International Conference on Machine Learning, pp. 8459–8468. Cited by: §1, §2, §2.
 [29] (2018) Graph networks as learnable physics engines for inference and control. 35th International Conference on Machine Learning, ICML 2018 10, pp. 7097–7117. External Links: 1806.01242, ISBN 9781510867963 Cited by: §2.
 [30] (201812) Approximating the Solution to Wave Propagation using Deep Neural Networks. In NeurIPS Workshop on Modeling the Physical World: Perception, Learning, and Control, External Links: Link Cited by: §2.
 [31] (2018) Study on a Poisson’s equation solver based on deep learning technique. 2017 IEEE Electrical Design of Advanced Packaging and Systems Symposium, EDAPS 2017 2018Janua, pp. 1–3. External Links: Document, 1712.05559, ISBN 9781538612385 Cited by: §2.
 [32] (201810) Deep Learning Methods for ReynoldsAveraged NavierStokes Simulations of Airfoil Flows. AIAA Journal 58 (1), pp. 15–26. External Links: Document, 1810.08217, ISSN 00011452, Link Cited by: §1, §2.
 [33] (2017) Accelerating eulerian fluid simulation with convolutional networks. In International Conference on Machine Learning, pp. 3424–3433. Cited by: §1.
 [34] (2019) Latent space physics: towards learning the temporal evolution of fluid flow. In Computer Graphics Forum, Cited by: §1, §2.
 [35] (2020) A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems. Cited by: §1.
 [36] (2018) A novel cnnbased poisson solver for fluid simulation. IEEE transactions on visualization and computer graphics 26 (3), pp. 1454–1465. Cited by: §1.
 [37] (2017) A convolutional neural network approach to training predictors for airfoil performance. In 18th AIAA/ISSMO multidisciplinary analysis and optimization conference, pp. 3660. Cited by: §1, §2.
 [38] (2018) Application of convolutional neural network to predict airfoil lift coefficient. In 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, pp. 1903. Cited by: §1, §2.
Appendix A Datasets details
a.1 Advection datasets
We solved the twodimensional advection equation using Nektar++, an spectral/hp element solver [14, 6]. The advection equation reads
(9) 
where is the advected field, and, and are the horizontal and vertical components of the velocity field respectively. As initial condition, , we take an scalar field derived from a twodimensional Fourier series with random coefficients, specifically
(10)  
(11) 
Coefficients
are sampled from a uniform distribution between 0 and 1, and, integers
and are randomly selected between 3 and 8. In equation (11), and are the coordinates of the centre of the domain. The initial field is scaled to have a maximum equal to and a minimum equal to . Figure 10a shows an example of with and on a squared domain. We created training and testing datasets containing advection simulations with 50 timepoints each, equispaced by a timestep size . A summary of these datasets can be found in Table 3.(a)  (b) 
Training datasets.
We generated two training datasets: AdvBox with 1500 simulations and AdvInBox with 3000 simulations. In these datasets we impose a uniform velocity fields with random values for and , but constrained to . In dataset AdvBox the domain is a square () with periodicity in and . In dataset AdvInBox the domain is a rectangle () with periodicity in , a Dirichlet condition on the left boundary and a homogeneous Neumann condition on the right boundary – as an additional constraint, . During training, a new set of nodes is selected at the beginning of every epoch. The node count was varied smoothly across the different regions of the domains, as illustrated in Figure 10b. The sets of node were created with Gmsh, a finiteelement mesher. The element size parameter was set to 0.012 in the corners and the centre of the training domains, and set to and times that value at one random control point on each boundary. The mean number of nodes in the AdvBox and AdvInBox datasets are 98002 and 5009 respectively.
Testing datasets.
We generated eight testing datasets, each of them containing 200 simulations. These datasets consider advection on more complex open and closed domains with nonuniform velocity fields. The domains employed are represented in Figure 11, and the testing datasets are listed in Table 3. The velocity fields were obtained from the steady incompressible NavierStokes equations with . In dataset AdvTaylor the inner and outer walls spin at a velocity randomly sampled between and . In datasets AdvCircle, AdvSquare, AdvEllipseH, AdvEllipseV and AdvSplines there is periodicity along and , and a horizontal flow rate between and is imposed. The obstacles inside the domains on the AdvSplines dataset are made of closed spline curves defined from six random points. Dataset AdvCircleAng is similar to AdvCircle, but the flow rate forms an angle between and with the axis. The domain in dataset AdvInCir has periodicity along , a Dirichlet condition on the left boundary (with and ), and a homogeneous Neumann condition on the right boundary. The set of nodes were generated using Gmsh with an element size equal to 0.005 on the walls of the obstacles and 0.01 on the remaining boudaries.
Dataset  Flow type  Domain  #Nodes  Train/Test 

AdvBox  Open, periodic in and  960110003  Training  
AdvInBox  Open, periodic in  48945135  Training  
AdvTaylor  Closed, TaylorCouette flow  Figure 11a  7207  Testing 
AdvCircle  Open, periodic in and  Figure 11b  19862  Testing 
AdvCircleAng  Open, periodic in and  Figure 11b  19862  Testing 
AdvSquare  Open, periodic in and  Figure 11c  19956  Testing 
AdvEllipseH  Open, periodic in and  Figure 11d  20210  Testing 
AdvEllipseV  Open, periodic in and  Figure 11e  20221  Testing 
AdvSplines  Open, periodic in and  Figure 11f  1931620389  Testing 
AdvIncir  Open, periodic in  Figure 11b  19862  Testing 
a.2 Incompressible fluid dynamics datasets
We solved the twodimensional incompressible NavierStokes equation using the highorder solver Nektar++. The NavierStokes equations read
(12)  
(13)  
(14) 
where and are the horizontal and vertical components of the velocity field, is the pressure field, and is the Reynolds number. We consider the flow around an infinite vertical array of circular cylinder, with diameter , equispaced a distance randomly sampled between and . The width of the domain is and the cylinders axis is at from the left boundary. The left boundary is an inlet with , and ; the right boundary is an outlet with , and ; and, the cylinder walls have a noslip condition. The numerical solutions obtained for this flow at different Reynolds numbers are well known [13]. In our simulations we select values that yield solutions in the laminar vortexshedding regime, and we only include the periodic stage. The sets of nodes employed for each simulation were created using Gmsh placing more nodes around the cylinders walls (see Figure 12a). The mean number of nodes in these sets is 7143. Each simulation contains timepoints equispaced by a timestep size . The training and testing datasets are listed in Table 4.
Dataset  #Simulations  Train/Test  

NS  5001000  1000  Training 
NSMidRe  5001000  250  Testing 
NSLowRe  100500  250  Testing 
NSMidRe  10001500  250  Testing 
Appendix B Model details
Hyperparameters choice.
MultiScaleGNN’s hyperparameters have different values for advection models and fluid dynamics models, Table 5
collects the values we employed. All MLPs use SELU activation functions
[17], and, batch normalisation [2]. Our results (except results in Tables 1 and 2) were obtained with , and, , , for advection; and , , for fluid dynamics.Hyperparameters  Advection  Fluid dynamics 
4  8  
0.02  0.15  
0.04  0.30  
0.08  0.60  
#Layers in edge encoder  2  2 
#Layers in node encoder  2  2 
#Layers in node decoder  2  2 
#Layers in edge MLPs  2  3 
#Layers in node MLPs  2  2 
#Layers in DownMP MLPs  2  3 
#Layers in UpMP MLPs  2  3 
#Neurons per layer 
128  128 
Multiscale experiments.
The number of MP layers at each scale used for obtaining the results in Tables 1 and 2 are listed in the table below. Notice that the total number of MP layers at each scale , is .
Advection  Fluid dynamics  

Training details.
We trained MultiScaleGNN models on a internal cluster using 4 CPUs, 86GB of memory, and a RTX6000 GPU with 24GB. We fed 8 graphs per batch. First, each training iteration predicted a single timepoint, and, every time the training loss decreased below a threshold (0.01 for advection and 0.005 for fluid dynamics) we increased the number of iterative timesteps by one, up to a limit of 10. We used the loss function given by equation (
8) with , and, for advection and for fluid dynamics. The initial timepoint was randomly selected for each prediction, and, we added to the initial field noise following a uniform distribution between 0.01 and 0.01. After each timestep, the models’ weights were updated using the Adam optimiser with its standard parameters [16]. The learning rate was set toand multiplied by 0.5 when the training loss did not decrease after five consecutive epochs, also, we applied gradient clipping to keep the Frobenius norm of the weights gradients below or equal to one.
Appendix C Adaptive coarsening/refinement
(a) Original  (b) Coarsened/refined at time  (c) MAE 
We implemented an algorithm to locally coarsen and refine the sets of nodes inputted to MultiScaleGNN based on the gradients of the fields at these nodes. The input to the coarsening/refinement algorithm is a set of nodes , and the output is the same set with some of its nodes removed and some new nodes added to it. The coarsening happens in regions where the amount of nodes is more than enough to capture the spatial variations of the input field, whereas the refinement happens in regions where more resolution is required. The algorithm begins by connecting each node to its closest nodes, we denote the resulting set of edges by . Then, is computed at each node. We define as
(15)  
(16) 
and, it summarises the mean magnitude of the gradients of field at each node.
The set of nodes , which contains the nodes with a value of below the lowest value, is applied a coarsening algorithm; whereas the set of nodes , which contains the nodes with a value above the highest, is applied a refinement algorithm. Values and can be chosen depending on the performance requirements. The coarsening is based on Guillard’s coarsening [11], and, it consists on removing the closest nodes to node provided that node has not been removed yet. The refinement algorithm we implemented can be interpreted as an inverse of Guillard’s coarsening: for each node , triangles with node and its closest nodes at their vertices are created, then, new nodes are added at the centroids of those triangles. Figure 12a shows an example of a set of nodes provided to our coarsening/refinement algorithm, and Figure 12b shows the resulting set of nodes for and . Notice that, in line with Guillard’s coarsening, the nodes on the boundaries of the domain are not considered by our algorithm. We trained MultiScaleGNN on the NS dataset with sets of nodes obtained for between 0.3 and 0.5, and between 0.05 and 0.15. Then, during inference time, the sets of nodes fed to MultiScaleGNN are coarsened/refined before every timestep. Figure 12c compares the MAE on the NSMidRe dataset for MultiScaleGNN without adaptive coarsening/refinement and MultiScaleGNN combined with our coarsening/refinement algorithm. For a fair comparison, adaptiveMultiScaleGNN predictions are interpolated to the primitive set of nodes, and the MAEs for both models are computed in the same set of nodes. From this figure, it is clear, that having the ability to dynamically adapt the resolution can help to further improve the longterm accuracy.
Comments
There are no comments yet.