The Galerkin-Finite Element (FE) method applied to partial differential equations (PDEs) with advection terms dominating over diffusion ones may suffer of numerical instability[hughes2012finite, Quarteroni_2017]. These numerical instabilities cause the numerical solution to exhibit oscillations that increase in amplitude with a local increment of the transport dominance over diffusion, i.e. as soon as the local Pèclet number becomes larger than one. In order to eliminate (or at least mitigate) numerical instabilities, a general employed strategy is to consider the generalized Galerkin method, i.e. the Galerkin method with additional stabilization terms [Quarteroni_2017]. Examples of stabilization methods to reduce numerical oscillations in the advection dominated regimes are for instance the upwind and streamline-diffusion method, both methods that are not strongly consistent. Instead, examples of strongly consistent methods are the Galerkin-Least-Squares, the Streamline Upwind Petrov-Galerkin (SUPG), and the Douglas-Wang methods [Roos_1996, Brooks_1982, Quarteroni_2017]. In particular, for these strongly consistent stabilization methods, the formulation depends on the residual of the PDE (in strong formulation), other than on a parameter – called stabilization parameter – whose definition is crucial for the success of the stabilization strategies. Determining the values of such stabilization parameter is not straightforward, especially for 2D and 3D problems. In addition, a universal formulation of such parameter is lacking, especially as it is strongly dependent on data of the model and the numerical setting used for the FE approximation of the PDEs, e.g. low and high order FE methods, spectral element methods, isogeometric methods, etc. [Schwab, spectralbook, cottrell2009isogeometric]. Some of the formulations for the stabilization parameter have been derived from analytical considerations on the differential problem in 1D, others are suitable only for a specific numerical approximation method, while the most comprehensive ones incorporate some (empirical) dependence on numerical setting, as e.g. the order of the FE [Bochev, Brooks_1982, CODINA200061, CODINA2008264, CODINA20072413, franca1992stabilized, galeao2004finite, REBOLLO2015406, ScovazziRossi, SCOVAZZI2007923, tezduyar2000finite].
In this work, we propose using Machine Learning (ML) [mitchell1997machine, Yegnanarayana, goodfellow2016deep] to make computers learn autonomously the values of the stabilization parameter for the FE approximation of advection dominated PDEs. Artificial Neural Newtorks (ANNs) [university1988continuous, Kutyniok_2021]
, are widely popular in ML and Deep Learning for a wide array of applications: they are indeed very versatile tools that are increasingly finding their way in scientific computing[Neittaanmäki202227], especially in the context of numerical approximation of PDEs [Mishra_2018]. For instance, as substitute to standard numerical methods, ANNs can be employed as meshless methods in Physics Informed Neural Networks (PINNs) to directly approximate the solution of the PDE as it is trained by minimizing the (strong) residual of the PDE [raissi2017machine, raissi2018hidden, raissi2019physics]. ANNs are also largely employed in a data-driven fashion in the context of model order reduction for parametric PDEs [fresca2021comprehensive, regazzoni2019machine, hesthaven2018non, guo2019data, zancanaro2021hybrid] and to enhance the stability properties of numerical methods for PDEs [discacciati2020controlling]
. In fluid dynamics modelling, ANNs are massively adopted for flow features extractions, modelling, optimization and control. For flow features extractions through clustering and classification to classify wake topologies[annurev_wake]; for modeling fluid dynamics by reconstructing specific flows such as the near wall field in a turbulent flow [annurev_turbulent_wall] and for flow optimization [annurev_gliding], and control for aerodynamics applications [annurev_control]
. ANNs are also used in a data-driven framework as a manner for providing alternative closure models for RANS’ stress tensor[annurev_rans], for sub-grid scale models in LES turbulence models [janssensadvancing, janssenthesis, xie2019artificial, zhou2019subgrid, xie2020modeling], or for model learning input-output relationships in complex physical processes [REGAZZONI2020113268].
In this work, we use ANNs to learn the optimal stabilization parameter in advection dominated PDEs that are discretized by means of FE method: the goal is to enhance the accuracy of the SUPG FE method by optimally selecting the stabilization parameter under different data of the PDE and numerical settings of the FE method. We found that the proposed ANN-enhanced stabilization method allows to improve accuracy and stabilization properties of the numerical solution compared to those results obtained by analytical expressions of the stabilization parameter. The numerical results obtained shed light also on the possibility to apply the presented strategy to learn closure laws for stabilization and turbulence models of fluid dynamics, as for instance to learn the stabilization parameters in the Variational Multiscale–LES model [forti2015semi, bazilevs2007variational].
This work is organized as follows: in Section 2, we recall the SUPG stabilization method for advection-diffusion equations; in Section 3, we present our numerical strategy to compute an optimal SUPG stabilization parameter through a feed-forward fully connected ANN. In Section 4 we validate our method by comparing the ANN results with those obtained with the 1D advection-diffusion problem from which the expression of the theoretical stabilization parameter has been derived. In Section 5 we first show the ANN’s training and we present our numerical results on the 2D advection-diffusion problem used for training and we finally generalize our findings using the ANN’s prediction on a different advection-diffusion problem. Finally, in Section 6 we draw our conclusions highlighting possible future developments.
2 The SUPG method for advection-diffusion problems
We briefly recall the advection-diffusion differential problem and the SUPG stabilization method for the advection dominated regime.
Let be the physical domain with being its boundary. We consider the following problem in the unknown function :
where , , and are assigned functions or constants, with , , with , and . The Dirichlet datum on the boundary is .
Let and , the weak formulation of Eq. (1) reads
with the bilinear form and the linear functional respectively defined as:
We consider a family of function spaces (either for and ) dependent on a parameter such that . Let be the function space of the FE discretization with piecewise Lagrange polynomials of degree , a triangulation of and the characteristic size of the mesh, comprised of elements . By setting , the Galerkin FE method applied to Eq. (2) reads
where . The standard Galerkin-FE method in Eq. (3), can generate numerical oscillations on if the problem is dominated by the advection term. In particular, these numerical instabilities can arise if the local Péclet number is , where
The generalized Galerkin method ([QV94]) allows eliminating or mitigating numerical oscillations by adding stabilization terms to the standard Galerkin formulation as
In the SUPG method, the additional stabilization term reads:
where stands as the residual in strong formulation of Eq. (1), which is defined as:
The term , appearing in Eq. (6), is the stabilization parameter, which is the focus of this work. A universal and optimal definition of in terms of the problem data, numerical settings like the mesh size and FE degree is lacking. An extensive review of stabilization parameters for the SUPG method is reported in [john2007spurious]. The most common choices for come from an analytic derivation made for the advection-diffusion problem in 1D with and approximated by means of linear finite elements, which reads:
where is the upwind function:
The stabilization parameter is generally defined locally, i.e. mesh element by element. If a uniform mesh is used, as we consider in this paper, then the value of is uniform over ; if in addition is a constant, then this implies that is uniform over . The choice of made in Eq. (7) represents an optimal choice of the stabilization parameter as it yields a nodally exact numerical solution for the 1D advection-diffusion problem if and the FE polynomial order is [Quarteroni_2017, galeao2004finite]. Thus, the stabilization parameter of Eq. (7) may not be fully effective to provide the optimal stabilization for a general advection-diffusion problem, e.g. to guarantee a nodally exact solution in 2D/3D or when using FE degrees larger than . A commonly used generalization of the formula of Eq. (7) to higher FE degrees () is presented in [galeao2004finite] and reads:
Differently from Eq. (7), the stabilization parameter in Eq. (8) takes into account the contribute of higher polynomial degrees . Still, this formula is not optimal as it does not guarantee a nodally exact numerical solution. Our goal is to find a general and optimal expression of the stabilization parameter holding for advection-diffusion problems and FE approximations of degree to be dependent on: the dimension , the FE degree , the mesh size , the forcing term , the diffusion coefficient and the transport coefficient .
3 ANN-based approach for determining the optimal stabilization parameter
We present our approach to determine the optimal stabilization parameter for the SUPG stabilization method by using an ANN. We consider as features (inputs) of our ANN: the degree of finite element , the mesh size and the global Péclet number of the advection-diffusion problem, where is the characteristic length of the problem. As we consider to be uniform and fixed, using as input corresponds to varying the value of the diffusion coefficient . The features (input) of the ANN read:
the target (output) of interest is the optimal SUPG stabilization parameter that we denote with :
The first step consist in generating the dataset, i.e. pairs of inputs and outputs to be used for training the ANN (data generation step). This consists in choosing repeatedly and randomly values of the features in given ranges. These features are used to feed an optimization problem that, through an optimizer and a suitable error measure, provide an optimal stabilization parameter , i.e. the target of the given feature. Such optimization problem considers as error measure : the mismatch between the numerical solution and the exact one over the nodes of the FE mesh. reads as:
being the -th node of the FE mesh and its number. To find the minimum of for different problem configurations we solve, for each instance of the input parameters , the following optimization problem:
Thus, the dataset generated consists of pairs of inputs-outputs , with . The latter is used for training the ANN. The latter will be then used to predict the optimal stabilization parameter to be used for the FE approximation of the advection-diffusion problem in the settings provided by .
We report in Figure 1 a sketch of the procedure used to build the ANN for the prediction of the optimal stabilization parameter for any new feature .
4 Numerical validation
In this section, we validate the proposed numerical strategy by means of a 1D advection-diffusion problem from which the expression of in Eq. (7) has been derived. In particular, we consider the advection-diffusion problem in Eq. (1) with , and with Dirichlet BCs on and on . It admits the following exact solution:
We plot in Figure 2 the error measure against the value of the stabilization parameter , with , and by using (Figure 1(a)) and (Figure 1(b)). We observe that shows a minimum in , thus suggesting the possibility to use an optimization algorithm to solve the problem. Specifically, we employ the L-BFGS-B optimization algorithm from SciPy
, an open source library forPython [scipy]. Instead, the advection-diffusion problem has been solved using the FE open source library FEniCS [fenics], by applying the SUPG method on a uniform mesh. Particularly, the optimal value found for the case corresponds to the one provided by the theory in Eq. (7). For the case , we observe a optimal value of for which the error is minimized: in this case, it does not corresponds to the stabilization parameter provided by the theory in Eq. (8): as a matter of fact, the latter arises from empirical considerations to extend the parameter in Eq. (7) to polynomials degrees . However, this does not ensure a nodally exact numerical solution.
We solve the optimization problem for different values of the local Péclet number and we plot in Figure 3 the optimal stabilization parameter against the local Péclet number for , and .
Figure 2(a) shows that, by employing a linear FE space , the optimal stabilization parameter obtained by minimizing exactly matches the theoretical one for every value of , thus confirming the findings of Figure 1(a). This also serves as validation of the optimization procedure that we proposed. By recalling that of Eq. (7) guarantees the numerical solution to be exact at nodes for this particular advection-diffusion problem and , we can infer that the optimization procedure is meaningful and it can be therefore exploited further, for example for . On the other hand, Figures 2(b) and 2(c) show instead a mismatch between the theoretical of Eq. (8) and the optimal one , thus confirming the findings reported in Figure 1(b).
In order to better appreciate the differences in the numerical solutions due to the choice of the stabilization parameters, we report, in Figure 4, a comparison of the error obtained by means of the optimal and theoretical stabilization parameters for and . Moreover, we report in Figure 5 a comparison among the exact solution , the SUPG-stabilized numerical solution obtained with the optimal stabilization parameter , and the numerical solution obtained with theoretical one . Specifically, we consider FE of degree and two different values of . In both these cases, the optimal parameter leads to a more accurate solution with respect to using the theoretical parameter . In particular, when is “small”, leads to overshooting in the numerical solution, while if is “large” the theoretical stabilization parameter leads to undershooting. Conversely, the optimal parameter accurately intercepts the exact solution at the nodes.
5 Numerical results
We first introduce the training set of our problem and we detail the setup of the ANN that we use in this work. Then, we show the prediction of the stabilization parameter by means of the ANN on different advection-diffusion problems.
5.1 Training the ANN for a 2D advection-diffusion problem
We apply the strategy presented so far to a 2D advection-diffusion problem to generate the dataset for the training of the ANN. Specifically, we consider in Eq. (1): , , and . We prescribe the following exact solution on the whole boundary :
We generate in a structured mesh of triangles with FEniCS [fenics], as shown in Figure 6. We generate the dataset by repeatedly solving the optimization problem of Eq. (12) for varying set of features as described in Section 3; specifically, we choose the features as reported in Eq. (9). The complete dataset contains examples with , and values of randomly chosend in the range
(uniform distribution), which yields. We summarize the details for generating the dataset in Table 1. Figure 7 provides an overview of the dataset used to the training of the ANN, specifically the features and the target . In this case, the characteristic length is fixed and set equal to .
|# Data||# Training||# Validation|
We train a fully-connected feed-forward ANN on the generated dataset by using the open source library Keras [keras] built on top of TensorFlow [tensorflow]. We divide the dataset into two parts: a training dataset that takes of the examples to be used for the ANN training and a validation dataset that takes the remaining
. We choose the loss function as the mean squared error that measures, for each training feature, the squared mismatch between the prediction of the ANNand the actual target . Specifically, the loss function is defined as
We normalize the features by subtracting their mean and dividing by their standard deviation in order to help the weights to better adapt to the different scales of the features. Moreover, the targets and the featureneed special care as built on a logarithmic scale; specifically, the features are normalized by applying base logarithm. The normalized features and targets read:
where , , are the mean of the training features , and the logarithm of respectively, while , and are their standard deviations.
In order to find the best performing ANN architecture, we carried out a study by testing different numbers of hidden layers, nodes per layer, optimization algorithm, its learning rate, and batch size. Finally, we choose an ANN with hidden layers,
nodes per layer and an output layer with a single node, all using a rectified linear unit (ReLU
) activation function. We display the ANN architecture in Figure8. We trained the ANN with the SGD optimization algorithm, a constant learning rate of , and mini-batch size of examples. Moreover, we employed a momentum of in the optimization algorithm to update the weights. The trained ANN is available in the GitLab repository [ann_repo].
5.2 Predictions of the stabilization parameter by ANN
Now, we compare the predictions of the ANN with the theoretical stabilization parameter of Eq. (8) [Quarteroni_2017, galeao2004finite]. We show in Figure 9 (left) the stabilization parameter predicted by the ANN by varying mesh size and global Péclet number . For comparison, we report in Figure 9(right) the corresponding theoretical stabilization parameter . We observe that the overall behavior of the trained ANN’s predictions are qualitatively similar of the theoretical stabilization parameter . Nevertheless, it can be inferred that the values of the stabilization parameters are almost completely matched with linear FE (), while they quantitatively differ for and . This was expected as for is an empirical extension of the formula for the case .
The ANN allows to make predictions with features outside the range of values for which it has been trained, that is for unseen values of such features. With this aim, we report in Figure 10 the comparison of the ANN’s predictions with for the FE degree . This comparison shows a clear difference between the theoretical and ANN’s for even though the general trend is maintained similar.
5.2.1 Predictions for the problem used in the ANN’s training
We compare the numerical solutions and obtained by means of the SUPG stabilization method with the parameter predicted by the ANN and the theoretical one , respectively; the comparison also involves the exact solution (13) of the 2D advection-diffusion problem used for the training of the ANN. In particular, we display the comparison of the former 2D solutions in Figure 11 along the line for any with: (a) , , and (Figure 10(a)); (b) , , and and (Figure 10(b)). We notice that in both the cases, the numerical solution involving the stabilization parameter provides more accurate results than with the theoretical stabilization paramter . In particular, in the case (a), involves a much smoother boundary layer and overshoots the exact solutions , conversely to the nearly nodally exact numerical solution . In the case (b), provides a much better representation of the bundary layer, without the undershooting of the solution exhibited by . Furthermore, we report in Table 2 absolute errors between the numerical solutions ( and ) and the exact one . The errors, computed in and norms, show that the ANN-based SUPG stabilization method very often produces more accurate results than the ones obtained with theoretical stabilization parameter , especially for .
5.2.2 Predictions for an unseen problem
We check the model generalization of the ANN trained for the 2D advection-diffusion problem with exact solution of Eq. 13 by predicting for an unseen 2D advection-diffusion problem. In particular, we consider the advection-diffusion problem of Eq. (1) in , with and . We prescribe the following exact solution on the whole boundary :
We compare in Figure 12 the numerical solutions and with the novel exact solution . As for the previous numerical tests, the stabilization parameter predicted by the network provides more accurate numerical solutions than with the theoretical stabilization parameter . In particular, the boundary layers and overall solution behaviours are better represented.
We better investigate assess the former qualitative consideration, by computing the error associated with the numerical solutions with SUPG stabilization for the parameters and ; different values of and are considered. In particular, Figure 13 compares against (in logarithmic scale) for different values of : the error obtained by using is always lower than the one achieved with . These results indicate that , although trained on a specific advection-diffusion problem, can be used to make predictions of the optimal for an unseen advection-diffusion problem in place of the theoretical stabilization parameter.
In this work, we presented an approach based on Machine Learning and ANN to compute the optimal stabilization parameter to be used in the SUPG FE approximation of advection-diffusion problems. Indeed, albeit the expression of the stabilization parameter is available for 1D problems and FE of degree , its extension to more general advection-diffusion problems (2D and 3D) and FE degrees is still lacking.
We validated our approach against the 1D case, for which the ANN-stabilization parameter matches the already optimal, theoretical one for , leading to nodally exact numerical solutions Instead, for higher polynomials degrees , remarkable differences are observed among the theoretical and optimal stabilization parameter: we observed better accuracy and stabilization properties of the numerical solution with the optimized stabilization parameter with respect to the theoretical one.
Then, we generated the dataset on a 2D advection-diffusion problem, and we used it to train a fully-connected feed-forward ANN. We applied the predictions of the network on the original advection-diffusion problem for unseen input values and we also apply it to an unseen advection-diffusion problem to check for model generalization. Our numerical results showed that the proposed ANN-based approach provides more accurate numerical solutions than using the theoretical stabilization parameter for the SUPG method.
This work represents a step towards the enhancements of stabilization methods for the FE approximation of advection-dominated differential problems. In particular, this work is limited to few features for the ANN-based stabilization parameter and provides as output a single optimal stabilization parameter meant for the whole mesh FE . Natural extensions of this work will therefore involve local stabilization parameters, i.e. element by element over the mesh , to better account for non-uniform meshes, differential problems with varying coefficients, and capturing the local behaviour of the solution. We envision extending the proposed Machine Learning approach to SUPG and variational multiscale stabilization methods for the FE approximation of the Navier-Stokes equations.
The authors acknowledge Prof. C. Canuto, Politecnico di Torino, for the fruitful discussions and useful suggestions. L.D. and A.Z. are members of the INdAM Research group GNCS. L.D. has been partially funded by the research project PRIN 2020, n.20204LN5N5, by MIUR.