1 Introduction
Turbulent flows are characterized by unsteadiness, chaotic-like flow states and a high degree of nonlinearity. The structures involved exhibit a wide range of spatial and temporal scales, with the ratio of largest to smallest structures scaling with the Reynolds number [19]. Capturing all scales of fluid motion directly requires very fine computational meshes and time steps, which makes the computational effort for engineering-relevant (high-Reynolds-number) problems impossible to accomplish in reasonable time despite rapidly increasing computer performance. To circumvent this problem, closures are used, which make it possible to model the structures that cannot be captured by the coarser numerical meshes. However, this advantage in computation time is paid for with a modeling error, which can be considerable depending on the chosen approach and the underlying flow case.
Recent developments in the field of machine learning (ML), which are largely driven by increased computational power as well as the availability of exceptionally large data sets, make it possible to address this issue, whereby different approaches can be taken. One obvious approach is ML-based improvement of the prediction quality of existing models, also known as ML-augmented turbulence modeling. Here, one possibility is to calibrate the empirically determined constants of the respective models for the underlying use case by means of data-driven ML augmentation
[9, 14, 15, 73, 67, 66, 75, 74]. Pioneering publications in this area are the works of Ling et al. [42] and Jiang et al. [32], who used deep neural networks (DNNs) to determine the model constants of nonlinear algebraic eddy viscosity models and were thus able to significantly improve the prediction of anisotropic turbulence effects. Another way is the correction of existing models with the help of additional source terms, which were successfully used in
[52, 60, 61, 27] for the augmentation of turbulence models and in [71] for the augmentation of transition models.

A completely different approach has been pursued recently, based on generative adversarial networks (GAN) as introduced by Goodfellow [24], which allow a hierarchical identification and abstraction of features in images by means of deep neural networks (DNN). The fact that turbulent flows likewise involve a complex superposition of different structures and scales suggests that these methods are well suited for learning the physical relationships in such flows. In [36, 37] it was shown that GAN are able to generate synthesizations of 2D flow fields after they have been trained on DNS data. The reproductions even fulfilled some statistical constraints of turbulent flows such as Kolmogorov's $-5/3$ law and the small-scale intermittency of turbulence. Using a deep unsupervised learning approach and a combination of a GAN and a recurrent neural network (RNN), Kim & Lee
[34] were able to generate high-resolution turbulent inlet boundary conditions at different Reynolds numbers, which show a statistical similarity to real flow fields.

Another application of GAN is the field of super-resolution reconstruction of turbulent flows. With these methods it is possible to synthetically scale up flow fields which are low-resolution or noisy due to the measurement technique used or, in the case of numerical data, due to limited data storage capacity [20, 21, 43, 11, 70, 68, 62]. These works adopt a supervised learning approach, which means that labeled, paired datasets of low-resolution and high-resolution images must be available. Here, the low-resolution data sets are usually generated by filtering the high-resolution data sets obtained, for example, from direct numerical simulations (DNS). In many practical situations, however, such high-resolution data sets are not available, which limits the range of applicability to a certain extent. A more general and therefore more practical approach is the unsupervised super-resolution reconstruction method. Here, pairwise data sets are no longer necessary, as Kim & Lee could show by successfully using an unsupervised GAN for the generation of boundary conditions for turbulent flow
[34]. Applications of such methods would be, e.g., the augmentation or denoising of experimental data sets (PIV) or the derivation of subgrid-scale models for application in the field of large-eddy simulation (LES).

In our work, we show the possibility of synthesizing turbulence structures of a quality similar to that predicted by LES with GAN trained from scratch and completely unsupervised. Thus, we are able to produce data with the help of the trained generator by only having a noise vector as input. Moreover, we show by investigation of conditional GAN that generators of synthetic turbulent flows can learn to cope with changes of the geometry of the flow path, e.g. caused by a rotating wake. We also show that introducing generative learning to model turbulence finds its justification in the enormous reduction of computational time compared to LES, while maintaining the resolution. Lastly, besides the practical aspects, we prove, using the mathematical concept of ergodicity, that learning to generate states of chaotic systems using GAN is possible.
Outline
The paper is organised as follows. In section 2 we briefly summarize the concept of ergodicity and discuss the mathematical foundations behind GAN along with the learning theory for deterministic ergodic systems. Also, a survey of modern GAN architectures is given. The hierarchy of datasets used for our experiments, ranging from the Lorenz attractor and the flow around a cylinder to a periodic wake impinging on a low-pressure turbine stator blade, is described in section 3. This is followed by section 4, where we give details on the training of our various GAN models. In section 5 we discuss the results of our numerical experiments. Finally, in section 6 we present the conclusion and a short outlook.
2 Methodology
In this work we apply generative adversarial networks (GAN) to generate typical states of a deterministic chaotic dynamic system. This is made mathematically precise via the notion of ergodicity [55].
2.1 Ergodicity
Let $(\mathcal{X}, \mathcal{A}, \mu)$ be a probability space, consisting of a state space $\mathcal{X}$, a collection $\mathcal{A}$ of events/subsets of the state space known as a $\sigma$-algebra, and a probability measure $\mu$ on $\mathcal{A}$ that attributes the probability $\mu(A)$ to the events $A \in \mathcal{A}$. In our context, the state space $\mathcal{X}$ is chosen as the phase space of a dynamic system.

Frequently in this work, we need the concept of an image measure, i.e. the transformation of a measure by a mapping. To this purpose, let $T : \mathcal{X} \to \mathcal{Y}$ be a measurable mapping with respect to the $\sigma$-algebra $\mathcal{A}$ on $\mathcal{X}$ and a second $\sigma$-algebra $\mathcal{B}$ on $\mathcal{Y}$, i.e. for all $B \in \mathcal{B}$ we have $T^{-1}(B) \in \mathcal{A}$. The image probability measure of $\mu$ under $T$, denoted by $T_*\mu$, is then defined by

$$T_*\mu(B) = \mu\left(T^{-1}(B)\right), \quad B \in \mathcal{B}. \tag{1}$$
In the following, without further mention, we assume all mappings to be measurable with respect to suitable $\sigma$-algebras.
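Definition (1) can be checked empirically by sampling. The following minimal sketch (the names and the concrete mapping are ours, not from the paper) pushes uniform samples through $T(u) = u^2$ and compares the empirical image measure of an event $B = [0, b]$ with $\mu(T^{-1}(B)) = \mu([0, \sqrt{b}]) = \sqrt{b}$:

```python
import random

random.seed(0)
N = 200_000
b = 0.25  # event B = [0, b] in the target space

# Sample from the base measure (uniform on [0, 1]) and push forward through T(u) = u^2.
T = lambda u: u * u
samples = [T(random.random()) for _ in range(N)]

# Empirical image measure of B ...
empirical = sum(1 for s in samples if s <= b) / N

# ... agrees with mu(T^{-1}(B)) = mu([0, sqrt(b)]) = sqrt(b).
exact = b ** 0.5
print(empirical, exact)
```

With $2 \cdot 10^5$ samples the empirical frequency matches $\sqrt{b} = 0.5$ up to sampling noise.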
In the case considered here, $\mathcal{X}$ is the state space of a dynamic system. A dynamic system with the given state space consists of a collection of mappings $\varphi_t : \mathcal{X} \to \mathcal{X}$, $t \geq 0$, that fulfill $\varphi_0 = \mathrm{id}_{\mathcal{X}}$ and $\varphi_{t+s} = \varphi_t \circ \varphi_s$, where $s, t \geq 0$. In many cases, like ours, the state $x(t) = \varphi_t(x_0)$ of the dynamic system at time $t$ is obtained as a solution mapping associated with a (discretized) ordinary or partial differential equation starting in the initial state $x_0 \in \mathcal{X}$. E.g., $\mathcal{X} \subseteq \mathbb{R}^3$ for the case of the Lorenz attractor, or $\mathcal{X} \subseteq \mathbb{R}^n$ with a large number $n$ of dimensions of the discretized state space of the fluid field in the case of the numerical simulation of turbulent fluids.

The probability measure $\mu$ is an invariant measure for the dynamic system defined by $\varphi_t$ if all solution mappings are measure preserving with respect to $\mu$, i.e. $(\varphi_t)_*\mu = \mu$ for all $t \geq 0$.
We next turn to the space of physical observables on $\mathcal{X}$ and define it as the space of all square-integrable functions $f : \mathcal{X} \to \mathbb{R}$, i.e.

$$L^2(\mathcal{X}, \mu) = \left\{ f : \mathcal{X} \to \mathbb{R} \;\middle|\; \int_{\mathcal{X}} |f(x)|^2 \, \mathrm{d}\mu(x) < \infty \right\}. \tag{2}$$
We next turn to the notion of ergodicity, which equates the time average of a dynamic system with the ensemble average of its invariant measure. In mathematical notation, ergodicity of the dynamic system $\varphi_t$ with respect to the invariant measure $\mu$ is defined as

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(\varphi_t(x_0)) \, \mathrm{d}t = \int_{\mathcal{X}} f(x) \, \mathrm{d}\mu(x) \quad \text{for all } f \in L^2(\mathcal{X}, \mu). \tag{3}$$
von Neumann [47] and Birkhoff [7] established quite general conditions under which ergodicity holds. See also [16, 3] for extensive treatments of discrete and time-continuous ergodic systems.
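As a minimal numerical illustration of (3) — a toy example of our own, not from the paper — the irrational rotation of the circle is a classic ergodic system with respect to the uniform measure, so the time average of an observable along a single trajectory matches its ensemble average:

```python
import math

# Irrational rotation on the circle: x_{n+1} = (x_n + alpha) mod 1 is ergodic
# w.r.t. the uniform (Lebesgue) measure on [0, 1) when alpha is irrational.
alpha = math.sqrt(2) - 1.0
f = lambda x: math.cos(2.0 * math.pi * x)  # observable

x, N, acc = 0.1, 100_000, 0.0
for _ in range(N):
    acc += f(x)           # accumulate the time average along one trajectory
    x = (x + alpha) % 1.0

time_avg = acc / N
ensemble_avg = 0.0        # integral of cos(2*pi*x) over [0, 1] w.r.t. uniform measure
print(time_avg, ensemble_avg)
```

The time average converges to the ensemble average $0$ regardless of the initial state, which is exactly the property exploited when learning from a single long trajectory.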
In some of our numerical experiments, we do not consider the entire state space $\mathcal{X}$, but reduce the degrees of freedom using a mapping $P : \mathcal{X} \to \mathcal{X}'$ with $\mathcal{X}'$ the reduced state space. Let $\mu' = P_*\mu$ be the projected measure. Assuming the ergodicity of the original dynamics with respect to $\mu$, we see that

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f'(P(\varphi_t(x_0))) \, \mathrm{d}t = \int_{\mathcal{X}'} f'(x') \, \mathrm{d}\mu'(x') \tag{4}$$

whenever $f'$ is square integrable with respect to $\mu'$. This easily follows from the general transformation formula and (3). Hence, ergodicity remains meaningful on the reduced state space $\mathcal{X}'$, even if the dynamics cannot be consistently formulated on $\mathcal{X}'$.
2.2 Mathematical foundations of generative learning for ergodic systems
Generative adversarial networks (GAN) consist of two mappings: a generator $G : \mathcal{Z} \to \mathcal{X}$ and a discriminator $D : \mathcal{X} \to [0, 1]$. Here $\mathcal{Z}$ is a space of latent variables endowed with a probability measure $\lambda$ that is easy to simulate, e.g. uniform or Gaussian noise. The generator transforms the noise measure $\lambda$ to the image measure $G_*\lambda$. The goal of adversarial learning is to learn a mapping $G$ from the feedback of the discriminator $D$, such that $D$ is not able to distinguish synthetic samples from $G_*\lambda$ from real samples from the target measure $\mu$. The discriminator, in turn, is a classifier that is trained to assign real data a high probability of being real and synthetic data a low probability. If $G$ has been so well trained that even the best discriminator cannot distinguish between samples from $G_*\lambda$ and $\mu$, generative learning is successful, see also Fig. 1.

In practice, both the generator and the discriminator are realized by neural networks. The feedback of $D$ to $G$ is transported backwards by backpropagation [58] through the concatenated mapping $D \circ G$ in order to train the weights of the neural network $G$. At the same time, the universal approximation property of (deep) neural networks guarantees that any mappings $G$ and $D$ can be represented with a given precision, provided the architecture of the networks is sufficiently wide and deep, see [25, 5, 56, 64, 31, 76, 65, 33] for qualitative and quantitative results.
The training of GAN is organized as a two-player minimax game between $G$ and $D$. Mathematically, it is described by the min-max optimization problem

$$\min_G \max_D V(D, G) \tag{5}$$

with the loss function $V(D, G)$, also known as the binary cross-entropy [29],

$$V(D, G) = \mathbb{E}_{X \sim \mu}\left[\log D(X)\right] + \mathbb{E}_{Z \sim \lambda}\left[\log\left(1 - D(G(Z))\right)\right]. \tag{6}$$
Here, the expected value is denoted by $\mathbb{E}$, the random variable $X$ with values in $\mathcal{X}$ follows the distribution $\mu$ of the real-world data and the latent random variable $Z$ with values in $\mathcal{Z}$ follows the distribution of the noise measure $\lambda$. As has been observed in [25],

$$\max_D V(D, G) = -\log 4 + 2 \, d_{\mathrm{JS}}(\mu, G_*\lambda) \tag{7}$$

if the maximum is taken over a sufficiently large hypothesis space of discriminators. Here, $d_{\mathrm{JS}}$ stands for an information-theoretic pseudo distance between the invariant measure $\mu$ and the generated measure $G_*\lambda$ known as the Jensen-Shannon divergence

$$d_{\mathrm{JS}}(\mu, \nu) = \frac{1}{2} \, d_{\mathrm{KL}}\!\left(\mu \,\middle\|\, \frac{\mu + \nu}{2}\right) + \frac{1}{2} \, d_{\mathrm{KL}}\!\left(\nu \,\middle\|\, \frac{\mu + \nu}{2}\right) \tag{8}$$

with $d_{\mathrm{KL}}(\mu \| \nu) = \int p \log(p/q) \, \mathrm{d}x$ the Kullback-Leibler pseudo distance between the measures $\mu$ and $\nu$ with continuous probability densities $p$ and $q$, respectively. Note that $d_{\mathrm{JS}}(\mu, G_*\lambda) = 0$ holds if and only if $\mu = G_*\lambda$ holds with probability one, and hence $\max_D V(D, G) = -\log 4$. Consequently, $d_{\mathrm{JS}}$ also measures the distance between $\mu$ and $G_*\lambda$.
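The divergences in (8) can be sketched in a few lines for discrete distributions; the function names below are ours and this is an illustration of the definitions, not the paper's implementation:

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence between two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    # Jensen-Shannon divergence: symmetrised KL against the mixture m = (p + q)/2.
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
print(jsd(p, p))             # 0 for identical distributions
print(jsd(p, q), jsd(q, p))  # symmetric, and bounded by log 2
```

The printed values illustrate the properties used in the text: the divergence vanishes exactly when the two distributions coincide and is bounded above by $\log 2$.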
2.3 Learning theory for deterministic ergodic systems
In this work, we show that it is possible to model turbulent flows with GAN in practice. In this section we outline a proof that generative learning for deterministic ergodic systems converges in the limit of large observation time $T$.
As described in section 2.2, $\mu$ is the unknown invariant measure encoding the statistical properties of the dynamic system $\varphi_t$ with initial state $x_0$. Our goal is to sample from $\mu$, but since it is unknown, we want to learn it from the data given by the observed trajectory $\{\varphi_t(x_0)\}_{0 \le t \le T}$. Thus, in the context of generative learning, a generator $G$ is searched for which $G_*\lambda = \mu$ holds, where $\lambda$ is, e.g., the Lebesgue measure that corresponds to uniform noise.
Let $\mu$ be the invariant measure of the dynamic system acting on the measurable space given by the Borel $\sigma$-algebra on the sample space of state configurations with normalized state components in $[0, 1]$. It is assumed that $\mu$ possesses a continuous probability density in a space of several times differentiable Hölder functions [1]. If this is not the case, one can easily regularize $\mu$ to achieve this. Moreover, we assume the generator also lies in a space of Hölder functions. By the realizability theorem of [5] it follows that there exists a generator $G^*$ such that

$$G^*_*\lambda = \mu. \tag{9}$$
Knowing that $G^*$ is realizable in the hypothesis space
(10) 
for a sufficiently large capacity, our goal is to estimate $G^*$ by a generator $G_T$ based on the data given by the ergodic flow $\{\varphi_t(x_0)\}_{0 \le t \le T}$.
The estimation of $G^*$ is performed using an empirical loss function that is designed to approximate the theoretical loss function (6) and hence to minimize the difference between the measure $\mu$ of the ergodic system and the image measure of the synthesized images. Mathematically, we search for the generator
(11) 
with the discriminator hypothesis space chosen such that an optimal choice of the discriminator is feasible:
(12) 
Here, $p$ stands for the continuous probability density associated with the corresponding probability measure. We propose
(13) 
as the empirical loss function for the ergodic system, where $\lfloor \cdot \rfloor$ denotes the rounding function.
Apparently, in the limit $T \to \infty$, by ergodicity (3) the first term converges to the first term in (6), whereas the second term converges almost surely by the law of large numbers. Therefore, the generator $G_T$ that is learned from the empirical loss function (13) for large $T$ will approximately solve the minimax problem (5), which by (7) relates to the Jensen-Shannon distance between the estimated measure and the invariant measure of the ergodic system. In particular, we obtain the following:

Theorem 1.
Under the assumptions above it holds almost surely (w.r.t. the probability measure used for the sampling of the latent noise variables) that
(14) 
Proof.
Here we give a sketch of the proof. For a detailed argument in a related situation, see [5]. We introduce the following notation: let $D^*$ denote the discriminator solving the theoretical maximization problem and $D_T$ its empirical counterpart, where we suppressed the dependence on the generator to ease the notation. We obtain the estimate
(15) 
In the first equality we used (7). In the third line, the definition of the empirical loss was used and in the fourth line we applied (11). In the sixth line, we used the definition of the optimal discriminator. In the seventh line, we again used (7) and in the final line we applied (9), which is possible under the given assumptions as proven in [5].
It remains for us to show that the sampling error on the right hand side of (15) vanishes as . Note that we can decompose
(16) 
The second term on the right-hand side vanishes by the uniform law of large numbers, as the generator and discriminator hypothesis spaces can be endowed with topologies that are slightly weaker than the Hölder topology. Nevertheless, the hypothesis spaces under these topologies are compact, see [5] for the details. Consequently, this expression vanishes by the standard uniform law of large numbers, see e.g. [17].

For the first term, we have already seen that ergodicity implies that the expressions in the absolute value vanish in the limit $T \to \infty$. Also, with respect to the aforementioned topologies the hypothesis spaces are compact. Last, it is easy to see that the integrand is equicontinuous w.r.t. this topology (as the densities are uniformly bounded away from zero). As, for equicontinuous functions, pointwise convergence implies uniform convergence, the first term on the right-hand side vanishes as well in the limit $T \to \infty$. ∎
We note that in practice, the Hölder generators and discriminators are replaced by deep neural networks. As such networks possess the universal approximation property, see e.g. [72], one can approximate the Hölder functions to arbitrary precision. Secondly, instead of solving the integral in (13) to compute the loss function, one uses a Monte Carlo approximation by sampling from the trajectory. Theorem 1 remains valid under this replacement, as one can see from one further application of the uniform law of large numbers.
Note however that these theoretical results do not guarantee the success of the numerical experiments. This is mostly due to the fact that the optimization problem (11) is highly non-convex and cannot be solved exactly; e.g. for neural nets this problem is NP-hard [59]. In practice, one rather finds sufficiently good local minima instead of a global optimum. Also, practical issues occur with the choice of the capacity and other elements of the architecture of the neural networks.
2.4 Advanced GAN frameworks
After the introduction of the original GAN framework by Goodfellow [24] (figure 1), it became apparent that GAN are powerful models which can be applied to a wide variety of tasks by modifying or extending the architecture [51]. In this work, three of these modified frameworks are investigated.
Wasserstein GAN (WGAN)
The Wasserstein GAN differs from the original GAN mainly in the change of the loss function and thus also in the change of the optimization problem [4]. For the WGAN framework, the goal is not to minimize the Jensen-Shannon divergence but the Wasserstein distance, expressed by the Kantorovich-Rubinstein duality

$$W(\mu, G_*\lambda) = \sup_{\|f\|_{\mathrm{Lip}} \le 1} \mathbb{E}_{X \sim \mu}\left[f(X)\right] - \mathbb{E}_{Z \sim \lambda}\left[f(G(Z))\right] \tag{17}$$
with the supremum over all 1-Lipschitz functions $f$ defined on a compact metric set. Under the satisfaction of certain conditions, the authors of [4] showed that the optimization problem
(18) 
has a solution and that the gradient of (17) exists.
In practice, the solution of (18) can be approximated by training a neural network $f_w$ parameterized by weights $w$ lying in a compact space $\mathcal{W}$. This assumption implies that all parameterized functions $f_w$ are $K$-Lipschitz for some $K$. To ensure that all weights lie in a compact space and thus the Lipschitz constraint is preserved, the weights are clipped [8] to a certain range after each gradient update in the implementation.
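Both ingredients can be sketched in a few lines (an illustration with hypothetical variable names, not the WGAN implementation): the clipping step that enforces the compactness assumption, and, for one-dimensional empirical samples, the closed-form Wasserstein-1 distance that the critic in (17) estimates, namely the mean absolute difference of sorted samples:

```python
import random

random.seed(1)

# Lipschitz constraint via weight clipping: after each gradient update,
# every weight is forced back into the compact range [-c, c].
c = 0.01
weights = [random.gauss(0.0, 0.1) for _ in range(1000)]
clipped = [max(-c, min(c, w)) for w in weights]

def w1(xs, ys):
    # Wasserstein-1 distance between two equal-size 1D empirical distributions:
    # mean absolute difference of the sorted samples (optimal transport in 1D).
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

xs = [random.gauss(0.0, 1.0) for _ in range(20_000)]
ys = [x + 0.5 for x in xs]           # same distribution shifted by 0.5
print(max(abs(w) for w in clipped))  # bounded by c
print(w1(xs, ys))                    # the shift 0.5 is recovered exactly
```

Shifting a distribution by a constant moves it by exactly that constant in Wasserstein distance, whereas the Jensen-Shannon divergence would saturate for non-overlapping supports; this is the motivation for the WGAN loss.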
Deep Convolutional GAN (DCGAN)
The deep convolutional GAN has the same base architecture as shown in figure 1, but the generator $G$ and the discriminator $D$ are convolutional neural networks (CNNs) [56]. These kinds of neural networks are successfully applicable especially in the field of image processing [2, 35]. In order to be able to integrate CNNs into GAN, the authors of [56] pointed out which guidelines are to be followed to enable stable training at higher resolution and with deeper architectures. The stability of the training is ensured by applying batch normalization [30] on the output layer of $G$ and the input layer of $D$. To work with deeper architectures, fully-connected layers [45] should be avoided on top of convolutional features. Furthermore, the choice of the leaky rectified linear unit (LReLU) activation function [49] for the discriminator allows higher-resolution modeling, while the generator captures the color space of the distribution faster by applying a bounded activation function in the last layer. Finally, it is worth mentioning that $G$ and $D$ are able to learn their own spatial up- or downsampling by replacing deterministic spatial pooling layers with strided convolutions [13].

Conditional GAN (cGAN)
By conditioning a GAN framework with additional information $y$, it is possible to take control over the data production process performed by the generator [46]. Thereby, the additional information can be represented, for example, by class labels or semantic segmentation masks [23]. As shown in figure 2, the conditioning can be realized by feeding the supplementary information to the discriminator and the generator as an extra input channel. During training, $y$ is sampled from a data model $p(y)$, where $p(y)$ gives the distribution of $y$ in the data generation process. This extension of the architecture leads to the modified loss function
$$V(D, G) = \mathbb{E}_{X \sim \mu}\left[\log D(X \mid y)\right] + \mathbb{E}_{Z \sim \lambda}\left[\log\left(1 - D(G(Z \mid y))\right)\right]. \tag{19}$$
A special form of the cGAN investigated in this work is the so-called pix2pixHD introduced by [64]. This conditional adversarial framework allows generating high-resolution photorealistic images from semantic segmentation masks. The pix2pixHD framework is based on its former version pix2pix [31], whose optimization problem is given as
$$\min_G \max_D V(D, G) \tag{20}$$
with $V(D, G)$ defined as in (19). To improve the photorealism and the resolution of the generated images, the architecture was changed by introducing three innovations.
First, a coarse-to-fine generator was implemented. For this, the generator was decomposed into two subnetworks having the roles of a global generator and a local enhancer. In this way, global and local information can be aggregated effectively within the generator for the image synthesis task.
In order for the discriminator to distinguish between generated and real high-resolution images, it needs a large receptive field. Therefore, the common discriminator was replaced by three multi-scale discriminators $D_1$, $D_2$ and $D_3$, which have an identical network architecture but operate at three different image scales. Hence, the optimization problem (20) is extended to

$$\min_G \max_{D_1, D_2, D_3} \sum_{k=1}^{3} V(D_k, G). \tag{21}$$
In particular, a pyramid of images is created during the training by downsampling the input image by factors of two and four. Since the discriminator operating on the coarsest scale has the largest receptive field and hence a more global view, it is possible to guide the generator to produce globally consistent images, whereas the discriminator operating on the finest scale makes the generator pay attention to finer details during data production.
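The image pyramid fed to the three discriminators can be sketched with a simple 2x2 average-pooling step (an illustration under our own naming, not the pix2pixHD code):

```python
def downsample(img):
    # 2x2 average pooling: halves each spatial dimension (factor-two downsampling).
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

# Three-scale pyramid for discriminators D1, D2, D3:
# full resolution, downsampled by two, and downsampled by four.
full = [[float(i * 8 + j) for j in range(8)] for i in range(8)]
pyramid = [full, downsample(full), downsample(downsample(full))]

print([(len(p), len(p[0])) for p in pyramid])  # [(8, 8), (4, 4), (2, 2)]
```

The coarsest level sees the whole image in few pixels (global consistency), while the finest level preserves local detail, mirroring the division of labour described above.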
3 Preparation of datasets
The datasets used for generative learning are described below. We proceed from the Lorenz attractor as a simple chaotic system to LES simulations of simple and complex turbulent flows.
3.1 Lorenz attractor
The Lorenz attractor is a non-periodic, nonlinear and deterministic ergodic system which is given by the system of ordinary differential equations [44]:

$$\dot{X} = \sigma (Y - X), \qquad \dot{Y} = X(\rho - Z) - Y, \qquad \dot{Z} = XY - \beta Z. \tag{23}$$
In [63] it has been proven that this dynamic system represents a strange attractor. Within this hydrodynamic system, $X$ describes the rate of convection, $Y$ is proportional to the temperature variation between ascending and descending flow, and $Z$ represents the distortion rate of the vertical temperature profile from linearity [44].
The physical parameters are given by $\sigma$ as the Prandtl number, $\rho$ as the relative Rayleigh number and $\beta$ representing a measure for the cell geometry. In this work we use the classic parameter values $\sigma = 10$, $\rho = 28$ and $\beta = 8/3$ [40].
The training data for the generative learning is given by the points of the attractor's trajectory within the three-dimensional space, computed by integrating the system (23) with the odeint routine of the Python package scipy.integrate, which uses the lsoda algorithm [10]. In total, the data points stem from trajectories started from different initial points randomly sampled within fixed coordinate ranges.
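The data generation can be sketched as follows. The paper uses scipy's odeint (lsoda); for a dependency-free sketch we substitute a fixed-step RK4 integrator with the classic parameter values, so the routine and step size below are our own choices:

```python
# Classic Lorenz parameters (sigma = 10, rho = 28, beta = 8/3) and a simple
# fixed-step RK4 integrator standing in for scipy.integrate.odeint (lsoda).
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def rhs(s):
    x, y, z = s
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)

def rk4_step(s, dt):
    k1 = rhs(s)
    k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
    k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
    k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

state, dt = (1.0, 1.0, 1.0), 0.01
trajectory = []
for _ in range(5000):
    state = rk4_step(state, dt)
    trajectory.append(state)  # 3D training points along the attractor

# The trajectory remains on the bounded strange attractor.
print(max(abs(s[0]) for s in trajectory))
```

Repeating this from several random initial states, as described above, yields the pooled set of 3D training points.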
3.2 LES
The computational fluid dynamics (CFD) results presented in this paper form the basis for GAN training. They were generated using large-eddy simulations (LES). In this approach, the spatially filtered variant of the Navier-Stokes equations is solved, with the computational grid designed to provide a resolution of at least 80% of the turbulent kinetic energy (TKE) of the flow. The effect of smaller turbulent structures, which are not captured by the grid, is represented using semi-empirical models, the so-called subgrid-scale models [28]. The spatial filter is thus implicitly given by the computational grid. The LES approach is reasonable because it is the large vortex structures that transport the bulk of the energy [18], while the smaller structures can be considered to be mainly isotropic and homogeneous (except in the close vicinity of solid walls) by the assumption of local isotropy according to Kolmogorov [39], which simplifies their modeling considerably.
3.3 Testcases & numerical setup
Two different test cases were chosen for the training of GAN, which differ in the complexity of the resulting flow field. Both simulations were performed with the commercial flow solver ANSYS Fluent, which was set up to solve the incompressible variant of the spatially filtered Navier-Stokes equations. For time integration, a non-iterative time advancement scheme is used in combination with a fractional step method for pressure-velocity coupling. The advective fluxes are treated by a bounded central scheme in order to introduce as little numerical dissipation as possible and avoid unphysical damping of small turbulent structures [69].
3.3.1 Flow around a cylinder
The first test case is the flow around a cylinder at Reynolds number 3900. This is a widely used test case, which has been studied in great detail in the literature both experimentally [53, 48, 50] and numerically [53, 6, 22]. The flow field in this case is characterized by a Kármán vortex street, which forms in the wake region of the cylinder and consists of the typical coherent vortex system, where the axis of rotation of the individual vortices is parallel to the axis of the cylinder. A schematic representation of the numerical domain is shown in Fig. 2(a). The computational grid consists of a total of 15 million cells. The time step was chosen so that the CFL number was on the order of unity, and the simulation was run for a total of time steps after initial transient effects had disappeared, which corresponds to a total physical time period of approximately seconds.
3.3.2 T106 turbine stator under periodic wake impact
The second test case is an academic lowpressure turbine (LPT) stator under periodic wake impact. In this configuration, the wakes, which are comparable to those of the cylinder test case described above, are artificially generated by means of an upstream mounted rotating bar grid. The wakes are convected into the stator passages where deformation occurs as a consequence of the flow turning within the passage. Furthermore, a complex interaction between the wakes and the periodically detaching boundary layer takes place in the rear region of the suction side of the LPT stator, which in total makes this test case an interesting demonstrator for complex turbulent interaction phenomena. A schematic representation of the numerical domain is shown in Fig. 2(b). The computational grid consists of a total of approx. 72 million elements. The time step was chosen so that the CFL number was on the order of unity, and the simulation was run for a total of time steps after initial transient effects had disappeared, which corresponds to 10 bar passing periods or approx. .
3.4 Data sets and data production
The data sets used for training the GAN were generated by post-processing the transient LES velocity field data. In this process, grayscale images are generated via a projection mapping in the sense of (4). In the case of the flow around a cylinder, the gray scale shows the distribution of the absolute deviation of the local fluctuating velocity magnitude at a given location $\mathbf{x}$ from its time average
$$u'(\mathbf{x}, t) = \left|\, |\mathbf{u}(\mathbf{x}, t)| - \overline{|\mathbf{u}|}(\mathbf{x}) \,\right|. \tag{24}$$
As the moving wake determines the turbulent flow field in the case of the LPT turbine, time averaging at a fixed point does not make much sense in this case. Therefore, a different representation of the turbulence (or projection mapping) is chosen, which simply depicts the velocity component perpendicular to the image plane. Figure 4 shows an example image for each of the two test cases examined. The gray scale is indicated in the upper left corner of the right panel. Negative values are shown in lighter and positive values in darker grey.
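The cylinder-case projection (24) can be sketched on a toy velocity record; the array contents and helper names here are hypothetical, not taken from the LES data:

```python
# Hypothetical tiny dataset: velocity magnitude |u|(x, t) at two grid points
# over four snapshots (in practice these come from the transient LES fields).
u_mag = [
    [1.0, 2.0, 3.0, 2.0],  # time series at grid point 0
    [0.5, 0.5, 1.5, 1.5],  # time series at grid point 1
]

def fluctuation_images(series):
    # Projection (24): absolute deviation of |u| from its local time average.
    means = [sum(ts) / len(ts) for ts in series]
    return [[abs(v - m) for v in ts] for ts, m in zip(series, means)]

def to_grayscale(frame, vmax):
    # Map fluctuation values to 8-bit gray levels for the training images.
    return [min(255, round(255 * v / vmax)) for v in frame]

fluct = fluctuation_images(u_mag)
frame0 = [fluct[p][0] for p in range(len(u_mag))]  # first snapshot
print(fluct[0], to_grayscale(frame0, vmax=1.0))
```

Each snapshot of the fluctuation field is then rasterized to one grayscale training image; for the turbine case, the same pipeline would simply rasterize the out-of-plane velocity component instead.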
Basic parameters of the generated data sets are summarized in table 1. The time step interval between two successive frames is chosen so that the respective snapshots are sufficiently far apart in time to minimize the correlation between the individual frames.
Sampling frequency  Image resolution  Number of files  Total size  
Cylinder  1000 x 600 px  
Turbine  1000 x 625 px 
3.5 Computational cost
At this point, the computational effort of the simulations presented in this paper should be briefly discussed, as this is the main criterion for the applicability of such scale-resolving simulations.
All simulations presented were performed on the inhouse HighPerformance Computing (HPC) cluster of the Chair of Thermal Turbomachines and Aero Engines, whose main specifications are summarized in table 2.
Partition  Number of nodes  Cores per node  CPU type  RAM  Interconnect 
#1  28  28  Intel Xeon "Skylake" Gold 6132 @2.6 GHz  96 GB  Intel Omnipath 
#2  8  40  Intel Xeon Scalable Gold 6248 @2.5 GHz  96 GB  Intel Omnipath 
TOTAL  36  1104  3.4 TB 
In total, 20 computational nodes of the #1 partition of the HPC cluster were allocated in both runs, resulting in a total number of 560 CPU cores. In the case of the flow around a cylinder, this resulted in a total computation time of about one day for the output run, which corresponds to about 72 core-weeks. In the case of the T106 LPT stator, the calculation time was approx. 8 days for the output run, which corresponds to 10 bar passings, i.e. approx. 640 core-weeks.
4 Setup and configuration of GAN training
The implementation details of the training with the GAN frameworks introduced in section 2.4 are summarized for the different datasets in the following. All GAN were set up and trained using PyTorch [54].
4.1 Lorenz attractor
For the Lorenz attractor, an original GAN was trained with a discriminator consisting of four fully connected hidden layers [26]. Since the attractor is a deterministic ergodic system [44], Gaussian noise was added to the network of the discriminator as well as to the real input data to regularize the training and hence reduce overfitting [5, 12]. The real data representing the training data is given by the points of the attractor's trajectory within the three-dimensional space as described in section 3.1.
The generator is also given by a fully connected neural network composed of three hidden layers. Its input is a random vector whose elements are drawn from the standard normal distribution.
Both neural networks $G$ and $D$ apply the ReLU activation function in the input and hidden layers. The activation of the output layer of the discriminator is given by a sigmoid function and that of the generator by a linear function.
The GAN framework was trained for epochs with a batch size of . Hence, the trajectory consisting of data points was regarded during one epoch whereby the trajectory started from different randomly sampled initial points lying in the ranges , and .
4.2 Flow around a cylinder
Experiments have been performed on this dataset using the original GAN, WGAN and DCGAN frameworks. For the original GAN and WGAN, the discriminator is given by a fully connected neural network with five layers in total. The generator of both GAN frameworks also consists of five fully connected layers in total. With the exception of the output layers, the Leaky ReLU is applied as activation function. The last layer of the generator is activated by the hyperbolic tangent function. For the original GAN, the discriminator's last layer is activated by the sigmoid function, whereas the linear activation function is used in the case of the WGAN. For the training of the DCGAN, the architecture suggested by [41] was adopted.
The three investigated GAN frameworks take images of a fixed size as input. We trained all GAN for the same number of epochs and with the same batch size using images of the dataset. For further investigations, the DCGAN training was continued for additional epochs. The input vector of the generator consists of elements randomly sampled from the standard Gaussian distribution.
For the update of the weights, the Adam optimizer is applied in the case of the original GAN and DCGAN. For the WGAN, the weight update is realized by the RMSProp optimizer [57], whereby the weights are clipped to a fixed range after each update.

4.3 T106 turbine stator under periodic wake impact
The DCGAN has also been trained on the whole dataset of the wake-disturbed turbine stator row with the parameter settings described in section 4.2.
Moreover, the pix2pixHD has been trained as a second GAN framework on this dataset. As described in section 2.4, the pix2pixHD is a conditional GAN and hence incorporates additional information into the training. Here, this supplementary information is given by the binary segmentation masks shown in figure 5. In terms of conditional GAN learning (19), this corresponds to a uniform distribution over the coordinate of the wake. For the experiments with the pix2pixHD, the implementation of [64] has been used with small changes. To avoid the appearance of artifacts in the synthesized data, we replaced the reflection padding with a replication padding and added a replication padding to the global generator before the convolution during the downsampling procedure.
In contrast to the DCGAN framework, it is possible to train the pix2pixHD on images of size . The only requirement is that and are divisible by . For this reason, the images were resized for the training to size , such that the aspect ratio was preserved.
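The divisibility requirement can be enforced by rounding each target dimension down to the nearest admissible multiple. A small helper with the divisor left as a parameter, since its concrete value is not restated here; the caller is expected to have scaled height and width by a common factor first so the aspect ratio is preserved:

```python
def resize_to_multiple(height, width, multiple):
    # Round each target dimension down to the nearest multiple of
    # `multiple`, so the resized image is admissible for the network.
    return height - height % multiple, width - width % multiple
```

For example, with a hypothetical divisor of 32, a 540 x 960 target becomes 512 x 960.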
Since the GAN is trained in a conditional fashion, the binary masks are also needed during inference. For this reason, the dataset was split into a training set and a test set. The training set contains the first images of the whole dataset and the test set consists of the remaining images.
The pix2pixHD has been trained for epochs with a batch size of . Analogous to the DCGAN, the weights were updated by the Adam optimizer with the parameters , and a learning rate of .
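A single Adam update [38] of the kind used for these weight updates can be sketched in NumPy. The values beta1 = 0.5 and lr = 2e-4 below are common GAN training settings and are assumptions here, since the concrete parameters are elided in the text:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=2e-4, beta1=0.5, beta2=0.999, eps=1e-8):
    # One Adam update for parameters w given gradient `grad`; m and v are
    # the running first and second moment estimates, t the step counter.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)          # bias-corrected first moment
    v_hat = v / (1 - beta2**t)          # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

On the very first step the bias correction makes the update approximately lr times the sign of the gradient, independent of the gradient's magnitude.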
5 Results of experiments
The results of the numerical experiments are presented and discussed in this section. In the following, we refer to the process of applying a trained generator to the latent random vector as inference. At inference time, the latent vector also consists of elements sampled from the standard normal distribution.
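Inference thus reduces to drawing one standard-normal latent vector per image and passing it through the trained generator. A sketch of the sampling step, with a hypothetical latent dimension since the concrete size is elided above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latents(n_images, latent_dim=100):
    # One independent standard-normal latent vector per requested image;
    # latent_dim = 100 is a placeholder, not the value used in the paper.
    return rng.standard_normal((n_images, latent_dim))

z = sample_latents(16)  # batch of 16 latent vectors
```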
5.1 Lorenz attractor
As described in section 4.1, we trained an original GAN for epochs in order to synthesize three-dimensional data points which come from a trajectory of the Lorenz attractor that has converged towards the strange attractor. For consistency, a trajectory of real data points is considered at inference time as in the training. To get a better overview of the results, data points produced by the trained generator are shown in figure 6. It can be observed that the generated data points lie on or close to the true trajectory of the Lorenz attractor. For the points that do not seem to lie directly on the trajectory, it has to be taken into account that the trajectory shown here is not very dense either, due to the small number of data points. Considering randomly sampled real data points of a trajectory consisting of one million data points, it must be noted that their distribution is similar to that of the synthesized data points. Moreover, it can be seen from the rotated figure 7 that, apart from a few outliers, the generated data points are all located in the area of the trajectory in three-dimensional space.
5.2 Flow around a cylinder
In order to generate the Kármán vortex street, GAN frameworks with a simpler architecture have been considered first, namely the original GAN and the WGAN. As can be observed in figure 8, the trained generators of both GANs are able to position the cylinder in the right place after epochs, and they attempt to synthesize the wake vortices. However, neither the original GAN nor the WGAN can capture the concrete structure of the vortex street. In addition, it can be observed that the color space has not been learned appropriately by the original GAN, such that the generated images are significantly darker than the original images from the LES (see figure 3(a)). To address these issues, another GAN framework has been considered whose generator and discriminator are represented by convolutional neural networks. As already described in section 2.4, CNNs can be used particularly successfully in image processing. In our experiments, we also found that the DCGAN was able to capture the flow structures after epochs, in contrast to the original GAN and the WGAN (see figure 8). To increase the quality of the synthesized images, the DCGAN has been trained further up to epoch (see A for the training progress). Based on figure 9, it can be seen that the images produced by the generator of the DCGAN hardly differ from the real images from the LES after epochs of training.
Finally, it should be mentioned that the networks have been trained on images of size . It has been observed in our experiments that the quality of the generated images was significantly better with increasing image resolution at inference time. Therefore, we present here the results for the training with images of size .
5.3 T106 turbine stator under periodic wake impact
Since we obtained impressive results from the DCGAN for the flow around a cylinder, we trained this GAN framework under the same parameter settings for the second test case. As we observe in figure 10, the LPT stator has been correctly positioned and the structure of the vortex flows has also been reasonably captured. However, at inference time, the generator has massive problems correctly capturing the position of the cylinder as it periodically slides from bottom to top over time. Especially in the direct comparison in figure 11 we can observe that the structures in the background are not properly captured and the synthesized images are significantly darker than the real images of the LES. To address these problems of the DCGAN, we considered the pix2pixHD as another GAN framework. In order to have control over the position of the cylinder at inference time, we feed the binary segmentation masks shown in figure 5 as additional information to the GAN framework during training and at inference time (see section 2.4). These masks carry the information about the position of the cylinder and the LPT stator. Moreover, the pix2pixHD framework allows us to generate high-resolution images, such that the structures in the background of the images should also be preserved.
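The conditioning masks can be represented as single-channel binary images that are 1 inside solid objects and 0 elsewhere. A simplified sketch using axis-aligned boxes; the actual masks in figure 5 trace the true cylinder and stator contours, so the box representation is purely illustrative:

```python
import numpy as np

def make_mask(height, width, obstacles):
    # Binary segmentation mask for conditional training: 1 inside solid
    # objects (cylinder / stator), 0 elsewhere. `obstacles` is a list of
    # (row0, row1, col0, col1) boxes, a simplification for illustration.
    mask = np.zeros((height, width), dtype=np.uint8)
    for r0, r1, c0, c1 in obstacles:
        mask[r0:r1, c0:c1] = 1
    return mask
```

Shifting the cylinder box between frames mimics the periodically sliding wake generator that the mask encodes for the generator at inference time.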
As shown in figure 12, using the generator from the pix2pixHD we were able to generate images which, again, can hardly be distinguished qualitatively from the real images from the LES on a visual level after only epochs (see A for the training progress). It is also noticeable that the wake vortices do not look identical. Hence, the generator did not simply memorize the structure of the wake vortices at the respective positions, and thus variation is given in the synthesized data.
5.4 Comparison of Computational Costs
Finally, the computational costs of the training and inference of the successful GAN frameworks, performed on a GPU of type Quadro RTX 8000 with GB of memory, are reported in this section.
The training of the DCGAN with images of the dataset showing the flow around a cylinder took minutes per epoch. The computational time of pure inference is seconds per image. Thus, the production of a dataset containing images with the previously trained generator would take about minutes. This amounts to a tremendous saving of time compared to the one day needed for the generation of the images by the LES.
Since the pix2pixHD has a much more complex architecture than the DCGAN, the training of one epoch with images took minutes. However, the computational time of pure inference is likewise only seconds per image. Hence, the production of images of the LPT stator under periodic wake impact would take about at inference time. Thus, the computational time saved for the data production is very significant in comparison to the days needed by the LES.
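The reported savings boil down to the ratio of LES wall time to accumulated inference time. A sketch with placeholder numbers, since the concrete timings are elided above:

```python
def gan_speedup(seconds_per_image, n_images, les_hours):
    # Ratio of LES wall time to pure GAN inference time for the same
    # number of snapshots; all arguments are illustrative placeholders.
    gan_hours = seconds_per_image * n_images / 3600.0
    return les_hours / gan_hours

# E.g. 0.02 s/image for 1000 images vs. a hypothetical 24 h of LES
# corresponds to a speedup factor of 4320.
factor = gan_speedup(0.02, 1000, 24.0)
```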
6 Conclusion and Outlook
We introduced generative adversarial networks as another way to model turbulence. In doing so, we showed that through generative learning it is possible to synthesize turbulence that matches the quality of LES images on a visual level while dramatically reducing the computational time. Unlike previous work, we trained the GAN from scratch and only require a randomly sampled noise vector for the data production in the unconditional case. For training and inference of a conditional GAN, we also need binary segmentation masks, which can be created manually and do not necessarily need to be obtained from simulations. Using a conditional GAN, we have found a solution for generating visually high-quality turbulence when solid objects, such as the rotating wake, change position in space. Thus, we have provided a first approach to generalization with respect to spatial changes. Finally, we have demonstrated that generative learning of ergodic systems also works at the theoretical level.
So far, we have evaluated our experiments at the visual level. In our future research, numerical experiments are of interest in order to also quantitatively measure the similarity between the LES images and the images generated by the GAN. Attention will also be given to exploring appropriate methods for making these comparisons. In addition, we have so far ignored the physics involved. Therefore, the next step is to feed the GAN with physical parameters so that turbulent flows can also be captured by the GAN in a physically correct manner. Having provided a first approach to generalization in terms of spatial changes, in future work we will also consider how generalization can be realized in terms of geometries and further boundary conditions.
Acknowledgments
C.D. and H.G. thank Matthias Rottmann and Hayk Asatryan for discussion and useful advice. The authors also thank Pascal Post for valuable hints for the literature research.
Appendix A Training history of the GAN frameworks
The training progress of the experiments with the DCGAN discussed in section 5.2 is depicted in figure 13. Since we trained the GAN framework on images of size , we also obtained images of this size as output during the training. It can be observed that the synthesized images already show quite good quality after epochs. However, on closer inspection, it is noticeable that the structures of the vortex street become finer with an increasing number of training epochs and that the color space is also captured much better after . In figure 14 the development of the synthesized images during the training is illustrated for the pix2pixHD, whose results are discussed in section 5.3. Similar to the DCGAN, we can observe that the results improve significantly with an increasing number of training epochs.
References
 [1] (2003) Sobolev spaces. Elsevier. Cited by: §2.3.
 [2] (2017) Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET), pp. 1–6. External Links: Document Cited by: §2.4.
 [3] (2017) Ergodic theory, dynamic mode decomposition, and computation of spectral properties of the Koopman operator. SIAM J. Appl. Dyn. Syst. 16, pp. 2096–2126. Cited by: §2.1.
 [4] (2017) Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, D. Precup and Y. W. Teh (Eds.), Proceedings of Machine Learning Research, Vol. 70, pp. 214–223. External Links: Link Cited by: §2.4.
 [5] (2020) A convenient infinite dimensional framework for generative adversarial learning. arXiv preprint arXiv:2011.12087. Cited by: §2.2, §2.3, §2.3, §2.3, §4.1.
 [6] (1995) Numerical experiments on the flow past a circular cylinder at subcritical Reynolds number. Ph.D. Thesis, Stanford University. Cited by: §3.3.1.
 [7] (1931) Proof of the ergodic theorem. Proceedings of the National Academy of Sciences 17 (12), pp. 656–660. External Links: Document, ISSN 00278424, Link Cited by: §2.1.

 [8] (2020) Understanding gradient clipping in private SGD: a geometric perspective. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, pp. 13773–13782. External Links: Link Cited by: §2.4.
 [9] (2011) Bayesian uncertainty analysis with applications to turbulence modeling. Reliability Engineering & System Safety 96 (9), pp. 1137–1149. Note: Quantification of Margins and Uncertainties External Links: ISSN 09518320, Document, Link Cited by: §1.
 [10] (2008–2021) SciPy documentation. Note: Accessed: 04.12.2021 External Links: Link Cited by: §3.1.

 [11] (2019) Super-resolution reconstruction of turbulent velocity fields using a generative adversarial network-based artificial intelligence framework. Physics of Fluids 31 (12), pp. 125111. External Links: Document, Link Cited by: §1.
 [12] (1995) Overfitting and undercomputing in machine learning. ACM Computing Surveys (CSUR) 27 (3), pp. 326–327. Cited by: §4.1.

 [13] (2018) A guide to convolution arithmetic for deep learning. External Links: 1603.07285 Cited by: §2.4.
 [14] (2014) Predictive RANS simulations via Bayesian model-scenario averaging. Journal of Computational Physics 275, pp. 65–91. External Links: ISSN 00219991, Document, Link Cited by: §1.
 [15] (2014) Bayesian estimates of parameter variability in the turbulence model. Journal of Computational Physics 258, pp. 73–94. External Links: Document Cited by: §1.
 [16] (2015) Operator theoretic aspects of ergodic theory. Springer International Publishing, Cham. External Links: Link Cited by: §2.1.
 [17] (2017) A course in large sample theory. Routledge. Cited by: §2.3.
 [18] (2008) Computational methods for fluid dynamics. Springer, Berlin. External Links: Document Cited by: §3.2.
 [19] (1995) Turbulence: the legacy of A. N. Kolmogorov. Cambridge University Press. Cited by: §1.
 [20] (2019) Super-resolution reconstruction of turbulent flows with machine learning. Journal of Fluid Mechanics 870, pp. 106–120. External Links: ISSN 14697645, Link, Document Cited by: §1.
 [21] (2020) Machine learning based spatio-temporal super resolution reconstruction of turbulent flows. External Links: 2004.11566 Cited by: §1.
 [22] (2000) Numerical studies of flow over a circular cylinder at Re_D = 3900. Physics of Fluids 12 (2), pp. 403–417. External Links: Document Cited by: §3.3.1.
 [23] (2017) A review on deep learning techniques applied to semantic segmentation. External Links: 1704.06857 Cited by: §2.4.
 [24] (2014) Generative adversarial networks. External Links: 1406.2661 Cited by: §1.
 [25] (2014) Generative adversarial networks. Advances in Neural Information Processing Systems 3, pp. . External Links: Document Cited by: Figure 1, §2.2, §2.2.
 [26] (1997) Neural network design. PWS Publishing Co. Cited by: §4.1.
 [27] (2018) A data assimilation model for turbulent flows using continuous adjoint formulation. Physics of Fluids 30, pp. 105108. External Links: Document Cited by: §1.
 [28] (2007) Numerical computation of internal and external flows: the fundamentals of computational fluid dynamics. Butterworth-Heinemann. External Links: Document Cited by: §3.2.
 [29] (2020) The realworldweight crossentropy loss function: modeling the costs of mislabeling. IEEE Access 8, pp. 4806–4813. External Links: Document Cited by: §2.2.
 [30] (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, F. Bach and D. Blei (Eds.), Proceedings of Machine Learning Research, Vol. 37, Lille, France, pp. 448–456. External Links: Link Cited by: §2.4.
 [31] (2017) Image-to-image translation with conditional adversarial networks. pp. 5967–5976. External Links: Document Cited by: §2.2, §2.4.
 [32] (2020) A novel algebraic stress model with machine-learning-assisted parameterization. Energies 13, pp. 258. External Links: Document Cited by: §1.

 [33] (2019) A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410. Cited by: §2.2.
 [34] (2020) Deep unsupervised learning of turbulence for inflow generation at various Reynolds numbers. Journal of Computational Physics 406, pp. 109216. External Links: ISSN 00219991, Document, Link Cited by: §1, §1.
 [35] (2017) Convolutional neural network. In MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence, pp. 121–147. External Links: ISBN 9781484228456, Document, Link Cited by: §2.4.
 [36] (2017) Creating turbulent flow realizations with generative adversarial networks. In APS Division of Fluid Dynamics Meeting Abstracts, APS Meeting Abstracts, pp. A31.008. Cited by: §1.
 [37] (2018) From deep to physics-informed learning of turbulence: diagnostics. External Links: 1810.07785 Cited by: §1.
 [38] (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §4.1.
 [39] (1991) The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers. Proceedings: Mathematical and Physical Sciences 434 (1890), pp. 9–13. External Links: ISSN 09628444, Document Cited by: §3.2.
 [40] (2020) The Lorenz system: hidden boundary of practical stability and the Lyapunov dimension. Nonlinear Dynamics 102, pp. 713–732. Cited by: §3.1.
 [41] PyTorch-GAN. Note: https://github.com/eriklindernoren/PyTorch-GAN Accessed: 12.11.2021 Cited by: §4.2.
 [42] (2016) Reynolds averaged turbulence modelling using deep neural networks with embedded invariance. Journal of Fluid Mechanics 807, pp. 155–166. External Links: Document Cited by: §1.
 [43] (2020) Deep learning methods for super-resolution reconstruction of turbulent flows. Physics of Fluids 32 (2), pp. 025105. External Links: Document, Link Cited by: §1.
 [44] (1963) Deterministic nonperiodic flow. Journal of Atmospheric Sciences 20 (2), pp. 130–141. Cited by: §3.1, §4.1.
 [45] (2017) An equivalence of fully connected layer and convolutional layer. External Links: 1712.01252 Cited by: §2.4.
 [46] (2014) Conditional generative adversarial nets. External Links: 1411.1784 Cited by: Figure 2, §2.4.
 [47] (1932) Proof of the quasi-ergodic hypothesis. Proceedings of the National Academy of Sciences 18 (1), pp. 70–82. External Links: Document, ISSN 00278424, Link Cited by: §2.1.
 [48] (1994) An experimental investigation of the flow around a circular cylinder: influence of aspect ratio. Journal of Fluid Mechanics 258, pp. 287–316. External Links: Document Cited by: §3.3.1.
 [49] (2018) Activation functions: comparison of trends in practice and research for deep learning. External Links: 1811.03378 Cited by: §2.4.
 [50] (1996) The velocity field of the turbulent very near wake of a circular cylinder. Experiments in Fluids 20, pp. 441–453. Cited by: §3.3.1.
 [51] (2019) Recent progress on generative adversarial networks (GANs): a survey. IEEE Access 7, pp. 36322–36333. External Links: Document Cited by: §2.4.
 [52] (2016) A paradigm for data-driven predictive modeling using field inversion and machine learning. Journal of Computational Physics 305, pp. 758–774. External Links: ISSN 00219991, Document, Link Cited by: §1.
 [53] (2008) Experimental and numerical studies of the flow over a circular cylinder at Reynolds number 3900. Physics of Fluids 20 (8), pp. 085101. External Links: Document Cited by: §3.3.1.
 [54] (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, pp. 8024–8035. External Links: Link Cited by: §4.
 [55] (2019) The ergodicity problem in economics. Nature Physics 15, pp. 1216–1221. External Links: Document Cited by: §2.
 [56] (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, Conference Track Proceedings, Y. Bengio and Y. LeCun (Eds.), External Links: Link Cited by: §2.2, §2.4.
 [57] (2016) An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747. Cited by: §4.2.
 [58] (1986) Learning representations by backpropagating errors. Nature 323, pp. 533–536. Cited by: Figure 1, §2.2.
 [59] (2014) Understanding machine learning: from theory to algorithms. Cambridge University Press. Cited by: §2.3.
 [60] (2016) Using field inversion to quantify functional errors in turbulence closures. Physics of Fluids 28, pp. 045110. External Links: Document Cited by: §1.
 [61] (2016) Machine-learning-augmented predictive modeling of turbulent separated flows over airfoils. AIAA Journal 55, pp. . External Links: Document Cited by: §1.
 [62] (2020) Turbulence enrichment using physics-informed generative adversarial networks. External Links: 2003.01907 Cited by: §1.
 [63] (1999) The Lorenz attractor exists. Comptes Rendus de l’Académie des Sciences - Series I - Mathematics 328 (12), pp. 1197–1202. External Links: ISSN 07644442, Document, Link Cited by: §3.1.
 [64] (2018) Highresolution image synthesis and semantic manipulation with conditional gans. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vol. , pp. 8798–8807. External Links: Document Cited by: §2.2, §2.4, §2.4, §4.3.
 [65] (2018) ESRGAN: enhanced super-resolution generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops. Cited by: §2.2.

 [66] (2017) The development of algebraic stress models using a novel evolutionary algorithm. International Journal of Heat and Fluid Flow 68, pp. 298–318. External Links: ISSN 0142727X, Document, Link Cited by: §1.
 [67] (2016) A novel evolutionary algorithm applied to algebraic modifications of the RANS stress–strain relationship. Journal of Computational Physics 325, pp. 22–37. External Links: ISSN 00219991, Document, Link Cited by: §1.
 [68] (2019) A multi-pass GAN for fluid flow super-resolution. Proceedings of the ACM on Computer Graphics and Interactive Techniques 2 (2), pp. 1–21. External Links: ISSN 25776193, Link, Document Cited by: §1.
 [69] (2020) Large eddy simulation of periodic wake impact on boundary layer transition mechanisms on a highly loaded low-pressure turbine blade. In Turbo Expo: Power for Land, Sea, and Air, Vol. 84102, pp. V02ET41A013. Cited by: §3.3.
 [70] (2017) Data-driven synthesis of smoke flows with CNN-based feature descriptors. ACM Transactions on Graphics 36 (4), pp. 1–14. External Links: ISSN 15577368, Link, Document Cited by: §1.
 [71] (2020) Improving the transition model by the field inversion and machine learning framework. Physics of Fluids 32, pp. . External Links: Document Cited by: §1.
 [72] (2017) Error bounds for approximations with deep ReLU networks. Neural Networks 94, pp. 103–114. Cited by: §2.3.
 [73] (2018) An efficient Bayesian uncertainty quantification approach with application to transition modeling. Computers & Fluids 161, pp. 211–224. External Links: ISSN 00457930, Document, Link Cited by: §1.
 [74] Cited by: §1.
 [75] (2020) RANS turbulence model development using cfddriven machine learning. Journal of Computational Physics 411, pp. 109413. External Links: ISSN 00219991, Document, Link Cited by: §1.
 [76] (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Computer Vision (ICCV), 2017 IEEE International Conference on. Cited by: §2.2.