Composite materials consist of multiple constituents on the micro scale. This leads to a heterogeneous microstructure in the sense of geometry as well as material behavior. In this work, composites with two constituents of linear elastic material are discussed. One constituent acts as a matrix, whereas the second is embedded in the former as inclusions. The geometry of the inclusions can either be long fibers, short fibers or particles Aboudi et al. (2012). The microstructure and the material behavior of its constituents determine the overall effective material behavior of the composite on the macro scale. To obtain the effective macro properties, homogenization techniques like analytical mean-field Böhm (2004) or numerical full-field approaches are used, where the latter takes the microstructure into account explicitly. For complex microstructures such as short fiber inclusions, analytical mean-field methods perform poorly Müller et al. (2015). Full-field approaches, on the other hand, can deal with arbitrary microstructures Müller et al. (2015).
Microstructures as well as the material properties of the composite and its underlying constituents are subject to uncertainties, which are either intrinsic or induced by manufacturing or measurement methods Kennedy and O’Hagan (2001); Der Kiureghian and Ditlevsen (2009). This is especially true for short fiber reinforced materials, where not only the microstructures and inclusion geometries are very complex, but also the measurement and imaging is challenging and depends on many factors Hiriyur et al. (2011). Uncertainties on the micro scale lead to uncertain effective properties of the composite at the macro scale, which requires uncertainty quantification (UQ).
This work follows the framework of Sudret et al. (2017), where UQ consists of three steps. First, the sources of uncertainty (e.g. microstructures and material parameters) need to be quantified, e.g. by defining distributions of random variables based on measurements and expert knowledge. Second, an appropriate computational model of the studied problem needs to be defined. Finally, the uncertainties are propagated through the model to obtain the distribution of the output, the so-called quantity of interest (QoI). Universal techniques for this purpose are Monte Carlo-type methods (MC), including simple Monte Carlo and Latin hypercube sampling Hurtado and Barbat (1998). Although they can be used on existing computational models by simply sampling from the input distribution and repeatedly running the model, they suffer from low efficiency due to slow convergence Caflisch and others (1998).
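As a minimal illustration of these three steps, the following sketch propagates two uniformly distributed inputs through a stand-in model (a closed-form function replacing an actual solver; all names and values are illustrative) and estimates the moments of the QoI:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    # stand-in for an expensive deterministic solver (hypothetical QoI)
    return x[0] ** 2 + 0.5 * x[1]

# Step 1: quantify input uncertainty (two independent uniform inputs).
samples = rng.uniform(low=[0.0, 1.0], high=[1.0, 2.0], size=(10_000, 2))

# Step 3: propagate the samples through the model and estimate the
# moments of the quantity of interest.
qoi = np.array([model(x) for x in samples])
mean, std = qoi.mean(), qoi.std(ddof=1)
```

The slow convergence is visible here: halving the sampling error of `mean` requires four times as many model evaluations, which motivates the surrogate models discussed next.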
To address this problem, so-called surrogate models can be used instead Sudret et al. (2017). These apply some kind of dimension reduction, e.g. on the input parameters. Examples are polynomial chaos expansion (PCE, Ghanem and Spanos (2003); Xiu and Karniadakis (2002)) and low-rank tensor representations Vondřejc et al. (2020).
Uncertainties in the context of continuum micromechanics were studied extensively in e.g. Soize (2008). Here, a generic meso-scale probability model for a large class of random anisotropic elastic microstructures in the context of tensor-valued random fields is proposed, which is not limited to a specific microstructure defined by its constituents. The stochastic boundary value problem is solved using the stochastic finite element method (FEM). In Cottereau et al. (2011), a dualistic deterministic/stochastic method is considered, utilizing two models coupled in an Arlequin framework. A framework using bounded random matrices for constitutive modeling is proposed in Noshadravan et al. (2013), which results in a surrogate that can be calibrated to experimental data. In Guilleminot and Soize (2013), a new class of generalized nonparametric probabilistic models for matrix-valued non-Gaussian random fields is investigated, with a focus on random fields taking values in some subset of the set of real symmetric positive-definite matrices presenting sparsity and invariance with respect to given orthogonal transformations. An approach using a stochastic potential in combination with a polynomial chaos representation for nonlinear constitutive equations of materials in the context of random microstructural geometry and high-dimensional random parameters is given in Clément et al. (2013). PCE was further used in homogenization by Tootkaboni and Graham-Brady (2010) and Caylak et al. (2021). The latter proposed an intrusive PCE in combination with FEM-based full-field homogenization Miehe and Koch (2002) to determine uncertain effective material properties of transversely linear elastic fiber reinforced composites. The intrusive approach to PCE uses Galerkin projection, where the FEM algorithm needs to be reformulated using PC arithmetic Debusschere et al. (2004).
While in Caylak et al. (2021) cylindrical single fiber inclusions are homogenized, in this work the uncertain homogenization method is adapted to more complex microstructures. For this, it is convenient to replace the FEM homogenization scheme by a fast Fourier transform (FFT) method Schneider (2021) based on the FFT Galerkin approach by Moulinec and Suquet (1995); de Geus et al. (2017). This method uses voxels and is more efficient in terms of CPU time and memory requirement than FEM, as shown in Cruzado et al. (2021) and Vondřejc and de Geus (2020). To adapt PCE to FFT, the Galerkin projection is replaced by the pseudospectral PCE Xiu (2007) to avoid PC arithmetic. Pseudospectral PCE uses numerical integration techniques to obtain the PC coefficients. From this surrogate model, the central moments of the QoI can be calculated. The pseudospectral approach is called a non-intrusive method, as only repeated deterministic solutions from the solver are needed instead of PC arithmetic. Here, the computational bottleneck is the deterministic solver Ghanem and Red-Horse (2017).
To reduce the computational effort of the deterministic solution, data-driven machine learning models trained on the deterministic solver can be used to replace the original model Bock et al. (2019). Popular approaches are decision trees, random forest ensembles, radial basis functions and support vector machines. While relatively easy to train, they are either limited to linear approximations or suffer from excessive numbers of parameters, especially in the case of large three dimensional image data Aggarwal (2018). An alternative approach is the use of artificial neural networks (ANN). ANN, especially in the context of deep learning, gained much attention, mainly because of their impact in fields like computer vision, speech recognition and autonomous driving Bock et al. (2019). The universal approximation theorem Hornik et al. (1989) states that an ANN can learn any Borel measurable function if the network has enough units. Additionally, there exist special architectures to effectively handle large image data. There are already several applications of ANN to computational continuum mechanics; for an overview, the reader is referred to Bock et al. (2019). ANN were successfully used as surrogate models in the context of MC-based uncertainty quantification of elliptic differential equations Tripathy and Bilionis (2018) and MC application to three dimensional homogenization Rao and Liu (2020). While there have been some applications to homogenization, these were limited to either two dimensional microstructures Yang et al. (2018), uniaxial strain Beniwal et al. (2019) or fixed material parameters Ye et al. (2019); Frankel et al. (2019).
To the authors' knowledge, there has been no attempt to apply ANN to the homogenization of multiple three dimensional microstructures with uncertain material parameters and different loading directions. This work intends to close this gap by establishing a complete homogenization technique based on ANN. Due to the fast evaluation time of trained ANN, this homogenization technique is very efficient, as shown in this work. The ANN can then be used to carry out otherwise expensive stochastic investigations like pseudospectral PCE, where multiple deterministic solutions are necessary. In summary, the key objectives and contributions of this work are:
Deep Learning homogenization algorithm: An ANN is trained to homogenize uncertain three dimensional complex microstructures with uncertain material parameters subjected to different loading directions.
FFT Label Generation: FFT is used to provide the labels needed for training of the ANN. This is done on voxel based three dimensional microstructures without meshing.
UQ with pseudospectral PCE: A pseudospectral PCE using an ANN trained on FFT is used for UQ of the uncertain effective elasticity tensor of complex microstructures.
The rest of the paper is structured as follows. In Section 2 the theoretical basis of pseudospectral PCE, FFT based homogenization and ANN is given. Section 3 is concerned with the proposed homogenization model using ANN in UQ in the context of pseudospectral PCE. Here the general problem formulation, data creation, network topology and training of the ANN as well as the UQ using pseudospectral PCE are discussed. Section 4 consists of three numerical experiments. One aims to compare the proposed method with Caylak et al. (2021), whereas the second is dedicated to a more complex microstructure, showing also the computational efficiency of the ANN accelerated approach to UQ. The third example investigates the ability of the ANN to generalize to unseen microstructures. The paper closes with a summary and an outlook in Section 5.
This section gives essential prerequisites of random variables, uncertainty quantification, deep learning and numerical homogenization, which are used in the proposed algorithm in Section 3.
2.1 Random variables and uncertainty quantification
Let $(\Omega, \mathcal{F}, P)$ be a probability space Xiu (2010) with sample space $\Omega$, $\sigma$-algebra $\mathcal{F}$, and a probability measure $P$ on $\mathcal{F}$. A multivariate random input variable $X$ is defined by the map
$X : \Omega \to \mathcal{X}, \quad \omega \mapsto x = X(\omega) \qquad (1)$
where $\mathcal{X}$ is an $M$-dimensional vector space of realizations $x$ of elementary events $\omega \in \Omega$ by the map $X$. A model $\mathcal{M}$ then takes multivariate random input variables $X$ and maps them to multivariate random variables $Y$ defined by the relation
$Y : \Omega \to \mathcal{Y}, \quad \omega \mapsto y = (\mathcal{M} \circ X)(\omega) \qquad (2)$
where $\mathcal{Y}$ is an $N$-dimensional vector space of realizations $y$ of elementary events $\omega$ by the composition function $\mathcal{M} \circ X$. The goal of uncertainty quantification (UQ) is to characterize the distribution of the output vector $Y$ for a given input vector $X$ in Eq. (1), thus propagating input uncertainties through the model $\mathcal{M}$. Often the original model $\mathcal{M}$ is computationally demanding. Therefore, surrogate models $\tilde{\mathcal{M}}$ can be designed, which approximate the original model but take less computing time to evaluate, as
$\tilde{Y} : \Omega \to \tilde{\mathcal{Y}}, \quad \omega \mapsto \tilde{y} = (\tilde{\mathcal{M}} \circ X)(\omega) \qquad (3)$
where $\tilde{\mathcal{Y}}$ is an $N$-dimensional vector space of realizations $\tilde{y}$ of $\tilde{Y}$ by the composition function $\tilde{\mathcal{M}} \circ X$. Due to the approximations, the datasets $\mathcal{Y}$ and $\tilde{\mathcal{Y}}$ usually do not coincide. This means that two mappings of the same realization $\omega$ by the original model $\mathcal{M}$ in Eq. (2) and a surrogate model $\tilde{\mathcal{M}}$ are not the same. The distance between $Y$ and $\tilde{Y}$ defines an error function Grimmett and Stirzaker (2020)
$\epsilon = \mathbb{E}\big[(Y - \tilde{Y})^2\big] \qquad (4)$
with the expectation $\mathbb{E}[\,\cdot\,]$ defined component-wise as the integral
$\mathbb{E}[Y] = \int_\Omega Y(\omega)\, \mathrm{d}P(\omega) \qquad (5)$
over the probability space $(\Omega, \mathcal{F}, P)$. For a surrogate model $\tilde{\mathcal{M}}$ to have a small error with respect to the original model $\mathcal{M}$ in Eq. (2), it must be calibrated on a dataset $\mathcal{D}$, which consists of evaluations of the original model with regard to a finite number of realizations $x_i$, called samples, of $X$ in Eq. (1), such that
$\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{S}, \quad y_i = \mathcal{M}(x_i) \qquad (6)$
2.2 Surrogate modelling by PCE
An example of a surrogate model as described in Eq. (3) is provided by a polynomial chaos expansion defined as
$\tilde{\mathcal{M}}^{\mathrm{PC}}(X) = \sum_{|\alpha| \le p} c_\alpha \Psi_\alpha(\xi), \qquad c_\alpha = \frac{\langle \mathcal{M}(X), \Psi_\alpha \rangle}{\gamma_\alpha} \qquad (7)$
Here, $\Psi_\alpha$ are orthonormal polynomials, which act as basis functions of the expansion with truncation order $p$, $\gamma_\alpha = \langle \Psi_\alpha, \Psi_\alpha \rangle$ are normalization factors with inner product $\langle \cdot, \cdot \rangle$, $\xi$ denote standard normal distributed random variables and $F_X$ is the cumulative distribution function (CDF) with respect to the multivariate random variable $X$. In Eq. (7) and throughout the remaining sections of this paper, $\alpha$ is a multi-index over the random input space in Eq. (1).
The calibration of $\tilde{\mathcal{M}}^{\mathrm{PC}}$ from Eq. (7) to the original model in Eq. (2) is performed by calculating the PC coefficients $c_\alpha$ in Eq. (7.2). To this end, the so-called pseudospectral approach to PCE uses a cubature rule for the selection of the samples in the dataset in Eq. (6), where the number of samples equals the number of cubature nodes $Q$, i.e. $S = Q$, such that
Here $q$ denotes the single index form of the cubature nodes. Accordingly, with the cubature nodes and weights, the PC coefficients in Eq. (7.2) are approximated as
$c_\alpha \approx \sum_{q=1}^{Q} \mathcal{M}(x_q)\, \Psi_\alpha(\xi_q)\, w_q \qquad (10)$
where $\xi_q$, $w_q$ represent the multi-index form of a cubature rule with dimension $M$ and weights $w_q$, and where $n$ denotes the number of nodes and weights per dimension. The total number of cubature points needed in Eq. (9) is
$Q = n^M \qquad (11)$
as the cubature rule with $n$ nodes and corresponding weights must be carried out over all $M$ dimensions of the input random vector space in Eq. (1). The relation in Eq. (11) holds true for an equal number of nodes and weights per dimension, as is the case in the remainder of this work. Generally, $n$ can be chosen differently for each dimension.
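For a single standard normal input, the pseudospectral computation of the PC coefficients can be sketched with Gauss–Hermite cubature. A toy closed-form function stands in for the actual model; the basis is the probabilists' Hermite family, orthonormalized by $\sqrt{\alpha!}$:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt

def f(xi):
    # stand-in deterministic model, f(xi) = xi^2 + xi
    return xi ** 2 + xi

p = 4                                       # truncation order
nodes, weights = hermegauss(p + 1)          # Gauss-Hermite(e) cubature rule
weights = weights / np.sqrt(2.0 * np.pi)    # normalize to the Gaussian measure

coeffs = np.zeros(p + 1)
for a in range(p + 1):
    # orthonormal basis function He_a(xi) / sqrt(a!)
    basis = hermeval(nodes, np.eye(p + 1)[a]) / sqrt(factorial(a))
    coeffs[a] = np.sum(f(nodes) * basis * weights)

mean = coeffs[0]               # = E[f] = 1
var = np.sum(coeffs[1:] ** 2)  # = Var[f] = 3
```

Since the integrand is polynomial, the five-node rule is exact here and the recovered mean and variance match the analytical values.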
In this paper, the central moments of interest of the input and output are the mean
$\mu = \mathbb{E}[Y] \qquad (14)$
with the expectation defined in Eq. (5), and the standard deviation
$\sigma = \sqrt{\mathbb{E}\big[(Y - \mu)^2\big]} \qquad (15)$
The mean and the standard deviation can be calculated from the PC coefficients in Eq. (10) by the relations $\mu = c_0$ and $\sigma^2 = \sum_{\alpha \neq 0} c_\alpha^2$.
2.3 Surrogate modeling by ANN
Deep learning using ANN is a subbranch of machine learning, which in turn is a subbranch of artificial intelligence Goodfellow et al. (2016); Aggarwal (2018). An ANN, as visualized in Figure 1, is a surrogate model in the sense of Eq. (3), which is built of layers consisting of neural units
Here, the first layer consists of the input samples from a dataset according to Eq. (6), where single samples and the total number of samples $S$ are denoted as in Eq. (6). Furthermore, the last layer consists of the predictions with respect to the input. For the neural units in Eq. (16), several architectures exist, where the following are used throughout this paper:
Densely connected Feed Forward Neural Network (FFNN)
where the weights of every unit are multiplied with the output of the preceding layer. These weights are stored in a matrix for the whole network. After the multiplication of weights and output, a nonlinear activation function is applied to the product in Eq. (17). This is important for the network to be able to represent non-linearities. In this paper, rectified linear units (ReLU) Nair and Hinton (2010) are used.
Convolutional Neural Network (CNN) for three dimensional image data utilizing voxels, using a 3D kernel and the convolution operation
where each output voxel belongs to one filter, with the number of filters being the equivalent of the number of neural units of an FFNN. The three dimensional kernel, similar to the weights in Eq. (17), can be gathered in a matrix for the whole network, and the kernel dimension determines the size of the convolved neighborhood.
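A numpy sketch of the two forward passes (one dense layer, one single-filter 3D convolution; the sizes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, W, b):
    # FFNN layer: weight matrix times previous-layer output, then ReLU
    return np.maximum(0.0, W @ x + b)

def conv3d_valid(vol, kernel):
    # naive single-filter 3D convolution (cross-correlation, "valid" padding)
    k = kernel.shape[0]
    out_shape = tuple(s - k + 1 for s in vol.shape)
    out = np.zeros(out_shape)
    for i in range(out_shape[0]):
        for j in range(out_shape[1]):
            for l in range(out_shape[2]):
                out[i, j, l] = np.sum(vol[i:i + k, j:j + k, l:l + k] * kernel)
    return out

h = dense_relu(rng.normal(size=4), rng.normal(size=(3, 4)), np.zeros(3))
vol = np.ones((5, 5, 5))                      # toy voxel volume
feat = conv3d_valid(vol, np.ones((3, 3, 3)))  # each output voxel sums a 3x3x3 patch
```

Each entry of `feat` equals 27 on this constant volume, and `h` is componentwise non-negative because of the ReLU.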
The so-called topology of an ANN is then the composition of neural units in Eq. (16) with their corresponding architectures FFNN in Eq. (17) and CNN in Eq. (19) to a network Sammut and Webb (2011). The calibration of the surrogate in Eq. (16), also called training, consists of comparing the output data of the model defined in Eq. (2), gathered in a dataset according to Eq. (6), with the predictions from the ANN in Eq. (16) with weights in Eq. (17) or kernels in Eq. (19). Consequently, the error function in Eq. (4) takes the form
where the model outputs and the ANN predictions are defined by means of Eq. (6). This leads to a minimization problem, to which an L2-regularization term is added
with regularization factor. The regularization term in Eq. (21) is added to prevent overfitting, where the exact value of the regularization factor needs to be carefully tuned; typical default values are documented in Abadi et al. (2015). The weights or kernels are then updated iteratively by gradient descent of the error function in Eq. (20) with respect to the weights or kernels, respectively
where the gradient descent step width is also called the learning rate. For further details, the reader is referred to standard works such as Zhang et al. (2019); Goodfellow et al. (2016); Aggarwal (2018).
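The regularized update of Eq. (22) can be sketched for a linear least-squares toy problem (the data, learning rate and regularization factor below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                      # noise-free labels from a known model

lam, eta = 1e-4, 0.05               # regularization factor, learning rate
w = np.zeros(3)
for _ in range(1000):
    # gradient of the mean squared error plus the L2 penalty
    grad = 2.0 / len(y) * X.T @ (X @ w - y) + 2.0 * lam * w
    w -= eta * grad                 # gradient descent step
```

After training, `w` recovers `w_true` up to a small bias introduced by the regularization term.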
2.4 Numerical homogenization
where the dimensions and the coordinates inside the unit cell are introduced; the cell consists of a matrix phase and inclusion phases. The uncertain indicator function in Eq. (24) identifies the different phases at different coordinates (see e.g. Clément et al. (2013)). In this work, the outer dimensions of the unit cell are deterministic and constant. An uncertain micro elasticity problem is defined as
where $\boldsymbol{\sigma}$ is an uncertain micro stress tensor, $\mathbb{C}$ an uncertain micro elasticity tensor, $\boldsymbol{\varepsilon}$ an uncertain micro strain tensor and $\boldsymbol{u}$ an uncertain micro displacement field. Here, Eq. (25.1) denotes the equilibrium conditions, Eq. (25.2) Hooke’s law and Eq. (25.3) the strain-displacement conditions. For the boundary value problem in Eq. (25) to be well posed, uncertain boundary conditions are introduced in Eq. (25.4), where $\partial\mathcal{V}$ denotes the boundary of the corresponding unit cell $\mathcal{V}$ in Eq. (23). Possible choices are Dirichlet, Neumann or periodic boundary conditions.
Accounting also for periodicity, the uncertain micro strain tensor in Eq. (25.3) can be split into an average deterministic macro strain tensor $\bar{\boldsymbol{\varepsilon}}$ and a $\mathcal{V}$-periodic fluctuating uncertain micro strain tensor $\tilde{\boldsymbol{\varepsilon}}$
$\boldsymbol{\varepsilon} = \bar{\boldsymbol{\varepsilon}} + \tilde{\boldsymbol{\varepsilon}}, \quad \langle \tilde{\boldsymbol{\varepsilon}} \rangle = \boldsymbol{0} \qquad (26)$
where the fluctuating part must be compatible (continuous and single-valued) with a $\mathcal{V}$-periodic displacement field. The uncertain micro elasticity tensor in Eq. (25.2) depends on uncertain material parameters of the phases in Eq. (24), such that
where $\lambda$ is the uncertain first Lamé constant, $G$ the uncertain shear modulus, $E$ the uncertain Young’s modulus, $\nu$ the uncertain Poisson’s ratio and $K$ the uncertain bulk modulus. Solving the uncertain boundary value problem in Eq. (25) on an uncertain microstructure in Eq. (23) and applying an average operator to the micro fields, one obtains the corresponding effective macro fields
where $\bar{\boldsymbol{\sigma}}$ denotes an uncertain effective macro stress tensor, $\bar{\boldsymbol{\varepsilon}}$ a deterministic effective macro strain tensor and $\bar{\mathbb{C}}$ an uncertain effective macro elasticity tensor. The average operator $\langle\,\cdot\,\rangle$ in Eq. (29) is defined as
$\langle\,\cdot\,\rangle = \frac{1}{|\mathcal{V}|} \int_{\mathcal{V}} (\,\cdot\,)\, \mathrm{d}V \qquad (30)$
The macro and micro stress and strain fields need to satisfy the Hill-Mandel condition
$\langle \boldsymbol{\sigma} : \boldsymbol{\varepsilon} \rangle = \bar{\boldsymbol{\sigma}} : \bar{\boldsymbol{\varepsilon}} \qquad (31)$
The model in Eq. (32) can be realized e.g. by an FFT-based homogenization method, based on Vondřejc et al. (2014); de Geus et al. (2017); Zeman et al. (2017). Following Zeman et al. (2017), a brief overview of the FFT-based homogenization scheme is sketched. For detailed explanations, the reader is referred to the above mentioned papers. As a point of departure, the uncertain micro elasticity problem is recast into its weak form using test strains, such that
where the compatibility of the test strains is enforced by means of a convolution with a projection operator onto compatible solutions. This projection operator also enforces the zero-mean condition in Eq. (26) and is known analytically in Fourier space. Carrying out the discretization by means of trigonometric polynomials and solving the integrals by quadrature rules, the following expression in matrix notation holds
The model is then established by solving for the unknown micro strain field in Eq. (34) using strain boundary conditions and projection-based iterative methods such as the conjugate gradient method. After solving Eq. (34), the uncertain effective macro elasticity tensor can be recovered from Eq. (29). By prescribing the uncertain macro strain tensor from Eq. (25), Eq. (32.2) then becomes
Attempts in the literature to use ANN in microstructural homogenization were either restricted to two dimensions, uniaxial strain or fixed material parameters, using FEM instead of FFT. A short overview is given in Table 1.
For complex microstructures, where meshing for FEM becomes expensive, FFT as in Eq. (34) is a mesh free alternative.
It has to be carefully distinguished between an untrained ANN and a trained ANN. The trained ANN is deterministic, as its weights are fixed after optimization. For the rest of the paper, the untrained and the trained ANN are denoted separately.
There are several choices for boundary conditions to solve the boundary value problem in Eq. (25). In this work, the FFT-based Galerkin method of Vondřejc et al. (2014); de Geus et al. (2017); Zeman et al. (2017) is used, which works directly with strains instead of displacements and enforces compatibility of the solution by a projection operator to compatible solutions. Therefore, strain boundary conditions are used.
The following sections propose an algorithm to take these points into account, namely by considering three dimensional, uncertain microstructures with uncertain material parameters and multiple loading directions.
|Reference||Dimensionality||Material parameters||Loading directions||Uncertainty|
|Yang et al. (2018)||three dimensional||fixed||one direction||fully deterministic|
|Frankel et al. (2019)||two dimensional||fixed||one direction||fully deterministic|
|Beniwal et al. (2019)||two dimensional||fixed||one direction||fully deterministic|
|Rao and Liu (2020)||three dimensional||fixed||multiple directions||uncertain microstructure|
|Present||three dimensional||variable||multiple directions||fully uncertain|
3 A deep learning uncertain FFT algorithm
In a previous work, Caylak et al. (2021) investigate an uncertain numerical homogenization method of long fiber reinforced plastics. Uncertainties of material parameters and geometry are considered and modeled by multivariate random variables. The homogenization is carried out by the finite element method (FEM) utilizing periodic boundary conditions over a meshed representative volume element. To propagate the input uncertainties and calculate effective uncertain properties after homogenization, an intrusive Galerkin projection PCE is used. In this work, an extension towards more complex microstructures is made. In the following, the framework of the proposed method is presented, followed by detailed explanations of the implementation.
3.1 Uncertain homogenization framework
It was shown in e.g. Cruzado et al. (2021) and Vondřejc and de Geus (2020), that FFT based homogenization schemes outperform FEM based full field homogenization techniques with respect to CPU time and memory requirement. FFT based homogenization methods, as described in Eq. (34), are meshless. The discretization of the geometry is carried out on a regular grid utilizing voxels. Homogenization using FFT is described by the model in Eq. (35), which is used for the evaluation in Eq. (32). The goal is to calculate the uncertain effective macro elasticity tensor in Eq. (29). To calculate central moments like mean in Eq. (14) and variance in Eq. (15) of components of the uncertain effective macro elasticity tensor in Eq. (29), UQ is needed. In this work, a PCE according to Eq. (7) as a surrogate model following Eq. (3) is used to approximate Eq. (35)
where $\alpha$ is the multi-index in Eq. (8). In order to avoid PC arithmetic as in Caylak et al. (2021), which is needed for the intrusive Galerkin PCE, the pseudospectral approach in Eq. (10) may be used to approximate the PC coefficients in Eq. (36)
Here, the nodes and weights are those of the cubature rule in Eq. (9), which can be gathered in the dataset from Eq. (9) in single index notation. To this end, $Q$ deterministic solutions of the computational model, as defined in Eq. (11), are needed in Eq. (37)
For large unit cells in Eq. (23), the solution of the problem formulated in Eq. (39) becomes computationally challenging due to the large number of evaluations of the deterministic model, which are needed for the PCE as described in Eq. (11). Instead, an ANN as defined in Eq. (16) can be trained to learn the deterministic model; it is faster to evaluate than the original model, because only simple matrix multiplications in the FFNN Eq. (17) and CNN Eq. (19) are needed. This approximation can be formulated as
Eq. (37) is replaced by
This is the final formulation for the proposed UQ FFT model using an ANN . Once trained, provides the deterministic solutions for the macro elasticity tensor in Eq. (43). From the surrogate in Eq. (43), central moments, mean Eq. (14) and variance Eq. (15), can be calculated. In the following section, details of the implementation of Eq. (43) are provided.
3.2 Numerical implementation
3.2.1 Algorithm overview
To realize Eq. (43), Algorithm 0 must be implemented. Its task is to generate a training set according to Eq. (42) for training an ANN to learn the FFT homogenization in Eq. (35), which is then used for uncertainty quantification of the homogenized elasticity tensor in Eq. (43).
Remark 1: It has to be pointed out that the ANN is a purely deterministic surrogate of the deterministic FFT solver from Eq. (34). After training, the weights of the ANN in Eq. (17) and Eq. (19) are fixed, thus producing deterministic outputs for given deterministic inputs.
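Putting the pieces together, the UQ loop of Eq. (43) can be sketched in one stochastic dimension. A closed-form function stands in for the trained ANN, and all numbers are illustrative:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt

def surrogate_C11(xi):
    # hypothetical effective stiffness component as a smooth function of
    # one standard normal input (stands in for the trained ANN)
    return 10.0 + 2.0 * xi + 0.3 * xi ** 2

p = 3                                       # truncation order
nodes, weights = hermegauss(p + 1)          # cubature nodes and weights
weights = weights / np.sqrt(2.0 * np.pi)    # normalize to the Gaussian measure

# pseudospectral PC coefficients: the surrogate is evaluated only at the nodes
coeffs = np.array([
    np.sum(surrogate_C11(nodes)
           * hermeval(nodes, np.eye(p + 1)[a]) / sqrt(factorial(a))
           * weights)
    for a in range(p + 1)
])

mean_C11 = coeffs[0]                        # mean of the QoI
std_C11 = np.sqrt(np.sum(coeffs[1:] ** 2))  # standard deviation of the QoI
```

Only `p + 1` surrogate evaluations are needed here, which is exactly where the fast ANN replaces the expensive FFT solve.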
3.2.2 Data generation for deep learning using FFT
where the microstructures are those described in Eq. (23), the material parameters those defined in Eq. (28), and the macro strains those from Eq. (29); the sample space of the training input variables is denoted accordingly. The total number of samples of the dataset is denoted consistently with Eq. (6).
Generally, the only influence on the outcome of the ANN in Eq. (40) is the training process, including the data provided during training. The choice of inputs in Eq. (44) for the training dataset from Eq. (42) is therefore very important. Clustering around specific values of e.g. the fiber volume fraction or the material parameters in the dataset could lead to a bias towards these values. Having in mind the goal to establish a deterministic surrogate to replace the FFT solver from Eq. (34), the inputs from Eq. (44) to the dataset should fill their sample space uniformly. Therefore, the inputs from Eq. (44) are drawn from uniform distributions. Additionally, physically inadmissible values must be avoided, so the sample space from Eq. (44) needs to be restricted. In implementation practice, these values are drawn from uniform distributions whose lower and upper bounds are chosen in accordance with the physical restrictions. Therefore, the general expression of the input in Eq. (1) is specified to Eq. (44) as
Here, the single material parameters are those from Eq. (28). The upper and lower bounds are chosen according to the respective admissible range of each material parameter, e.g. a bounded range for Poisson’s ratio covering typical engineering materials, including those considered in this work.
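A sketch of the bounded uniform sampling; the parameter names and bounds below are purely illustrative and not the values used in the examples:

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical admissible ranges per material parameter
bounds = {
    "E_matrix": (2.0e3, 4.0e3),    # Young's modulus, MPa
    "nu_matrix": (0.2, 0.45),      # Poisson's ratio, physically admissible
    "E_fiber": (60.0e3, 80.0e3),
    "nu_fiber": (0.15, 0.3),
}

def draw_sample():
    # one uniform draw per parameter, restricted to its admissible range
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}

sample = draw_sample()
```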
The microstructure from Eq. (44) is implemented as a three dimensional voxel array with binary entries, such that 0 is used for matrix and 1 for inclusion material. This is mathematically described by the indicator function in Eq. (24). The microstructure depends on the fiber volume fraction as
where the fiber volume fraction is specific to each microstructure in Eq. (44). In this work, two kinds of microstructures are considered, namely single long fiber cylindrical inclusions and multiple spherical inclusions. The latter are generated by the random sequential adsorption method, as described in Bargmann et al. (2018).
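A simplified sketch of the random sequential adsorption idea: spheres are accepted only if they do not overlap previously placed ones. Periodicity and the full acceptance rules of Bargmann et al. (2018) are omitted, and all sizes are illustrative:

```python
import numpy as np

def rsa_spheres(n_vox=32, radius=3, target_vf=0.12, seed=0, max_tries=20_000):
    rng = np.random.default_rng(seed)
    grid = np.zeros((n_vox,) * 3, dtype=np.uint8)  # 0 = matrix, 1 = inclusion
    idx = np.indices(grid.shape)
    centers = []
    for _ in range(max_tries):
        c = rng.uniform(radius, n_vox - radius, size=3)  # sphere stays inside
        # reject candidates overlapping a previously accepted sphere
        if all(np.linalg.norm(c - np.array(p)) >= 2 * radius for p in centers):
            centers.append(tuple(c))
            mask = np.sum((idx - c.reshape(3, 1, 1, 1)) ** 2, axis=0) <= radius ** 2
            grid[mask] = 1
            if grid.mean() >= target_vf:
                break
    return grid

micro = rsa_spheres()        # binary voxel microstructure
vf = micro.mean()            # achieved inclusion volume fraction
```

The loop terminates once the target volume fraction is reached, so `vf` is at least `target_vf` for feasible targets.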
Furthermore, the shear and bulk moduli described in Eq. (28) are used as material parameters in the implementation. They are better suited for ANN, because their ranges are of the same magnitude, in contrast to the combination of Young’s modulus and Poisson’s ratio in Eq. (28); this leads to smoother gradient updates in the optimization defined in Eq. (22). Other material parameters are converted internally.
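The internal conversions are the standard isotropic relations, for example (the numeric inputs are illustrative):

```python
def to_bulk_shear(E, nu):
    # convert Young's modulus and Poisson's ratio to bulk and shear moduli
    G = E / (2.0 * (1.0 + nu))
    K = E / (3.0 * (1.0 - 2.0 * nu))
    return K, G

def to_lame(E, nu):
    # first Lamé constant and shear modulus (second Lamé constant)
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

K, G = to_bulk_shear(3000.0, 0.35)   # illustrative matrix values
lam, mu = to_lame(3000.0, 0.35)
```

For comparable Young's moduli, the resulting `K` and `G` are of the same order of magnitude, which is the smoothness argument made above.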
The output for the error measure in Eq. (21) is calculated by the FFT method from Eq. (34). In this work, the output is the effective macro stress in Eq. (29) corresponding to the macro strain in Eq. (29) and Eq. (44), such that
The effective elasticity tensor in Eq. (29) can be reconstructed using Voigt notation such that
The different strain states in Eq. (48) need to be equally represented in the dataset in Eq. (38). If one strain state were underrepresented, the predictive capability for that specific state would be poor. Therefore, the total number of samples must be a multiple of 6, which implies
where the number of samples per strain state is denoted accordingly. In practice, a compromise between the number of samples and the training time has to be made; the sample counts used in the examples in Section 4 differ between Examples 1, 2 and 3. The whole process of data generation can be carried out on a workstation with GPU acceleration or on a cluster, because every sample generation is pleasingly parallel.
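The reconstruction of the effective elasticity tensor from the six unit strain states can be sketched as follows. The FFT solve is replaced by a stand-in on a homogeneous isotropic cell, for which the effective tensor must equal the constituent stiffness (Voigt notation with engineering shear strains; the values are illustrative):

```python
import numpy as np

lam, mu = 1.0, 0.5                       # Lamé constants of the single phase

# isotropic stiffness in Voigt notation (engineering shear strains)
C_phase = np.zeros((6, 6))
C_phase[:3, :3] = lam
C_phase[np.arange(3), np.arange(3)] += 2.0 * mu
C_phase[np.arange(3, 6), np.arange(3, 6)] = mu

def averaged_stress(strain_voigt):
    # stand-in for the FFT solve: on a homogeneous cell the volume-averaged
    # stress is simply the phase stiffness times the macro strain
    return C_phase @ strain_voigt

# column j of the effective tensor = averaged stress under unit strain e_j
C_eff = np.column_stack([averaged_stress(np.eye(6)[j]) for j in range(6)])
```

On this trivial cell `C_eff` reproduces `C_phase` exactly; with the real solver, each column comes from one FFT homogenization run.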
Remark 2: An alternative approach to Eq. (48) would be to apply all six strain states directly. This, however, would lead to less flexibility of the obtained ANN. The proposed approach in Eq. (48) using single strain states allows considering different stochastic properties for different strain states in Algorithm 4. This is the case if, e.g., material parameters are obtained from experimental data which show different deviations for different loading directions.