A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures

by   Alexander Henkes, et al.

This work is directed to uncertainty quantification of homogenized effective properties for composite materials with complex, three dimensional microstructure. The uncertainties arise in the material parameters of the single constituents as well as in the fiber volume fraction. They are taken into account by multivariate random variables. Uncertainty quantification is achieved by an efficient surrogate model based on pseudospectral polynomial chaos expansion and artificial neural networks. An artificial neural network is trained on synthetic binary voxelized unit cells of composite materials with uncertain three dimensional microstructures, uncertain linear elastic material parameters and different loading directions. The prediction goals of the artificial neural network are the corresponding effective components of the elasticity tensor, where the labels for training are generated via a fast Fourier transform based numerical homogenization method. The trained artificial neural network is then used as a deterministic solver for a pseudospectral polynomial chaos expansion based surrogate model to achieve the corresponding statistics of the effective properties. Three numerical examples deal with the comparison of the presented method to the literature as well as the application to different microstructures. It is shown, that the proposed method is able to predict central moments of interest while being magnitudes faster to evaluate than traditional approaches.



There are no comments yet.


page 20

page 23


A neural network approach for uncertainty quantification for time-dependent problems with random parameters

In this work we propose a numerical framework for uncertainty quantifica...

Three-dimensional convolutional neural network (3D-CNN) for heterogeneous material homogenization

Homogenization is a technique commonly used in multiscale computational ...

A statistical framework for generating microstructures of two-phase random materials: application to fatigue analysis

Random microstructures of heterogeneous materials play a crucial role in...

A robust solution of a statistical inverse problem in multiscale computational mechanics using an artificial neural network

This work addresses the inverse identification of apparent elastic prope...

Mini-data-driven Deep Arbitrary Polynomial Chaos Expansion for Uncertainty Quantification

The surrogate model-based uncertainty quantification method has drawn a ...

Data-Driven Microstructure Property Relations

An image based prediction of the effective heat conductivity for highly ...

Uncertainty Quantification in Three Dimensional Natural Convection using Polynomial Chaos Expansion and Deep Neural Networks

This paper analyzes the effects of input uncertainties on the outputs of...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Composite materials consist of multiple constituents on the micro scale. This leads to a heterogeneous microstructure in the sense of geometry as well as material behavior. In this work, composites with two constituents of linear elastic material are discussed. One constituent acts as a matrix, whereas the second is embedded in the former as inclusions. The geometry of the inclusions can either be long fibers, short fibers or particles Aboudi et al. (2012). The microstructure and the material behavior of its constituents determine the overall effective material behavior of the composite on the macro scale. To obtain the effective macro properties, homogenization techniques like analytical mean-field Böhm (2004) or numerical full-field approaches are used, where the later takes the microstructure into account explicitly. For complex microstructures such as short fiber inclusions, analytical mean-field methods perform poorly Müller et al. (2015). Full-field approaches on the other hand can deal with arbitrary microstructures Müller et al. (2015).

Microstructures as well as the material properties of the composite and underlying constituents are subjected to uncertainties, grounded in either intrinsic variety or induced by the manufacturing or measuring methods Kennedy and O’Hagan (2001); Der Kiureghian and Ditlevsen (2009). This is especially true for short fiber reinforced materials, where not only the microstructures and inclusion geometries are very complex, but also the measurement and imaging is challenging and dependent on many factors Hiriyur et al. (2011). Uncertainties on the micro scale lead to uncertain effective properties of the composite at the macro scale, which requires uncertainty quantification (UQ).

This works follows the framework of Sudret et al. (2017), where UQ consists of three steps. First, the sources of uncertainties (e.g. microstructures and material parameters) need to be quantified, e.g. by defining distributions of random variables based on measurements and expert knowledge. Second, an appropriate computational model of the problem studied needs to be defined. At last, the uncertainties are propagated through the model to achieve the distribution of the output, the so called quantity of interest (QoI). Universal techniques for this purpose are Monte Carlo-type methods (MC) including simple Monte Carlo and Latin Hyper-cube sampling Hurtado and Barbat (1998). Although they can be used on existing computational models by simply sampling from the input distribution and repeatedly running it, they suffer from low efficiency due to slow convergence Caflisch and others (1998).

To address this problem, so called surrogate models can be used instead Sudret et al. (2017). These apply some kind of dimension reduction e.g. on the input parameters. Examples are polynomial chaos expansion (PCE, Ghanem and Spanos (2003); Xiu and Karniadakis (2002)), low-rank tensor representations Vondřejc et al. (2020)

, support-vector machines and radial basis functions.

Uncertainties in the context of continuum micromechanics were studied extensively in e.g. Soize (2008)

. Here, a generic meso-scale probability model for a large class of random anisotropic elastic microstructures in the context of tensor-valued random fields is proposed, which is not limited to a specific microstructure defined by its constituents. The stochastic boundary value problem is solved using the stochastic finite element method (FEM). In

Cottereau et al. (2011), a dualistic deterministic / stochastic method is considered, utilizing two models coupled in an Arlequin framework. A framework using bounded random matrices for constitutive modeling is proposed in Noshadravan et al. (2013), which results in a surrogate, which can be calibrated to experimental data. In Guilleminot and Soize (2013), a new class of generalized nonparametric probabilistic models for matrix-valued non-Gaussian random fields is investigated, where the focus is on random field taking values in some subset of the set of real symmetric positive-definite matrices presenting sparsity and invariance with respect to given orthogonal transformations. An approach using stochastic potential in combination with a polynomial chaos representation for nonlinear constitutive equations of materials in the context of random microstrucutral geometry and high dimensional random parameters is given in Clément et al. (2013). PCE was further used in homogenization by Tootkaboni and Graham-Brady (2010) and Caylak et al. (2021). The later proposed an intrusive PCE in combination with FEM based full-field homogenization Miehe and Koch (2002) to determine uncertain effective material properties of transversely linear elastic fiber reinforced composites. The intrusive approach to PCE uses Galerkin projection, where the FEM algorithm needs to be reformulated using PC arithmetic Debusschere et al. (2004).

While in Caylak et al. (2021) cylindrical single fiber inclusions are homogenized, in this work the uncertain homogenization method is adjusted to more complex microstructures. For this it is convenient to replace the FEM homogenization scheme by a fast Fourier transform (FFT) Schneider (2021) based on the FFT Galerkin approach by Moulinec and Suquet (1995); de Geus et al. (2017). This method uses voxels and is more efficient in terms of CPU time and memory requirement than FEM, as shown in Cruzado et al. (2021) and Vondřejc and de Geus (2020). To adapt PCE to FFT, the Galerkin projection is replaced by the pseudospectral PCE Xiu (2007) to avoid PC arithmetics. Pseudospectral PCE uses numerical integration techniques to obtain the PC coefficients. From this surrogate model, the central moments of the QoI can be calculated. The pseudospectral approach is called a non-intrusive method, as only repeated deterministic solutions from the solver are needed instead of using PC arithmetic. Here, the computational bottleneck is the deterministic solver Ghanem and Red-Horse (2017).

To reduce the computational effort of the deterministic solution, data driven machine learning models trained on the deterministic solver can be used to replace the original model

Bock et al. (2019)

. Popular approaches are decision trees, random forest ensembles, radial basis functions and support vector machines. While relatively easy to train, they are limited to either linear approximations or suffer from excessive parameters, especially in the case of large three dimensional image data

Aggarwal (2018)

. An alternative approach is the use of artificial neural networks (ANN). ANN, especially in the context of deep learning, gained much attention, mainly because of its impact in fields like computer vision, speech recognition and autonomous driving

Bock et al. (2019). The universal approximation theorem Hornik et al. (1989) states, that ANN can learn any Borel measurable function if the network has enough units. Additionally, there exist special architectures to effectively handle large image data. There are already several applications of ANN to computational continuum mechanics. For an overview the reader should refer to Bock et al. (2019). ANN were successfully used as surrogate models in the context of MC based uncertainty quantification of elliptic differential equations Tripathy and Bilionis (2018) and MC application on three dimensional homogenization Rao and Liu (2020). While there were some applications to homogenization, these were limited to either two dimensional microstructures Yang et al. (2018), uniaxial strain Beniwal et al. (2019) or fixed material parameters Ye et al. (2019); Frankel et al. (2019).

To the authors knowledge there was no attempt to apply ANN to homogenization of multiple three dimensional microstructures with uncertain material parameters and different loading directions. This work intends to close this gap by establishing a complete homogenization technique based on ANN. Due to the fast evaluation time of trained ANN, this homogenization technique is very efficient, as shown in this work. The ANN can then be used to carry out otherwise expensive stochastic investigations like pseudospectral PCE, where multiple deterministic solutions are necessary. In summary the key objectives and contributions of this work are:

  • Deep Learning homogenization algorithm: An ANN is trained to homogenize uncertain three dimensional complex microstructures with uncertain material parameters subjected to different loading directions.

  • FFT Label Generation: FFT is used to provide the labels needed for training of the ANN. This is done on voxel based three dimensional microstructures without meshing

  • UQ with pseudospectral PCE: A pseudospectral PCE using an ANN trained on FFT is used for UQ of the uncertain effective elasticity tensor of complex microstructures

The rest of the paper is structured as follows. In Section 2 the theoretical base of pseudospectral PCE, FFT based homogenization and ANN is given. Section 3 is concerned with the proposed homogenization model of using ANN in UQ in the context of pseudospectral PCE. Here the general problem formulation, data creation, network topology and training of the ANN as well as the UQ using pseudospectral PCE are discussed. Section 4 consists of three numerical experiments. One aims to compare the proposed method with Caylak et al. (2021), whereas the second is dedicated to a more complex microstructure, showing also the computational efficiency of the ANN accelerated approach to UQ. The third example investigates the abillity of the ANN to generalize to unseen microstructures. The paper closes with a summary and an outlook in Section 5.

2 Preliminaries

This section gives essential prerequisites of random variables, uncertainty quantification, deep learning and numerical homogenization, which are used in the proposed algorithm in Section 3.

2.1 Random variables and uncertainty quantification

Let be a probability space Xiu (2010) with sample space , -algebra , and a probability measure on . A multivariate random input variable is defined by the map


where is an dimensional vector-space of realizations of elementary events by the map . A model then takes multivariate random input variables and maps them to multivariate random variables defined by the relation


where is an dimensional vector-space of realizations of elementary events by the composition function . The goal of uncertainty quantification (UQ) is to characterize the distribution of the output vector for a given input vector in Eq. (1), thus propagating input uncertainties through the model . Often the original model is computationally demanding. Therefore, surrogate models can be designed, which approximate the original model but take less computing time to evaluate, as


where is an dimensional vector-space of realizations of by the composition function . Due to approximations, the datasets and usually do not coincide. This means, that two mappings of the same realization by the original model in Eq. (2) and a surrogate model are not the same. The distance between and defines an error function Grimmett and Stirzaker (2020)


with expectation defined component wise as the integral


over a probability space . For a surrogate model to have a small error with respect to the original model in Eq. (2), it must be calibrated on a dataset , which consists of evaluations of an original model with regard to a finite number of realizations , called samples, of in Eq. (1), such that


2.2 Surrogate modelling by PCE

An example of a surrogate model as described in Eq. (3) is provided by a Polynomial Chaos Expansion defined as


Here, are orthonormal polynomials, which act as basis functions of the expansion with truncation order , are normalization factors with inner product , denote standard normal distributed random variables and

is the cumulative distribution function (CDF) with respect to the multivariate random variable

. In Eq. (7) and throughout the remaining sections of this paper, is a multi-index over the random input space in Eq. (1)


The calibration of from Eq. (7) to the original model in Eq. (2) is performed by calculating the PC coefficients in Eq. (7.2). To this end, the so called pseudospectral approach to PCE uses a cubature rule for selection of the samples in the dataset in Eq. (6), where the number of samples equals the number of cubature nodes , i.e. , such that


Here denotes the single index form of the cubature nodes. Accordingly with the PC coefficients in Eq. (7.2) are approximated as


where , represents the multi-index form of a cubature rule with dimension and weights , where denotes the number of nodes and weights per dimension . The total number of cubature points needed in Eq. (9) is


as the cubature rule with nodes and corresponding weights must be carried out over all dimensions of the input random vector space in Eq. (1). The relation in Eq. (11) holds true for equal number of nodes and weights per dimension , as it is the case in the reminder of this work. Generally, can be chosen differently for each dimension.

In this paper, the central moments of interest of the input and output are the mean


with expectation defined in Eq. (5)

and variance with standard deviation


The mean and the standard deviation can be calculated from the PC coefficients in Eq. (10) by the following relations


For a more detailed treatment, the reader is referred to Xiu (2010); Grimmett and Stirzaker (2020); Rozanov (2013).

2.3 Surrogate modeling by ANN

Figure 1: Schematic structure of an ANN in Eq. (16) with layers and neural units . An ANN with input , where is a numerical vector and consist of 3D voxel data. is first processed by a CNN in Eq. (19) and after concatenation with feed into an FFNN in Eq. (17). The resulting output is . The index , denotes the single samples, where is the number of samples.

Deep learning using ANN is a subbranch of machine learning, which is a subbranch of artificial intelligence

Goodfellow et al. (2016); Aggarwal (2018). An ANN, as visualized in Figure 1, is a surrogate model in Eq. (3), which is built of layers consisting of neural units


Here, is the first layer consisting of input samples from a dataset according to Eq. (6), where , denotes single samples of and the number of samples. Furthermore, is the last layer consisting of predictions with respect to the input . For the neural units in Eq. (16), several architectures exist, where the following are used throughout this paper:

  • Densely connected Feed Forward Neural Network (FFNN)


    where are weights for every unit , which are multiplied with the output of the preceding layer . These weights are stored in a matrix for the whole network. After multiplication of weights and output

    , a nonlinear activation function

    is applied to the product in Eq. (17). This is important for the network to be able to represent non-linearities. In this paper, rectified linear units


    Nair and Hinton (2010)) are used

  • Convolutional Neural Network (CNN) for three dimensional image data utilizing voxels, using a 3D kernel and the convolution operation


    where denotes the output voxel of filter , with being the number of filters, which are equivalent to the number of neural units of FFNN. is the three dimensional kernel, which, similar to Eq. (17), can be gathered in a matrix for the whole network and is the kernel dimension.

The so called topology of an ANN is then the composition of neural units in Eq. (16) with their corresponding architectures FFNN in Eq. (17) and CNN in Eq. (19) to a network Sammut and Webb (2011). The calibration of the surrogate in Eq. (16), also called training, consists of comparing the output data of a model , defined in Eq. (2), from a dataset in Eq. (6) consisting of predictions from an ANN in Eq. (16) with weights in Eq. (17) or kernels in Eq. (19). Consequently, the error function in Eq. (4) takes the form


where and are defined means of Eq. (6). This leads to a minimization problem, where an L2-regularization term is added


with regularization factor . The regularization term in Eq. (21) is added to prevent overfitting, where the exact value for the regularization factor needs to be thoroughly tuned. Typically, is used Abadi et al. (2015). The weights or kernels are then updated iteratively by gradient descent of the error function in Eq. (20) with respect to the weights or kernels, respectively


where denotes the gradient descent step width, also called learning rate. For further details the reader is referred to standard works such as Zhang et al. (2019); Goodfellow et al. (2016); Aggarwal (2018).

2.4 Numerical homogenization

The following section gives an overview of numerical homogenization in micromechanics. A comprehensive treatment of the topic can be found in Böhm (2004); Aboudi et al. (2012); Li and Wang (2008).

Following the notation of Vondřejc et al. (2014); de Geus et al. (2017); Zeman et al. (2017); Schneider (2021), a microstructure is represented as a unit cell


with properties


where are dimensions and are coordinates inside the unit cell , which consists of a matrix phase and inclusion phases . The uncertain indicator function in Eq. (24) identifies the different phases at different coordinates (see e.g. Clément et al. (2013)). In this work the outer dimensions of the unit cell are deterministic and constant. An uncertain micro elasticity problem is defined as


where is an uncertain micro stress tensor, an uncertain micro elasticity tensor, an uncertain micro strain tensor and an uncertain micro displacement tensor. Here, Eq. (25.1) denotes equilibrium conditions, Eq. (25.2) and Hooke’s law, Eq. (25.3) strain-displacement conditions. For the boundary value problem in Eq. (25) to be well posed, uncertain boundary conditions are introduced in Eq. (25.4), where denotes the boundary of the corresponding unit cell in Eq. (23). Possible choices are Dirichlet, Neumann or periodic boundary conditions.

Accounting also for periodicity, the uncertain micro strain tensor in Eq. (25.3) can be split into an average deterministic macro strain tensor and an -periodic fluctuating uncertain micro strain tensor


where the fluctuating part must be compatible (continuous and single-valued) to an -periodic displacement field. The uncertain micro elasticity tensor in Eq.(25.2) depends on uncertain material parameters of phases in Eq. (24), such that




where is the uncertain first Lamé constant, the uncertain shear modulus, the uncertain Young’s modulus, the uncertain Poisson’s ratio and the uncertain bulk modulus. Solving the uncertain boundary value problem in Eq. (25) on an uncertain microstructure in Eq. (23) and using an average operator on the micro fields, one obtains corresponding effective macro fields


where denotes an uncertain effective macro stress tensor, a deterministic effective macro strain tensor and an uncertain effective macro elasticity tensor. The average operator in Eq. (29) is defined as


The macro and micro stress and strain fields need to satisfy the Hill-Mandel condition:


Numerical homogenization can be represented by a model in Eq. (2) with random input variable in Eq. (1) and random output variable in Eq. (2) defined as


The model in Eq. (32) can be realized e.g. by an FFT-based homogenization method , based on Vondřejc et al. (2014); de Geus et al. (2017); Zeman et al. (2017). Following Zeman et al. (2017), a brief overview of the FFT-based homogenization scheme is sketched. For detailed explanations, the reader is referred to the above mentioned papers. As a point of departure, the uncertain micro elasticity problem is recast into the weak form using test strains , such that


where the compatibility of the test strains is enforced by means of a convolution with a projection operator to compatible solutions . This projection operator also enforces the zero-mean condition in Eq. (26) and is known analytically in Fourier space. Carrying out discretization by means of trigonometric polynomials and solving integrals by quadrature rules, the following expression in matrix notation holds


The model is then established by solving for the unknown micro strain field in Eq. (34) using strain boundary conditions and projection based iterative methods such as e.g. the conjugate gradient method. After solving Eq. (34), the uncertain effective macro elasticity tensor can be recovered from Eq. (29). By prescribing the uncertain macro strain tensor from Eq. (25), Eq. (32.2) then becomes



  1. Attempts in the literature to use ANN in microstructural homogenization were either restricted to two dimensions, uniaxial strain or fixed material parameters, using FEM instead of FFT. A short overview is given in Table 1.

  2. For uncertain homogenization calculating the full elasticity tensor in Eq. (27), three dimensional microstructures in Eq. (23), multiaxial strain in Eq. (29) and varying material parameters in Eq. (28) are needed.

  3. For complex microstructures, where meshing for FEM becomes expensive, FFT as in Eq. (34) is a mesh free alternative.

  4. It has to be carefully distinguished between an untrained ANN and a trained ANN. The trained ANN is deterministic, as it’s weights are fixed after optimization. For the rest of the paper, the untrained ANN will be denoted by and the trained ANN by .

  5. There are several choices for boundary conditions to solve the boundary value problem in Eq. (25). In this work, the FFT-based Galerkin method of Vondřejc et al. (2014); de Geus et al. (2017); Zeman et al. (2017) is used, which works directly with strains instead of displacements and enforces compatibility of the solution by a projection operator to compatible solutions. Therefore, strain boundary conditions are used.

  6. The following sections propose an algorithm to take these points into account, namely by considering three dimensional, uncertain microstructures with uncertain material parameters and multiple loading directions.

Author Microstructures Material parameters Loading Uncertainty
Yang et al. (2018) three dimensional fixed one direction fully deterministic
Frankel et al. (2019) two dimensional fixed one direction fully deterministic
Beniwal et al. (2019) two dimensional fixed one direction fully deterministic
Rao and Liu (2020) three dimensional fixed multiple directions uncertain microstructure
Present three dimensional variable multiple directions fully uncertain
Table 1: Overview of similar homogenization methods using ANN from selected authors.

3 A deep learning uncertain FFT algorithm

In a previous work, Caylak et al. (2021) investigate an uncertain numerical homogenization method of long fiber reinforced plastics. Uncertainties of material parameters and geometry are considered and modeled by multivariate random variables. The homogenization is carried out by the finite element method (FEM) utilizing periodic boundary conditions over a meshed representative volume element. To propagate the input uncertainties and calculate effective uncertain properties after homogenization, an intrusive Galerkin projection PCE is used. In this work, an extension towards more complex microstructures is made. In the following, the framework of the proposed method is presented, followed by detailed explanations of the implementation.

3.1 Uncertain homogenization framework

It was shown in e.g. Cruzado et al. (2021) and Vondřejc and de Geus (2020), that FFT based homogenization schemes outperform FEM based full field homogenization techniques with respect to CPU time and memory requirement. FFT based homogenization methods, as described in Eq. (34), are meshless. The discretization of the geometry is carried out on a regular grid utilizing voxels. Homogenization using FFT is described by the model in Eq. (35), which is used for the evaluation in Eq. (32). The goal is to calculate the uncertain effective macro elasticity tensor in Eq. (29). To calculate central moments like mean in Eq. (14) and variance in Eq. (15) of components of the uncertain effective macro elasticity tensor in Eq. (29), UQ is needed. In this work, a PCE according to Eq. (7) as a surrogate model following Eq. (3) is used to approximate Eq. (35)


where is the multi-index in Eq. (8). In order to avoid PC arithmetic like in Caylak et al. (2021), which is needed for the intrusive Galerkin PCE, the pseudospectral approach in Eq. (10) may be used to approximate the PC coefficients in Eq. (36)


Here, and are the nodes and weights of the cubature rule in Eq. (9), which can be gathered in the dataset from Eq. (9) in single index notation. To this end, deterministic solutions of the computational model , as defined in Eq. (11), are needed in Eq. (37)


In Eq. (38), are multivariate realizations of the random input variable in Eq. (32). Finally the pseudospectral PCE follows from Eq. (36) and Eq. (37) as


For large unit cells in Eq. (23), the solution of the problem formulated in Eq. (39) becomes computational challenging due to the large number of evaluations of the deterministic model , which are needed for the PCE as described in Eq. (11). Instead, an ANN as defined in Eq. (16) can be trained to learn the deterministic model , which is faster to evaluate than the original model, because only simple matrix multiplications in FFNN Eq. (17) and CNN Eq. (19) are needed. This approximation can be formulated as


Eq. (37) is replaced by


For the approximation Eq. (40) to hold, an ANN needs to be trained on a dataset in Eq. (6) defined as


where need to sample the physically admissible support from in Eq. (32). Finally Eq. (39) is approximated as


This is the final formulation for the proposed UQ FFT model using an ANN . Once trained, provides the deterministic solutions for the macro elasticity tensor in Eq. (43). From the surrogate in Eq. (43), central moments, mean Eq. (14) and variance Eq. (15), can be calculated. In the following section, details of the implementation of Eq. (43) are provided.

3.2 Numerical implementation

3.2.1 Algorithm overview

To realize Eq. (43), Algorithm 0 must be implemented. Its task is to generate a training set Eq. (42) for training an ANN to learn FFT homogenization Eq. (35), which is then used for uncertainty quantification of a homogenized elasticity tensor in Eq. (43).

Remark 1: It has to be pointed out, that the ANN is a purely deterministic surrogate to the deterministic FFT solver from Eq. (34). After training, the weights of the ANN in Eq. (17) and Eq. (19) are fixed, thus producing deterministic outputs for given deterministic inputs.

The single steps, namely data generation in Section 3.2.2, ANN training in Section 3.2.3 and UQ in Section 3.2.4, are further described in detail.

  Algorithm 1: Data Generation for Deep Learning using FFT
     Input: in Eq. (44)
    Homogenization: in Eq. (47)
    Output: training dataset in Eq. (42)
  Algorithm 2: ANN Design and Training
    Input: in Eq. (42) and Eq. (16)
    Training: in Eq. (21)
    Output: in Eq. (40)
  Algorithm 3: UQ using PCE and ANN trained on FFT
    Input: in Eq. (40) and Eq. (32)
    Output: in Eq. (43)
Algorithm 0 Overall procedure for UQ with PCE and ANN

3.2.2 Data generation for deep learning using FFT

In step 1 of Algorithm 3.2.1, a dataset from Eq. (42) is created with Algorithm 1. The inputs to the dataset are realizations of an input random variable denoted by from Eq. (1), such that


where are microstructures as described in Eq. (23), are material parameters defined in Eq. (28), are macro strains from Eq. (29) and denotes the sample space of the training input variables. The total number of samples of the dataset is denoted by , consistent with Eq. (6).

Generally, the only influence on the outcome of the ANN in Eq. (40) is the training process including the data provided during training. The choice of inputs in Eq. (44) for the training dataset from Eq. (42) is therefore very important. Clustering around specific values of e.g. the fiber volume fraction or the material parameters in the dataset could lead to a bias towards these values. Having in mind the goal to establish a deterministic surrogate to replace the FFT solver from Eq. (34), the inputs from Eq. (44) to the dataset should fill their sample space from Eq. (44) uniformly. Therefore, the inputs from Eq. (44)

are drawn from uniform distributions

. Additionally, physically inadmissible values must be avoided. Therefore, the sample space from Eq. (44) needs to be restricted. In implementation practise, these values are drawn from uniform distributions, whereas the lower and upper bounds and , respectively, have to be chosen in accordance with physical restrictions. Therefore the general expression of is specified from Eq. (1) to Eq. (44) as


Here, denote the single material parameters from Eq. (28). The total number of material parameters is denoted by . The upper and lower bounds, and , respectively, are chosen according to the respective admissible range of the material parameter, e.g. for Poisson’s ratio, which is the range for typical engineering materials, including those considered in this work.

The microstructure from Eq. (44) is implemented as a three dimensional voxel array, where the entries are binary, such that is used for matrix and for inclusion material. This is mathematically described by the indicator function in Eq. (24). The microstructure depends on the fiber volume fraction as


where is specific to each microstructure for each in Eq. (44). In this work, two kinds of microstructures are considered, namely single long fiber cylindrical inclusions and multiple spherical inclusions. The latter are generated by the random sequential adsorption method, as described in Bargmann et al. (2018).

Furthermore, shear and bulk modulus, and as described in Eq. (28), respectively, are used as material parameters in the implementation. They are better suited for ANN, because their ranges are of the same magnitude compared to the combination Young’s modulus and Poisson’s ratio in Eq. (28), which leads to smoother gradient updates in the optimization defined in Eq. (22). Other material parameters are converted internally.

The output for the error measure in Eq. (21) is calculated by FFT from Eq. (34). In this work, is the effective macro stress in Eq. (29) corresponding to the macro strain in Eq. (29) and Eq. (44), such that


The effective elasticity tensor in Eq. (29) can be reconstructed using Voigt notation such that


e.g. for The different strain states for Eq. (48) need to be equally represented in the dataset Eq. (38). If one strain state would be underrepresented, the predictive capability for that specific state would be poor. Therefore, the total number of samples must be a multiple of 6, which implies or


where denotes the number of samples with strain state . In practise, a compromise between the number of samples and the training time has to be made. In the examples in Section 4, for Example 1, for Example 2 and for Example 3. The whole process of data generation can be carried out on a workstation with GPU acceleration or on a cluster, because every sample generation is pleasingly parallel.

Remark 2: An alternative approach to Eq. (48) would be to apply all six strain states directly. This however would lead to less flexibility of the obtained ANN. The proposed approach in Eq. (48) using single strain states allows to consider different stochastic properties for different strain states in Algorithm 4. This is the case, if e.g. material parameters are obtained from experimental data, which show different deviations for different loading directions.

   Input: in Eq. (44)