1 Introduction
Materials with an underlying meso or microstructure, e. g., composites, solid foams, dualphase steels or 3Dprinted structures, enable the targeted design of engineering components with respect to their application. However, due to effects such as anisotropy, nonlinear or multiphysics phenomena, the experimental characterization of the effective constitutive behavior of such materials can be very complex.
1.1 Multiscale schemes
To avoid such a characterization of the material’s effective behavior, computational multiscale schemes can be used. These schemes allow the simulation of engineering components made of materials with underlying microstructure solely based on information about the microstructural arrangement and the properties of the individual components, e. g., matrix and inclusions. Basically, two different types of multiscale schemes exist: the coupled multiscale scheme which is also known as FE approach (finite element square) [Feyel2000, Miehe2002b, Schroder2014, Schroder2016, Keip2017, Lange2021, Koyanagi2021, Yvonnet2019] and the decoupled or sequential multiscale scheme [Yamazaki2018, Terada2013, Kalina2020a, Yamamoto2022, Saito2021, Gebhart2022].
The FE method allows to completely couple the microscopic and macroscopic scales without the need for the explicit formulation of an effective constitutive model. It is thus universally applicable to arbitrary geometries if the necessary microscopic information are available. However, the decisive disadvantage is the very high computational effort, which results from the solution of the microscopic boundary value problem (BVP) for the homogenization at each integration point of the macroscopic FE mesh.
Within a decoupled multiscale scheme, the material’s effective behavior is initially determined from homogenization of representative volume elements (RVEs) and then a suitable constitutive model, the socalled macro or surrogate model, is calibrated by these data. With this model, macroscopic BVPs can now be solved, whereby the influence of the microstructure is implicitly captured by the macro model. Thus, in contrast to the FE method, the explicit solution of the microscopic BVP at each integration point is omitted. The central disadvantage, however, is that the formulation of such a model can be extremely complicated.
1.2 Databased methods in solid mechanics
To circumvent the time consuming task of formulating and calibrating a surrogate model within a decoupled multiscale scheme, databased or datadriven techniques are very promising and have the potential to improve or replace traditional constitutive models. These techniques have become increasingly popular in the computational mechanics community during the last years [Bock2019, Montans2019]. In the following, a brief overview of the most common methods and their application to multiscale schemes is given.
1.2.1 Overview on databased constitutive modeling
A relatively new strategy to substitute classical constitutive equations is the datadriven mechanics approach, initially proposed by Kirchdoerfer and Ortiz [Kirchdoerfer2016] and extended to, e. g., noisy data sets [Kirchdoerfer2017], finite strains [Nguyen2018], inelasticity [Eggersmann2019] or fracture mechanics [Carrara2020] in the meantime. This method completely avoids to use constitutive equations. Instead, sets of stressstrain tuples which characterize the material’s behavior are used. A datadriven solver hence seeks to minimize the distance between the searched solution and the material data set within a proper energy norm, while compatibility and equilibrium have to be satisfied simultaneously.
The construction of so called constitutive manifolds from collected data is an alternative approach which is described for elasticity and inelasticity by Ibañez et al. [Ibanez2018]. With this technique, it is possible to strictly fulfill the 2nd law of thermodynamics, i. e., the thermodynamic consistency, by using the GENERIC paradigm (General Equation for NonEquilibrium ReversibleIrreversible Coupling) during the construction of constitutive manifolds [Gonzalez2019].
Besides the previously mentioned techniques, there exist numerous databased methods originating from the field of machine learning (ML)
. Probably, the most common technique is the application of
artificial neural networks (ANNs), which have already been proposed in the early 90s by the pioneering work of Ghabussi et al. [Ghaboussi1991]. In the last decades, ANNs have been intensively used for mechanical material modeling and simulations by means of the finite element method (FEM), e. g., in [Hashash2004, Zopf2017, Stoffel2018, Tac2021, Kalina2021] among others.However, in general, a large amount of data is required to train ANNs to serve as robust and accurate surrogate models for systems with complex underlying physics, e. g., constitutive models. In this context, a comparatively new branch of ML techniques related to constitutive modeling are approaches classified as
physicsinformed, physicsconstrained [Frankel2020, Fuhg2022], mechanicsinformed [Asad2022] or hybrid models [Settgast2020, Malik2021].^{1}^{1}1In the following, the term physicsinformed does not directly refer to the PINNapproach (physicsinformed neural network) according to Raissi et al. [Raissi2019]. In a PINN, the searched solution field approximated by the ANN, e. g., displacement, is inserted into the governing partial differential equation (PDE) at collocation points. This expression is then added to the loss, so that the fulfillment of the PDE is enforced. In the context of constitutive modeling, however, the idea to enrich the ML approach with physical knowledge is applied in a similar way.
In these methods, essential physical principles and information are inserted into the MLbased model or parts of classical models are replaced with databased methods, which leads to an improvement of the extrapolation capability and enables training with sparse data. In the case of ANNs, this can be achieved via the network architecture or by adapted training algorithms. By choosing problemspecific invariant sets as the input variables, material symmetries are automatically satisfied for hyperelastic models [Linka2021, Kalina2021, Klein2021]. Furthermore, the thermodynamic consistency can be fulfilled by choosing the free energy as the output quantity. In order to train the ANN with respect to the stresses, gradients of the output with respect to the input are inserted into the loss [Fernandez2020a, Linka2021]. This technique is also named as Sobolev training in [Vlassis2020]. Furthermore, physical knowledge can be inserted via constraint training processes [Weber2021]. For the consideration of dissipative behavior, Masi et al. [Masi2021]proposed an adapted ANN architecture consisting of two feed forward neural networks (FNNs). Thereby, the first network is used to predict the evolution of internal variables and the second for approximating the free energy. Finally, the combination of classical models with databased techniques is a further method to insert physical knowledge. Thereby, only parts of a model are replaced by databased techniques, e. g., in plasticity models
[Settgast2020, Malik2021, Vlassis2021].1.2.2 Databased multiscale modeling and simulation
In the context of multiscale problems, the mentioned databased methods can be used as surrogate models which replace the computationally expensive simulation of RVEs [Liu2021]. The high flexibility of trained networks with simultaneously excellent prediction qualities is thereby proven in numerous works: In [Fernandez2020a] ANNs are used to describe the homogenized response of cubic lattice metamaterials exhibiting large deformations and buckling phenomena. Thereby, several types of hyperelastic MLbased models which fulfill basic physical requirements are used and compared to each other. Based on this, an extension to polyconvex hyperelastic models is given in [Klein2021]. In [Li2020a], FFNs and
RNNs (recurrent neural networks)
are used to replicate the homogenization of RVEs revealing inelastic behavior of the individual components. Thereby, also unknown paths can be predicted with the trained networks. Similarly, ANNs that replace the inelastic constitutive responses of composite materials sampled by RVE simulations are shown in [Fuchs2021]. Thereby, in addition, a deep reinforcement learning combinatorics game is used to automatically find an optimal set of network hyperparameters from a decision tree. A datadriven multiscale framework called
deep material network is shown in [Liu2019a, Liu2019, Gajek2020]. Thereby, the homogenized RVE response is reproduced by a network including a collection of connected mechanistic building blocks with analytical homogenization solutions. With that, a complex effective response could be described without the loss of essential physics. An extension of the deep material network approach to fully coupled thermomechanical multiscale simulations of composite materials is given in [Gajek2022].Furthermore, ANNs can be used to link microstructural characteristics and effective properties. For example, in [Fuhg2022a], an ML framework for predicting macroscopic yield as a function of crystallographic texture is described. An MLbased multiscale calibration of constitutive models representing the effective response of rate dependent composite materials is applied to brain white matter in [Field2021]. Based on a library of calibrated parameters corresponding to a set of microstructural characteristics, an ML model which predicts the constitutive model parameters directly from a new microstructure is trained. Similarly, an ANN is trained to predict the elastic properties of short fiber reinforced plastics in [Breuer2021].
Finally, the application of databased techniques as surrogate models in decoupled multiscale schemes, which enable the simulation of macroscopic engineering components, is promising. In the pioneering work [Le2015], a decoupled multiscale scheme using an ANNbased surrogate model has been shown for elastic composites with cubic microstructures. Another multiscale methodology is presented in [Fuhg2021]. Therein, a hybrid macroscopic surrogate model is used, i. e., a traditional constitutive model is combined with a datadriven correction. An MLbased multiscale framework for the simulation of the elastic response of metals having a polycrystalline microstructure is presented in [Vlassis2020]. Therein, the database is generated by using a 3D
FFT (fast Fourier transform)
solver. Based on this, [Vlassis2021b, Vlassis2021] show an extension to elastoplasticity, whereby ANNs are used for the description of the yield surface and the stress within a hybrid modeling approach. Similarly, the simulation of the elasticplastic deformation behavior of opencell foam structures is shown in [Settgast2020, Malik2021]. Thereby, a hybrid model including two ANNs is used as the macroscopic surrogate model. Therein, the first network serves for the description of the macroscale yield function and the second one for the prediction of the floating direction. In [Gajek2022], two scale simulations of thermomechanical problems are solved by using deep material networks. Applying the modeling strategy initially proposed in [Masi2021], a multiscale scheme for the inelastic behavior of lattice material structures is described in [Masi2021a]. An application of ANNs as surrogate models describing the anisotropic electrical response of graphene/polymer nanocomposites is shown in [Lu2019]. Furthermore, in [Wu2020], the application of recurrent neural networks (RNNs) as surrogate models within multiscale simulations of elastoplastic problems is shown. Furthermore, the RNNbased approach is compared to full FE simulations. An application of RNN surrogate models to viscoplasticity is described in [Ghavamian2019]. A combination of fully coupled FE simulations with an adaptive switching to ANNbased surrogate models for the complex simulation of RVEs is shown in [Fritzen2019].In addition, a multiscale framework based on the datadriven mechanics approach [Kirchdoerfer2016] is presented in [Karapiperis2021] for the application to sand. Thereby, the necessary data set is extracted from lower scale simulations. In [Korzeniowski2021], a multilevel method is used within a datadriven multiscale scheme which is applied for the simulation of solid foam materials.
1.3 Content
Within this contribution, an efficient datadriven multiscale approach called FE, which makes use of physicsconstrained ANNs as a macroscopic surrogate model, is presented. The approach allows to consider materials with complex microstructures leading to an overall anisotropic behavior, whereby a restriction to hyperelastic solids is made for now. The ANNbased surrogate model automatically fulfills several physical principles, e. g., objectivity, material symmetry, compatibility with the balance of angular momentum and the thermodynamic consistency. This is done by using a set of problem specific invariants as the input of the network and the Helmholtz free energy density as the output, cf. Linka et al. [Linka2021], Klein et al. [Klein2021] or Linden et al. [Linden2021], among others. The data which are used to train the ANN are collected via computational homogenization of an RVE representing the material’s microstructure. Thereby, in contrast to most of the datadriven multiscale approaches from the literature, it is not necessary to explore all required macroscopic states of deformation which occur in the macroscopic simulation in advance. Instead, the required data, i. e., effective deformations and corresponding stresses, are determined by homogenization in a fully autonomous way within the framework by collecting the relevant deformations from the macroscopic FE simulation. This procedure is similar to the approach presented by Korzeniowski and Weinberg [Korzeniowski2021] which is based on the datadriven mechanics approach [Kirchdoerfer2016]. Here, in addition, the macroscopic deformations are mapped into an invariant space associated to the symmetry group of the considered material. In this space, the selection of relevant states is done so that the number of timeconsuming microscale simulations can be reduced to a minimum. Moreover, it is not necessary to perform a rough scan of the relevant deformation area in advance here. The proposed framework is exemplarily applied to several descriptive examples, where a fiber reinforced composite with a highly nonlinear behavior of the individual components is considered.
The organization of the paper is as follows: In Sect. 2, the basic equations of the finite strain continuum solid mechanics theory as well as basic principles of hyperelastic models are given. After this, the proposed datadriven multiscale framework is described in Sect. 3. This approach is exemplarily applied within several numerical examples in Sect. 4. After a discussion of the results, the paper is closed by concluding remarks and an outlook to necessary future work in Sect. 5.
2 Continuum solid mechanics
In this section, the basic kinematic and stress quantities as well as general relations of anisotropic hyperelastic constitutive models are summarized shortly. The reader is referred to the textbooks of Haupt [Haupt2000], Holzapfel [Holzapfel2000] or Ogden [Ogden1997]
for a detailed overview. Furthermore, a Hilltype homogenization framework is introduced. For a clear mathematical notation, the space of tensors
(1) 
except for a tensor of rank zero, is used in the following. In Eq. (1), , and
denote the Euclidean vector space, the set of natural numbers and the dyadic product, respectively. Tensors of rank one and two are given by boldface italic symbols in the following, i. e.,
or . Furthermore, a single or double contraction of two tensors is given by or , respectively. Thereby, denotes a Cartesian basis vector and the Einstein summation convention is used.2.1 Kinematics and stress measures
2.1.1 Kinematics
In the following, a material body which occupies the reference configuration at time and the current configuration at time is considered. The displacement vector of a material point capturing the positions at and at is given by . Therein, denotes a bijective motion function which is postulated to be continuous in space and time. As further kinematic quantities, the deformation gradient and the Jacobian determinant are defined by the relations
(2) 
In the equation above, is the nablaoperator with respect to reference configuration .
A deformation measure which is free of rigid body motions is given by the positive definite right CauchyGreen deformation tensor with the space of symmetric second order tensors . With that and by using Sylvesters formula, the spectral decomposition of follows to
(3) 
where , and denote the identity tensor, the principal stretches and the projection tensors, respectively. The introduced symbol indicates the algebraic multiplicity of . Furthermore, is defined as the Kronecker delta.
2.1.2 Stress measures
Within the framework of nonlinear continuum solid mechanics, various stress measures can be defined. The symmetric Cauchy stress , also known as true stress, is defined with respect to the current configuration . Furthermore, the 1st and 2nd PiolaKirchhoff stress tensors and follow from the pull back operations
(4) 
Accordingly, is related to both, and , whereas is completely defined with respect to .
2.2 Hyperelasticity
2.2.1 General properties
The constitutive behavior of the considered solids is restricted to hyperelasticity within this work. Accordingly, a hyperelastic potential which is equal to the Helmholtz free energy density function exists. By applying the procedure of Coleman and Noll [Coleman1963], the relations
(5) 
then follow from the evaluation of the ClausiusPlanck inequality [Holzapfel2000]. With that, the thermodynamic consistency of any hyperelastic model is fulfilled a priori.
In addition, there are some further requirements on , which ensure a physically reasonable constitutive behavior [Holzapfel2000, Ogden1997]. The normalization condition requires that , i. e., that the free energy vanishes in the undeformed state. Furthermore, the free energy should increase in any case if deformation appears: . The former two conditions imply that has a global minimum at . Consequently, the undeformed state is stressfree, i. e., holds. Additionally, the growth condition requires that an infinite amount of energy is needed to infinitely expand the volume or compress it to zero: as . Finally, the principle of material objectivity states that the free energy is invariant with respect to a superimposed rigid body motion. This statement is expressed by the relation which holds for all special orthogonal tensors . Accordingly, the principle of material objectivity is automatically fulfilled if the tensor is used as the argument of instead of , i. e., , which is done in the following.
Finally, polyconvexity of the energy , i. e., convexity with respect to , and , is a further condition. This condition implies material stability, i. e., LegendreHadamardEllipticity, see Ebbing [Ebbing2010] for more details. However, polyconvexity is a quite strong requirement on the free energy.
2.2.2 Material symmetry
Besides the previously mentioned requirements, the constitutive equations should also reflect the symmetry of the described material which is expressed by the principle of material symmetry [Haupt2000, Ebbing2010]. According to that, the free energy have to be invariant with respect to all orthogonal transformations belonging to the symmetry group of the underlying material: for all with .
In order to describe anisotropic constitutive behavior, the concept of structural tensors can be used [Haupt2000, Ebbing2010]. Depending on the considered material, these tensors are of order two , four , or even higher. They reflect the material’s anisotropy and are thus invariant with respect to the symmetry transformations, e. g.,
(6) 
The notation means , where the Einstein summation convention is used. If the structural tensors are appended to the list of arguments of , the energy is an isotropic tensor function even if the material is anisotropic which means that
(7) 
holds for all . To abbreviate the notation, the sets and have been used in Eq. (7).
Finally, a set consisting of irreducible scalar valued invariants could be derived by using the CayleyHamilton theorem.^{2}^{2}2
The CayleyHamilton theorem states that a second order tensor fulfills his own eigenvalue equation, e. g.,
, where denote the principal invariants of , cf. Eq. (9). Consequently, any power with as well as the inverse are expressible in terms of , and [Haupt2000]. Consequently, the free energy is expressed by the isotropic tensor function which is invariant with respect to all transformations. By using the latter definition and applying the chain rule, the 2nd PiolaKirchhoff stress
follows to(8) 
where denotes tensor generators corresponding to the invariants [Kalina2021].
2.2.3 Special anisotropy classes
Within this work, two anisotropy classes are considered, isotropic as well as transversely isotropic materials. Corresponding sets of irreducible invariants are given in the following.
A set of irreducible invariants describing an isotropic hyperelastic solid is given by . The used principal invariants are given by
(9) 
whereby the cofactor of is defined by .^{3}^{3}3Note that the more common expression is equivalent to . Both expressions can be transformed into each other by using the CayleyHamilton theorem. Note that is also expressible by the principal stretches which follows from Eq. (3):
(10) 
According to [Haupt2000, Holzapfel2000, Ebbing2010], the structural tensor describing transverse isotropy is given by the second order tensor , whereby with is the fiber direction in the undeformed state. A set of irreducible invariants is thus . Therein, the latter two invariants, which capture the material’s anisotropy, are given by the expressions
(11) 
2.3 Scale transition scheme
In the following, a distinction between two different scales, the micro and the macroscale is made. The former is characterized by a heterogeneous structure consisting of matrix and inhomogeneities of characteristic length , whereas the second considers a macroscopic body of characteristic length and is assumed to be homogeneous. For the lengths introduced, the relation known as scale separation must hold [Schroder2014]. To label macroscopic quantities, they are marked by in the following.
In order to connect microscopic and macroscopic tensor quantities, an appropriate homogenization scheme is needed. Consequently, each macroscopic point gets assigned properties resulting from the behavior of the microstructure. For this purpose, a representative volume element (RVE) of the material is considered on the microscale in the vicinity of , cf. Fig. 1. An effective macroscopic quantity is then identified from the microscopic field distribution within the RVE by the volume average
(12) 
Using Eq. (12), the macroscopic deformation gradient and the 1st PiolaKirchhoff stress are defined by and , respectively [Schroder2014, Schroder2016, Kalina2020a]. Appropriate boundary conditions for the microscopic BVP, which has to be solved before the volume averaging can be performed, are deducible from the equivalence of the macroscopic and the averaged microscopic energies which is also known as the HillMandel condition. For the considered finite strain setting it is given by the following relation [Schroder2014, Schroder2016, Kalina2020a]:
(13) 
where denotes the material time derivative. Regarding purely hyperelastic behavior, the HillMandel condition expresses the equality of the rates of the macroscopic and the averaged microscopic Helmholtz free energies: . Consequently, it holds .
To fulfill Eq. (13), several type of boundary conditions (BCs) can be used, whereby periodic BCs given by the spaces
(14)  
(15) 
are applied within this work [Kalina2020a]. In the equation above, is the nominal stress vector. Furthermore, and denote values on opposing boundaries of the RVE which is supposed to be periodic. The tilde marks the fluctuation part of a microscopic tensor quantity.
3 ANNbased multiscale approach with autonomous data generation
Based on the summarized continuum theory, the following section introduces the proposed datadriven multiscale scheme FE. The general procedure of this framework is basically subdivided into five steps referred as

initial data generation,

training process of the ANN,

macroscopic simulation,

data analysis, as well as

data enrichment.
After the framework is initiated with step (a), the steps (b)–(e) are executed within an overall loop which is given in Algorithm 1. Accordingly, the ANN is trained with the current data set of the iteration , the macroscale problem is solved, unknown macroscopic states of deformation are detected, and the macroscopic stresses corresponding to the previously unknown deformations are generated via computational homogenization. The loop terminates in an iteration , after no further deformation states are found and the relevant space of deformation is thus completely sampled by . In order to prevent that the macroscopic simulation does not reach the final time step – which could be occur due to a failed convergence of the ANN – but no new deformations are detected within , a further inner repeat loop is implemented. Therein, the steps (b)–(d) are performed multiple times, until the final time step is reached or new states of deformation are found. A schematic representation of the framework is given in Fig. 2. The single steps are described in detail in the following.
In order to enable a fully automatized utilization of the framework, a Python wrapper which runs on a high performance cluster (HPC) using the BatchSystem SLURM (Simple Linux Utility for Resource Management) has been implemented. This wrapper processes the loop independently by starting the individual jobs and managing the results.
3.1 Initial data generation (a)
To start with, an initial data set consisting of data tuples is generated. Basically, a low number of tuples is sufficient to initiate the multiscale scheme. Suitable states of deformation are simple load cases as, e. g., uniaxial tension and equibiaxial tension
(16) 
or simple shear
(17) 
where these load cases could be applied on the RVE in different directions. In the equations above, and denote prescribed effective stretches and shears, respectively. The labeling with means that the corresponding coordinate of the effective 1st PiolaKirchhoff stress is prescribed to zero. The stresses belonging to the deformations are calculated from a computational homogenization according to Subsect. 2.3, whereby this is done by using an in house Matlab FE code. The periodic BCs are applied via the master node concept therein [Haasemann2006].
3.2 ANNbased macroscopic surrogate model
Within the datadriven multiscale loop, an appropriate macroscopic surrogate model is needed. To this end, a physicsconstrained ANN model, which a priori fulfills several physical conditions, is used, compare Subsect. 2.2.
Assuming that the macroscopic material symmetry group of the considered composite is known, and thus the structural tensors , corresponding to the material are available, the invariants are determined first. These scalar values are mapped into a normalized domain, i. e., with respect to the training data set, and are arranged in a vector . Thereafter, the free energy is predicted by an FNN with the normalized invariants serving as input values and as output. Restricting the network architecture to only one hidden layer containing neurons, it holds
(18) 
where the monotonously increasing and convex Softplusactivation function [Klein2021]
(19) 
is used.^{4}^{4}4Note that a nonbounded activation function is necessary to fulfill the growth condition, i. e., as . Besides the Softplus activation function, further choices are possible, e. g., ELU or ReLu. However, in contrast to the Softplus function, they are not twice continuously differentiable. The stress prediction is finally done by applying Eq. (8):
(20) 
Thus, the surrogate model automatically fulfills several physical principles: thermodynamic consistency, objectivity and material symmetry. A graphical summary of the described surrogate model structure is given in Fig. 3.
In order to fulfill additional physical conditions, it may be necessary to incorporate further nonindependent invariants into the argument list of . In this case, Eqs. (18) and (20) are to be modified according to
(21) 
and
(22) 
where , is a set containing the indices of the additional invariants . Accordingly, the growth condition can be additionally guaranteed for by including as a further invariant and not only itself, where is defined to be independent, i. e., . Thus, in this case. However, in addition, further constraints have to be satisfied by the weights to ensure that the growth condition is fulfilled. As shown in Appendix A, a possible sufficient condition is given by
(23) 
Therein, the set contains the indices of the hidden layer neurons. In the work [Klein2021], the growth condition is fulfilled in a similar way by adding an additional energy term which is not directly included in .
3.2.1 Training process of the ANN (b)
The ANNbased model is trained with respect to the current data set in each iteration of the multiscale loop, where a random division into training and test data is made.
Within the training process, the weights and bias values , and are then determined. In order to enable an adjustment of the ANN to the stress values, gradients of the output with respect to the input are inserted into the loss
(24) 
which is similar to [Fernandez2020a, Linka2021, Vlassis2020]. The training is done by using the SLSQP optimizer (Sequential Least Squares Programming). Thereby, the ANN is trained several times, where the parameters of the best achieved training state are stored at the end [Kalina2021].^{5}^{5}5
Due to local minima within the loss function, the optimization procedure which is applied here depends on the starting values of the weights and biases. Thus, the network is trained several times to overcome this.
An implementation of the described workflow is realized usingPython, Tensorflow
and SciPy. Within the training the constraint (23) can be switched on optionally.Finally, in order to fulfill the normalization condition, the bias value is chosen such that within the initial (undeformed) state.
3.2.2 Implementation and macroscopic simulation (c)
The model equation (18) has been implemented within the FE toolbox FEniCS [Alnaes2015, Logg2012]. Therein, the stress relation given by Eq. (20) and the material tangent
(25) 
which is required within the solution via a standard NewtonRaphson scheme, are calculated by means of automatic differentiation.
In addition to weights and bias values of the trained ANN, problem specific structural tensors , have to be prescribed within the macroscopic simulation, cf. Subsect. 2.2.2. Note that the preferred directions on the macroscale does not necessarily have to match the ones of the RVE. A conversion which is necessary in this case is done in step (e), cf. Subsect. 3.3.2.
3.3 Autonomous data mining
Besides the physicsconstrained surrogate model, the core feature of the datadriven multiscale framework is the data mining process which is done in a fully automatic manner.
3.3.1 Data analysis (d)
Within each iteration of the overall multiscale loop, the local deformation states of the macroscopic body are collected. To this end, the deformation gradient is stored at each quadrature point of the FE mesh and at each time increment . Consequently, the body’s full state of deformation is characterized by the set which is a subset of .
Now, previously unknown deformations have to be detected, whereby it is meaningful to only consider states providing additional information for the material under observation. Since the intrinsic constitutive behavior of the material lives in the space of invariants , it is useful to perform a transformation into this space at this point [Kalina2021]:
(26) 
whereby , and , are used, respectively. Thus, by making use of Eq. (26), the current data set is compared to within the invariant space. If a state which is contained in is unique within a given tolerance , it is needed to enrich the data set for the next multiscale iteration step . Thereby, the set is searched in an reverse manner, i. e., starting from the last time step . If a unique state is identified in the step , the full time series with is added to .^{6}^{6}6Note that it is useful to save the time series of a new deformation state, i. e., all states proceeding at a fixed quadrature point. This facilitates the application of the macroscopic deformation within the computational homogenization later on. Furthermore, an extension to path dependent materials requires this mandatory. Thus, it is possible that deformation states in the set are multiple with respect to itself and/or . To avoid that this unnecessarily inflates the training process, a further filtering step is performed in the data enrichment step (e).
By using the described technique for the identification of relevant macroscopic deformation states, it is possible to reduce the number of time consuming microscale simulations following in the next step to a minimum.
3.3.2 Data enrichment (e)
After has been determined, it is necessary to identify the stresses belonging to the states . This is done by applying these states within the RVE simulations and calculating the stresses by volume averaging. In order to use only one RVE and anyhow allow different microstructure orientations in the macroscopic body, a transformation of is necessary before. To this end, the relation with , which holds for arbitrary material symmetry groups, is used. Thus, the collected state of deformation from the macroscopic sample is processed by the Euclidean transformation
(27) 
The tuples consisting of applied deformations and corresponding stresses are then collected in the set .
As already mentioned above, it is possible that data tuples in the set are multiple with respect to itself and/or . Thus, a further filtering step is performed, where multiple tuples are sorted out with respect to and within a given tolerance . To this end, a transformation to the invariant space is used again, see Subsect. 3.3.1. Finally, the enriched data set for the next iteration of the multiscale loop follows from , where only contains relevant and unique tuples of the added deformations and corresponding stresses.
4 Examples
In order to demonstrate the ability of the developed datadriven multiscale approach FE described in Sect. 3, it is applied to the solution of three numerical examples within this section. Specifically, three macroscopic structures – a cuboid with a circular hole, a torsional sample and the Cook membrane – are considered. All of them consist of a fiber reinforced composite revealing a highly nonlinear behavior of the individual phases.
4.1 Microscopic properties of the composite
4.1.1 Constitutive behavior
Matrix  

Fibers 
The constitutive behavior of the composite’s individual components, i. e., fibers and matrix, is described by a hyperelastic Ogden model [Ogden1997]. It is given by the free energy density function
(28) 
where, and are model parameters. The symbol denotes the isochoric principal stretches following from the Flory split [Flory1961]:
(29) 
i. e., the multiplicative decomposition of into volumetric and isochoric parts. Note that the introduced material parameters are related to initial shear modulus and Poisson’s ratio via the relations
(30) 
respectively. Furthermore, in order to guarantee a physically meaningful behavior, the parameters and have to be restricted by the following constraints: and [Ogden1997]. The stress of the Ogden model is determined by using Eq. (5) and follows to
(31) 
The calculation of the projection tensors related to the right CauchyGreen deformation tensor is given in Eq. (3).
Within the following numerical examples, the material parameters given in Tab. 1 are chosen. With this choice, a highly nonlinear behavior of the matrix phase is achieved, cf. [Kalina2021]. The parameters of the fibers are chosen in such a way, that a neoHookean model results. Note that the selected parameters are not related to a real material.
4.1.2 Microstructure and RVE definition
Besides the constitutive behavior of the individual components, the geometric arrangement of the individual phases has to be defined. In order to choose a realistic microstructure, it is characterized by a monodisperse stochastic arrangement of fibers with volume fraction here.^{7}^{7}7It should be noted, that it is not sufficient to represent the microstructure of a fiber reinforced composite with an overall transversely isotropic behavior by a hexagonal unit cell if finite strains are considered. This is due to the loss of the material symmetry which results from the deformation of the RVE cell, cf. Appendix B. Furthermore, a minimum distance of with respect to the fiber radius is chosen.
According to the homogenization concept, a suitable RVE which is sufficiently large to capture the essential statistic properties of the microstructure is needed. Herein, this condition is checked by applying the test proposed by Gittman et al. [Gitman2007]. Thereby, randomly generated RVEs with fixed number of inclusions and volume fraction are generated. The effective response of these RVEs is then determined by a computational homogenization, whereby a specific load case is chosen depending on the property of interest, e. g., this could be a stress or a stiffness component. Based on these results, a statistical analysis is done by evaluating the scatter of by means of the quantity
(32) 
If the accuracy of is sufficient, i. e., , the tested sample size is the final RVE size. Otherwise, the sample size have to be increased and the analysis will be repeated.
The described test has been applied to the fiber reinforced material. Thereby, a tolerance of and a number of RVEs for each RVEclass with inclusions have been chosen. As representative load cases, uniaxial tensions according to Eq. (16) are applied into the ,  and the direction, i. e., perpendicular and parallel to the fiber orientation. Thereby, the stress into the tension direction is evaluated according to Eq. (32). The prescribed tolerance is finally reached for the RVEclass with fibers. The determined stressstretch curves and the chosen RVE are depicted in Fig. 4 for the final RVE. Accordingly, a highly nonlinear and anisotropic effective response occurs.
Generation and meshing of the periodic cells have been done by using the python tool gmshModel^{8}^{8}8The python tool gmshModel is freely accessible under https://gmshmodel.readthedocs.io/en/latest/., whereby the placement of the fibers is realized via the random sequential adsorption (RSA) algorithm [Torquato2002]. The RVEs were meshed by tetrahedron elements with 10 nodes. A total of 94,456 elements is reached for the final RVE with 100 fibers.
4.2 Application of the datadriven multiscale framework for the simulation of macroscopic samples
After the definition of the composite’s microscopic properties, the datadriven framework FE is applied for the simulation of several multiscale problems. The macroscopic geometries and applied BCs are depicted in Fig. 5.
In all examples, the following specifications and meta parameters have been used: The effective transversely isotropic behavior, which results from the RVE homogenization, is described by using the set consisting of six invariants, cf. Subsect. 2.2.3. As discussed in Subsect. 3.2, the nonindependent invariant is added to allow the growth condition to be satisfied by construction of the network architecture. As already mentioned, this requires additional constraints for the weights, e. g., Eq. (23). However, training under these constraints results in noticeably worse stress predictions. Thus, in order to achieve maximum prediction quality, the constraints were not considered within the training here. The adapted architecture is nevertheless used to allow a better comparability to the same network architecture trained with the constraint, see the study in Appendix A. The ANNs used as the surrogate model consist of only one hidden layer with 15 neurons which has been shown to be sufficiently accurate, where the networks are trained 25 times in each macroscopic iteration. For comparison see the study given in [Kalina2021].
As shown in Fig. 4(b), the fiber orientation of the RVE is given by , where is the Cartesian base vector. In order to transform deformation states for the general case by using Eq. (27), Rodrigues’ rotation formula
(33) 
is applied to determine . Therein, is the unit normal vector and the angle between the fiber directions.
Finally, the tolerance for the detection of unique deformation states within the invariant space is set to . The tolerance for the filtering step is set to , cf. Subsects. 3.3.1 and 3.3.2.
4.2.1 Cuboid under tension
As a first example, a cuboid with a circular hole is loaded by tension. To this end, a displacement of is prescribed at the top surface, where is the initial length in direction. The fiber orientation is chosen to . The cuboid’s geometric dimensions are specified by . The hole in the center has a radius of .
Multiscale iterations
The single iterations within the loop are described for the multiscale simulation of the cuboid in the following.
Load case  directions  

Uniaxial tension  ,,  
Equibiaxial tension  ,,  
Uniaxial compression  ,,  
Equibiaxial compression  ,,  
Simple shear  ,,,  
,, 
To start the algorithm, the initial data have to be generated first, cf. Subsect. 3.1 step (a). Here, a total number of 18 load cases – six uniaxial tension and compression, six equibiaxial tension and compression as well as six simple shear states – are prescribed to the fiber reinforced RVE. Different loading directions are considered to collect knowledge about the composite’s overall anisotropy. The applied stretches and shears as well as the directions are given in Tab. 2. In order to avoid that data providing the same physical information are contained multiple times, a filtering process is applied in the invariant space .^{9}^{9}9Due to the material symmetry (transverse isotropy), several loadings are equivalent from the point of the material, e. g., uniaxial tension in  or direction. Note that the fiber direction of the RVE is . Within , the collected states cover only a sparse region which is pervaded by a few paths consisting of 193 tuples . This is shown in Fig. 6, where the set is exemplarily visualized in the sectional planes  and . However, due to the physical knowledge which is incorporated into the ANNbased macroscopic surrogate model, this sparse data set is sufficient to initiate the multiscale scheme.
Now, the algorithm enters the multiscale loop with iteration 1 and the initial data set is labeled as