1 Introduction
Heterogeneous materials are encountered to an increasing extent in simulationbased engineering applications involving structural analysis of solids. Applications include, but are not limited to, advanced modular armor protection [qiao2008impact], design of new materials with specialpurpose optimized properties [ma2017towards], and bone fracture risk assessment [hoffler2000heterogeneity]. When the scale of the heterogeneity is such that multiple disparate spatial scales require resolution in order to achieve a reliable prediction, multiscale homogenization can be a viable alternative to prohibitively expensive direct numerical simulation involving a massive monolithic discretization [mei2010homogenization].
In principle, the method of multiscale homogenization assumes that the material configuration is homogeneous at the macroscale level, but heterogeneous at the level of the smallest represented scale. The evaluation of a conventional constitutive response function at each macroscale material point is replaced by the solution of a discrete Boundary Value Problem (BVP) whose computational domain is a socalled microscale Representative Volume Element (RVE) characterizing the microstructure. The complete BVP is formulated with prescribed boundary deformations corresponding to the deformation gradient at the macroscale material point. The sequence of scales is coupled by formalized procedures of localization and homogenization to transmit information between differing scales [smit1998prediction, miehe1999computational, feyel2000fe2, kouznetsova2001approach]. In this work, a particular variant of the method of multiscale homogenization known as “Finite Element squared” or FE is considered as a starting point. As the name implies, in this variant the Finite Element (FE) method is used to discretize both the macroscopic computational domain and the microscale RVE. The repeated solution of the FE BVP at each material point is an acknowledged computational bottleneck, motivating the development of alternative surrogate models to accelerate this task. Projectionbased Model Order Reduction (PMOR) equipped with a secondtier hyperreduction approximation is one established technique for reducing the computational complexity of nonlinear FE models that has recently been demonstrated to be an effective candidate surrogate model in the context of multiscale homogenization [zahr2017multilevel]
. Alternative surrogate model candidates include regressionbased models constructed using various methods including Kriging and artificial neural networks. Such methods can potentially lead to more computationally efficient surrogate model evaluation compared to PMOR; however a large amount of data may be required to train the model when the constitutive response is complex. To accelerate the acquisition of training data, PMOR again presents a useful computational technology. In this work, a multifidelity framework leveraging low, medium and highfidelity models is proposed to achieve a computationally tractable endtoend procedure for nonlinear multiscale simulations. The proposed framework features a novel insitu construction and utilization of an adaptive PMOR to eliminate computation bottlenecks. The usefulness of this novelty is not restricted to the current application of interest (i.e. multiscale modeling); benefits are anticipated in other related processes such as multifidelity optimization. It is therefore presented in what follows as a generalpurpose methodology.
PMOR is typically embedded in an offlineonline computational framework. In the phase of nontimecritical, offline training, information data from a High Dimensional Model (HDM) are collected and used to construct a ReducedOrder Basis (ROB) – for example, using Proper Orthogonal Decomposition (POD) and the method of snapshots [sirovich1987turbulence] – and the corresponding Projectionbased Reduced Order Model (PROM). Then, in the phase of timecritical, online evaluation, an approximate solution is computed using the PROM as a surrogate model for the underlying HDM. This offlineonline decomposition of the computational effort is appropriate in realtime applications that are performed for the exploration of a space of interest associated with some usually but not necessarily physical model parameters, such as in design optimization [legresley2000airfoil, manzoni2012shape], uncertainty quantification [farhat2018stochastic, farhat2018modeling], and repeated analysis [hetmaniuk2012review, amsallem2010towards]. In these cases where the space of one or more physical macroscopic parameters (such as those defining the geometry, boundary/initial conditions, or material properties) is explored, the offline cost is amortized by the sizable number of required analyses associated with the number of parameter points necessary to adequately perform this exploration. Consequently, the overall speedup can still be significant even when the potentially substantial offline cost is accounted for. Here, it is emphasized that in the context of multiscale homogenization, PMOR is computationally feasible even in the nonparametric setting due to the large number of microscale RVE simulations that are required at a single macroscopic parameter point. Hence, a desirable speedup of the multiscale computations can be still achieved by building PROMs for all smaller scales. Furthermore, a nonparametric application calls for an insitu alternative to the conventional offlineonline strategy, where the construction and utilization of the hyperreduced PROM (HPROM) can be performed during the multiscale simulation without requiring any preliminary computational work, using solution snapshots collected during already performed microscale RVE computations.
Several existing PMOR methods have been previously developed and applied to the solution of nonlinear multiscale problems. In [yvonnet2007reduced], a PROM of the microscale problem was constructed based on an á priori training strategy that involved computing solution snapshots using predefined microstrain histories. This approach was subsequently extended in [monteiro2008computational] to formulations with periodic boundary conditions for the microscale model. However, in each of these works the PROMs were not equipped with a hyperreduction method and consequently the resulting speedup factor was limited to about an order of magnitude, even without accounting for the offline costs. More recently, the coauthors of [zahr2017multilevel] introduced a nonlinear PMOR framework incorporating the Energy Conserving Sampling and Weighting (ECSW) hyperreduction method [farhat2014dimensional, farhat2015structure] which significantly reduced the computational cost of multiscale problems characterized by large deformations and material nonlinearity, including when the offline costs were taking into account. Specifically, the computational framework proposed in [zahr2017multilevel] included a twostep training strategy where: a microscale HPROM is first constructed insitu in order to achieve significant speedups even in nonparametric settings; next, a conventional offlineonline training approach is performed to build a parametric macroscale HPROM. A notable feature of this approach is the reduction in the cost of the training by using the insitu microscale HPROM to accelerate macroscale snapshot acquisition. However, a weakness of the proposed framework is its vulnerability to extrapolation if the underlying approximation subspace is not sufficiently explored during the insitu training phase.
The alternative insitu training strategy proposed in this paper avoids extrapolation by incorporating an adaptive HPROM construction based on the concept of a database of local ReducedOrder Bases (ROBs) constructed and updated onthefly. This process treats the components of the deformation gradient as a parameter vector, which facilitates the locating of each microscale BVP within the database in order to assign to it online the mostappropriate local ROB. Accuracy is optimized by collecting new snapshots and other additional information as needed and updating accordingly the local ROBs. This usage of local ROBs enables the efficient solution of difficult problems for which PMOR is not effectively performed by constructing a single global ROB. Note that locality in this paper refers to the parameter space characterized by the deformation gradient, in sharp contrast with previous works where locality was considered with respect to the solution manifold
[amsallem2012nonlinear].The remainder of this paper is organized as follows. Section 2 provides a brief summary of the FE multiscale homogenization method. Section 3 introduces the microscale HPROM and its training strategy. This section also highlights the necessity for ROB adaptation to avoid extrapolation. Section 4 proposes a novel, online, insitu ROB adaptation approach suitable for nonlinear, multifidelity, multiscale, computational homogenization. Section 5 provides an assessment of the overall proposed adaptive PMOR framework using several representative applications, and demonstrates its ability to deliver speedups while compromising neither stability nor accuracy.
2 The FE computational framework
In this section a brief summary of the FE method is provided, focusing on the aspects that are relevant to subsequent developments including notation. For a comprehensive formulation of multiscale homogenization, the interested reader is referred to [zahr2017multilevel] and the references cited therein. The fundamental generalization underlying FE is that the evaluation of stress as a function of strain for a heterogeneous material is governed by the solution of a locally attached boundary value problem characterizing its microstructure. This concept can be applied recursively to materials that exhibit separated scales; in what follows denotes the sequence of scales (or levels), with and designating the macroscale and the finer meso/microscales, respectively. Specifically, at the coarsest scales () each stressstrain relation is associated with a locally attached microstructure, while at the finest scale () the stressstrain relation is defined by a conventional constitutive law.
2.1 Semidiscrete governing equations
In this work, a distinction is made between constrained and unconstrained Degrees Of Freedom (DOFs), the former being those whose values are specified by essential boundary conditions, while the latter are subject to no such restriction. Following the notation of
[zahr2017multilevel], is defined here as a vector over the set of constrained DOFs at level , as the corresponding vector over the set of unconstrained DOFs at level , and as the concatenation of these two vectors. A similar notation is adopted for matrices. It follows that a subscript attached to any vector or matrix designates a macroscopic quantity. Using this notation and assuming that , the semidiscrete form of the equations of motion governing dynamic equilibrium can be written as(1)  
for , where is a FE mass matrix, is a FE vector of internal forces, and are FE vectors of displacements and accelerations, respectively, and is a vector of history variables associated with certain constitutive laws such as elastoplasticity when introduced at the level . For levels , consists of the set of history variables at all material points of the attached microstructure. The governing equations (1) are coupled by the localization and homogenization conditions presented in the next section.
2.2 Scale transmission conditions
Kinematic information is transmitted from coarser to finer scales via prescribed nonhomogeneous displacement boundary conditions applied to the exterior boundary of the attached microstructure. These prescribed values are specified such that the volumetric average of the deformation gradient at the finer scale is identical to the deformation gradient at the material point of the coarser scale to which the microstructure is attached. This is referred to as the localization condition and can be expressed as follows
(2) 
where is the coarsescale deformation gradient, is the identity matrix, and is the matrix whose three columns contain the prescribed ,  and components of displacement. In other words, the displacement vector over the set of constrained DOFs at level , , is obtained – up to a permutation – by stacking the columns of . Similarly, is the matrix whose three columns contain the ,  and coordinates in the reference configuration of the nodes on the exterior boundary of the attached microstructure whose DOFs are constrained. In the particular case of the socalled uniform essential boundary conditions, the values of all DOFs located on the exterior boundary of the attached microstructure are prescribed. Alternatively, in the case of periodic boundary conditions, only a minimal set of corner points have prescribed displacements while the remaining exterior boundary DOFs are constrained using linear multipoint constraint equations to enforce the periodicity of the fluctuations between each pair of opposite faces.
After computing the response of an attached microstructure to the prescribed boundary displacements, information is transmitted back from finer to coarser scales in the form of a homogenized stress tensor defined as the volumetric average of the stress in the attached microstructure. This quantity can be conveniently obtained from the reaction forces associated with the constrained boundary DOFs as follows
where is the coarsescale homogenized first PiolaKirchhoff stress tensor, is the volume of the RVE, and is the matrix whose three columns contain the , , and components of the internal force vector of the nodes on the exterior boundary of the attached microstructure whose DOFs are constrained, i.e. the boundary reaction forces.
In summary, the multiscale computational model distinguishes itself from a conventional FE model in that it involves repeatedly solving one or more microscale BVPs, with one such solution required for each stressstrain evaluation at all scales other than the finest. For a given RVE, the microscale BVPs vary only in the values of the prescribed boundary displacements. Regardless of whether uniform or periodic boundary conditions are adopted, (2) shows that the computational response of the microscale model can be characterized as an implicit function of nine components of the deformation gradient . In Section 4, the set of components of is treated as a parameter vector as in the parametric setting of PMOR. The parameter space , where for twodimensional problems, for threedimensional ones, and , is explored by solving the microscale BVP repeatedly in order to generate snapshots of the microstructural response as required for the construction of a surrogate model.
At each computational timestep of a single multiscale simulation, a total of highdimensional, microscale BVPs are solved at the finest scale, where denotes the number of Gauss points at level . The overall computational cost associated with such a computational complexity can be prohibitive for largescale multiscale models, which motivates PMOR to truncate the computational complexity by reducing the operational dimensionality of the multiscale model. Even in the absence of a physical parametric setting at the macroscale level, computational savings are still feasible if the macroscale model is not reduced, but rather a sequence of HPROMs are constructed at all other scales , . For this setting, the coauthors of [zahr2017multilevel] proposed an insitu training strategy for microscale reduction where only the single multiscale simulation of interest is required to be performed, without any predefined offline computational work. This methodology will be described in the following section, after which a novel contribution will be presented to extend the scope and robustness of the insitu concept without sacrificing its inherent convenience.
3 Projectionbased model order reduction
Here, the application of PMOR to the FE computation framework is reviewed. Specifically, the POD reduction and ECSW hyperreduction methods are used to construct a surrogate model to accelerate the solution of each locally attached microstructural FE model at all fine scales (). Training is performed insitu (or onthefly), pursuant to our objective of enabling a computational technology for multiscale modeling that is tractable, robust, and convenient for simulators accustomed to conventional materials. Although not considered in what follows, PMOR can also be applied to the macroscale BVP (, see [zahr2017multilevel]).
3.1 Microscale reduction framework
At each fine scale , , the number of DOFs of the computational FE model is reduced by searching for the solution of the problem of interest in a lowdimensional subspace – that is,
(3) 
where is a ROB constructed using POD and the method of snapshots [sirovich1987turbulence], is a vector of generalized coordinates, and . The snapshots from which the ROB is constructed correspond to highdimensional solutions of the microstructural FE model that are collected and processed insitu. The term insitu indicates that such solutions are obtained during the same multiscale simulation in which the model reduction framework is utilized, in contrast to onlineoffline treatments.
The dimensionality of the semidiscrete equations governing the microscale BVP (1) is reduced by performing a Galerkin projection – that is, substituting (3) into these equations and premultiplying them by the transpose of . This leads to the microscale PROM
(4) 
where the superscript designates the transpose operation.
Furthermore, a secondtier hyperreduction approximation is also introduced to avoid computations that scale with the size of the HDM at each level . In this work, the ECSW method [farhat2014dimensional, farhat2015structure] is preferred for this purpose due to its structurepreserving property [farhat2015structure] and the simplicity by which its approximation error can be controlled, namely, via a userspecified thresholding parameter. As the name suggests, ECSW samples a subset of the element set of a given FE mesh – that is, – and assigns to each sampled element a positive weight . The internal force term in the microscale PROM (4) is evaluated by simply weighting the contributions of the sampled elements only, i.e.
(5) 
In the above expressions, the superscript designates the restriction of a global vector or matrix to element . The optimal reduced mesh
and associated weights can be estimated using either the NonNegative Least Squares (NNLS) algorithm proposed by Lawson and Hanson
[lawson1995solving] or alternative sparsity promoting L1minimization methods [chapman2017accelerated].In the context of the microscale reduction framework, reduction and hyperreduction are considered not only for the unconstrained part of the internal force vector as shown in (5), but also for both the constrained part of the internal force required for the homogenization condition and the linear constraint equations required for the enforcement of periodic boundary conditions [zahr2017multilevel].
3.2 Insitu training strategy
As already stated in Section 2, the deformation gradient at a Gauss point characterizes the microscale BVP at level , . In what follows, each microscale BVP at level is associated with a parameter vector in the parameter space (also referred to in the remainder of this section as parameter “point” in ), where is obtained by stacking the columns of . This parametric interpretation of the attached microstructure BVPs motivates the adaptive reduction process discussed in Section 4.
At each computational timestep of a multiscale simulation, microscale BVPs are solved at level , . This suggests that sufficient training data can be acquired for the purpose of constructing a PROM at each level , . An insitu microscale training strategy proposed in [zahr2017multilevel] that leverages this observation can be described by the following steps:

Solve at level , , a sufficient number of highdimensional microscale BVPs encountered at the Gauss points of level , and sample the resulting solution snapshots (note that is not necessarily equal to ).

Construct a global ROB .

Construct a reduced mesh .

Exploit the resulting HPROM at level to solve all remaining microscale BVPs at this level.
This nonadaptive, insitu, microscale training strategy, where the choice for is userspecified (and therefore arbitrary), may encounter scenarios involving extrapolation when the remaining microscale BVPs visit an unexplored region of the parameter space . If a queried parameter point falls outside of the subregion defined by the set of sampled points, a good approximate solution of the microscale BVP may not be computable in the existing lowdimensional subspace . However, an adaptive insitu training strategy has the potential to avoid extrapolation by collecting new highdimensional solution snapshots and using them to enrich the subspace of approximation when a scenario that would otherwise involve extrapolation is detected.
Typically, the enrichment of the approximation subspace needed when the explored domain within is enlarged requires the expansion of the dimension of the ROB, leading to a deterioration in the attainable speedup. When the dimension of the expanded ROB exceeds a certain threshold, a method is proposed next to split the global ROB into two or more local ROBs, and subsequently find an approximate solution at each queried parameter point using the most suitable of precomputed and stored local ROBs. A similar approach was proposed in [amsallem2012nonlinear] for constructing local ROBs, where locality was defined however with respect to the solution manifold. This approach is not ideallysuited for use with insitu microscale training as it requires the clustering of solution snapshots, which can be computationally expensive in this context. Thus, this paper proposes an adaptive PMOR strategy with a method of databases containing local ROBs, where locality is defined with respect to the parameter space. The details of this proposal are presented in Section 4.
4 Adaptive insitu sampling and local reducedorder bases
Next, an insitu, adaptive sampling process based on the concept of a database of local ROBs at each level , , is presented. It leverages the accuracy and computational efficiency of the PMOR framework. The detection of extrapolation when solving a queried microscale BVP is performed using a residualbased error indicator. When an update criterion is triggered by this indicator at level , new information is sampled and used to enrich the content of . In order to preserve the computational efficiency of PMOR, a local ROB is split into two smaller ones if its current size exceeds a certain limit.
In principle, each time a local ROB is updated, the corresponding reduced mesh should also be updated – because the training of ECSW depends on the state of the ROB. Therefore, the HPROM should be updated too accordingly. However, efficiently performing such updates is outside the scope of this paper: it is the subject of ongoing research. With multiple local ROBs stored in a database , the mostappropriate local ROB and associated HPROM must be identified to solve a microscale BVP queried at level .
In order to construct such an adaptive PMOR framework, both the microscale solution snapshots and the underlying deformation gradient parameter points are collected. In order to describe this adaptive framework, the following additional nomenclature and notation are introduced:

, denotes a set of deformation gradient parameter points considered among those that have already been sampled at level , and , . It is collected in the matrix

denotes the corresponding snapshot matrix whose th column stores the HDMbased solution of the microscale BVP characterized by .
4.1 Detection of extrapolation and other large errors
When applied to the solution of a microscale BVP queried at an unsampled parameter point , a local, insitutrained, microscale HPROM will perform the equivalent of an extrapolated approximation if the sampling process underlying the training of the corresponding local ROB did not explore a region of containing . This scenario is illustrated in Figure 1, where denotes the solution manifold at level . Hence, if its pitfalls are to be avoided, the potential for this scenario must be detected before the solution of the queried microscale BVP using a mostappropriate local HPROM is computed or accepted.
There are two main approaches for detecting an extrapolation associated with the approximate solution of a queried microscale BVP characterized by : a geometric approach that determines the position of in , relative to the sampled parameter points collected in ; and a residualbased approach that computes the residual obtained when approximating the HDMbased solution of the queried microscale problem using a mostappropriate local HPROM.
The geometric approach is conceptually simple. However, its implementation is not trivial. It can be carried out by computing the hypersphere of smallest radius containing all parameter points collected in , or the minimal hyperpolygon connecting these parameter points, then checking whether lies or not within the hypersphere or hyperpolygon. Either implementation necessitates a fair amount of efficient computational geometry. In either case however, the main drawback of this approach is that if falls within either of the aforementioned target domains, this does not necessarily imply that the approximate solution computed using the mostappropriate local HPROM has an acceptable accuracy. Such information is provided however by the residualbased approach which, unlike the geometric approach, checks for potential extrapolation after rather than before the solution of the queried microscale BVP is computed. Specifically, this approach computes the following relative residual magnitude
(6) 
and adopts it as an error indicator. When it exceeds a prescribed tolerance , it suggests discarding the approximate solution , enriching the mostappropriate local ROB , and repeating this process until . Hence, this alternative approach, which leads to the updating criterion outlined in Algorithm 1
, equally applies for assessing extrapolation and interpolation errors.
4.2 Updating of a local reducedorder basis
Each update of a database starts by solving a new microscale BVP using the corresponding HDM, appending the computed solution to the current content of the relevant snapshot matrix
, performing the Singular Value Decomposition (SVD) of
, and deducing a new ROB using the singular value energy truncation criterion. Because the computational complexity of SVD applied to a rectangular matrix scales quadratically with the number of columns of the matrix, a lowrank SVD update method such as that presented in [zahr2015progressive] is preferred here for compressing the collected snapshots.For a highly nonlinear problem, the SVD and singular value energy truncation criterion configured for accuracy typically lead to a large ROB. Hence, if the size of such a ROB exceeds a specified capacity limit , this ROB is split here into two smaller ROBs, each of which capable of capturing only the local behavior of a nonlinear microstructure. This splitting, which is bound in this context to occur recursively, leads to the concept of a local ROB. This concept is similar to that of local ROBs pioneered in [amsallem2012nonlinear] and equipped with the notion of clustering of solution snapshots. However, it differs from it in that locality refers here to the parameter space , whereas locality in [amsallem2012nonlinear] referred to the solution manifold (denoted here by . Indeed, the data collected in a parameter matrix provides an alternative approach for building local ROBs by a mechanism centered around updating and splitting. Again, the updating criterion, which is proposed in this work for avoiding extrapolation, is described in Algorithm 1. Its basic principle is described in Algorithm 2. The proposed splitting algorithm is described below.
Specifically, provides a mean for splitting a region of the parameter space
rather than a region of the solution space in two parts, using for example Principle Component Analysis (PCA) or POD, both of which build on SVD. PCA is most useful for exploratory data analysis. It is used here to compute the direction of maximum variance
of a set of sampled parameter pointsas the first eigenvector of the covariance of the corresponding parameter matrix
(7) 
where is the columnwise mean of – that is,
(8) 
and therefore represents the algebraic centroid of the sampled parameter points.
Next, the hyperplane
containing the centroid point and orthogonal to is constructed. This hyperplane can be described as(9) 
It naturally splits into two smaller sets, where each partitioned set has a more compact spatial distribution in (see Figure 1(b)).
As illustrated in Figure 2, the splitting of a set of sampled parameter points into two smaller sets, which is equivalent to the partitioning of a matrix into two smaller matrices, implies a splitting of a region of the parameter space . It also leads to the partitioning of the snapshot matrix in two snapshot matrices. Finally, the compression of these two matrices leads in turn to the construction of two new local ROBs. The entire splitting process is summarized in Algorithm 3.
4.3 Approximation using a mostappropriate local ROB
The adaptive, insitu sampling process described so far, which centers around updating and splitting ROBs, eventually leads to the presence of multiple local ROBs in a database . Hence, when a microscale BVP characterized by an unsampled parameter point is queried for fast solution by a local HPROM, the mostappropriate local ROB available in must be identified online. This can be performed as proposed below.
Let denote the set of all local ROBs that are precomputed and stored at a given time in a database . Recall from Section 4 that each HDMbased solution of a microscale BVP is associated with a parameter point of the parameter space that characterizes the BVP. Hence, a split snapshot matrix can be associated on the same basis with a split parameter matrix . Similarly, a local ROB can be associated with the snapshot matrix that was compressed to obtain , and by transitivity, with the split parameter matrix .
Let denote the algebraic centroid of the parameter matrix (defined as in (8)). Following the same reasoning as above, it follows that can be associated with the local ROB . More generally, a bijection can be defined between the set of sampled parameter points and the set of precomputed local ROBs .
Now, given a queried BVP characterized by an unsampled deformation gradient parameter point – or more simply, given a queried but unsampled parameter point – a mostappropriate, precomputed, local ROB – that is, , can be defined as the precomputed local ROB whose corresponding parameter point is the closest to the queried but unsampled parameter point – that is,
where denotes any preferred norm, for example, .
4.4 Summary: insitu adaptive microscale PMOR framework
Algorithm 4 summarizes the insitu, adaptive, PMOR framework proposed for the acceleration of the solution of the microscale BVPs at any level , .
5 Applications
Here, the computational feasibility as well as the performance of the insitu, adaptive, PMOR framework for microscale BVPs are demonstrated for two nonlinear, nonparametric, dynamic, twoscale simulations (). Specifically, this framework is applied to the solution of two academic but nevertheless computationally intensive, nonlinear, twolevel, dynamic response problems. For both applications, all ROBs are constructed using POD based on displacement solution snapshots. Because adaptive hyperreduction is beyond the scope of this paper, all HPROMs are constructed based on a single reduced mesh that is trained and generated using ECSW during the initialization step of Algorithm 4. Given that at the macroscale level neither considered problem is parametric, PMOR is applied in both cases only to the microscale model. This setting has two benefits:

It enables the exclusive focusing on the performance of the proposed insitu, adaptive approach for performing the PMOR of microscale problems.

It corresponds to the worst case scenario of PMOR performance, as any computational overhead incurred by the PMOR process may only be amortized in this case by the same simulation to which the PMOR process is applied. Hence, any reported speedup factor is a genuine speedup factor.
Since both applications considered herein involve twoscale simulations (), the model labels PROM and HPROM refer throughout the remainder of this section to the microscale computational model. On the other hand, the label HDM may refer to either the macroscale or the microscale computational model.
In both applications, the tolerance for the residualbased error indicator is set to , and the capacity limit for the dimension of a local ROB is set to . The accuracy of all PROM and HPROM approximations is evaluated using the following definition of a relative global error
where the subscript designates the , , or direction of a global reference frame, is the vector of displacements at time extracted from the macroscale solution based on HDMs at both the macroscale and microscale levels, is the vector of displacements at time extracted from the macroscale solution but based on a PROM or HPROM at the microscale level, is the error sampling timestep size, denotes the vector of 1 entries, and is the set of timestamps
where is the initial simulated time and is the final simulated time.
All computations reported herein are performed on a Linux cluster with 84 compute nodes, dualsocket Intel Xeon E52650 v2 CPUs (8 cores/socket), 64 GB of DDR3 memory/node, a 1:1 subscribed FDR Infiniband network, and a Lustre shared file system tuned for parallel I/O.
For both applications, all reported PROM and HPROM wallclock timings as well as speedup factors account for all training costs.
5.1 Vibration of a nonlinear solid elastic bar with a microvoid structure
A computational model for a solid bar made of a heterogeneous structure is considered here. At the finest scale, this structure is characterized by microscopic voids and a porosity ratio of (here, the porosity ratio is defined as the ratio of the total volume occupied by the voids, and the total volume occupied by the material and the voids). The bar has a solid square cross section of height and width cm, and a length cm. It is fixed at both ends. A distributed external force generated by a uniform pressure of magnitude MPa is suddenly applied on its upper face and oriented initially in the transverse direction. Due to symmetry, only half of the bar along its length is modeled.
At the microscale level, a FE model of the bar (see Figure 3) is constructed using 648 eightnoded hexahedral elements with three displacement DOFs per node. The dimension of this HDM is (1,962 DOFs). At this level, the material response is assumed to be well represented by a compressible neoHookean (finite strains) hyperelastic constitutive law with Young’s modulus GPa and Poisson’s ratio . Uniform essential boundary conditions are adopted at this level.
At the macroscale level, a FE model of the bar (see Figure 4) is constructed using 640 eightnoded hexahedral elements with three displacement DOFs/node, and a total number of DOFs (). At this level, the material density is kg/m, and temporal discretization is performed using the explicit central difference scheme and the constant timestep s. The response of the bar is computed from s until s.
First, an initial nonlinear PROM is constructed for the microscale computational model using the solution snapshots collected at s during the HDMbased solution of the first microscale BVPs encountered at the Gauss points of the macroscale computational model. Then, this PROM is hyperreduced using ECSW configured with the Lawson and Hanson sparse NNLS algorithm and a relative accuracy threshold , which generates a reduced mesh with 115 elements ().
Several simulations are performed for this problem using: the macroscale and microscale HDMs described above; the proposed insitu, adaptive, microscale PMOR framework, with and without hyperreduction (the reader is reminded that in this work, any reduced mesh generated by ECSWbased hyperreduction is not updated); and the counterpart nonadaptive microscale PMOR framework described in [zahr2017multilevel], with and without hyperreduction.
Table 1(a) reports the speedup factors and relative global errors computed using the error sampling timestep measured for the proposed insitu, adaptive, microscale PMOR with and without hyperreduction. Similarly, Tables 1(b)–1(d) report these performance results for the counterpart insitu, nonadaptive, microscale PMOR framework previously proposed in [zahr2017multilevel]. Since the latter framework depends in this case on the adhoc parameter specifying the number of microscale BVPs to solve in order to train a global microscale PROM and the corresponding HPROM, three different values of this parameter are considered. The first one, , corresponds to 0.2% of the total number of microscale problems solved during the multiscale simulation performed in the timeinterval s using the timestep size s. The second and third values, and , correspond to 0.5% of and 1% of , respectively. From the results reported in this table, the following observations are noteworthy:

In all cases, the HPROM delivers a much better speedup factor than its underlying PROM. This is expected given that the problem considered here is geometrically nonlinear (finite strains).

The speedup factors delivered by all HPROMs are impressive, considering that the macroscale model is not even reduced.

The adaptively computed PROM delivers 6 to 50 times more accurate results than its nonadaptive counterparts trained with any considered value of , and performs about 1.25 times faster.

The adaptively computed HPROM – except for the nonadapted but fixed reduced mesh – delivers 2 to 3 times more accurate results than its nonadaptive counterpart trained with , but is 1.35 times slower. It delivers roughly the same wallclock time performance as its nonadaptive counterpart trained with , and about the same accuracy. On the other hand, it is about 1.2 times faster than its nonadaptive counterpart trained with , while delivering essentially the same accuracy.
It follows that in all considered cases, the adaptive PROM is faster than its nonadaptive counterpart and considerably more accurate. On the other hand, because the reduced mesh resulting from the ECSWbased hyperreduction is fixed in this work, the otherwise adaptive HPROM can be anywhere from slightly slower to slightly faster than its nonadaptive counterpart, and roughly as accurate.
Considering that all aforementioned performance results account for all computational overhead, including the adaptation overhead, and that the proposed insitu, adaptive, microscale PMOR framework is fully automated, these results demonstrate not only the feasibility of this framework but also its superiority over its nonadaptive counterpart. In particular, Figure 5 shows that even when the reduced mesh underlying the microscale computational model is not adapted, the proposed adaptive framework delivers a remarkable accuracy.




5.2 Impact of a nonlinear elastic cylinder with a composite microstructure on a rigid wall
Next, a computational model is considered for a cylindrical projectile with a heterogeneous microscale structure characterized by a stiff, unidirectional fiber embedded in a flexible matrix. The cylinder has a radius cm and a length cm. It has an initial velocity cms in the negative direction, and a frictionless rigid wall boundary condition at . Due to symmetry, only one quarter of the bar along its length is modeled. This problem is a variant of the socalled Taylor impact problem [taylor1948use].
For this problem, a FE microscale computational model is constructed using eightnoded hexahedral elements with three displacement DOFs per node, and a total number of DOFs. Uniform essential boundary conditions are adopted at this level. The microscale material is modeled as follows:

The fiber is represented as a compressible neoHookean (finite strains) material with Young’s modulus dynecm and Poisson’s ratio . It runs parallel to the local axis along the center of the cube where the locally attached microstructure is defined. Its cross section is approximated by a square of edge size equal to onethird of the unit cell width.

The flexible matrix surrounding this fiber is also represented as a compressible neoHookean material (finite strains), but with Young’s modulus dynecm and Poisson’s ratio .
At the macroscale level, a FE model is constructed using eightnoded hexahedral elements with three displacement DOFs/node, and a total number of DOFs (see Figure 6). At this level, the density is g/cm. Temporal discretization is performed using the explicit central difference scheme and the constant timestep s. For this problem, the multiscale simulation is performed from s until s.
As in the previous application, the proposed insitu, adaptive, microscale framework for PMOR is initialized with the construction of a nonlinear PROM based on the first HDMbased solutions of the microscale BVPs encountered at the Gauss points of the macroscale computational model. Then, this PROM is hyperreduced using ECSW configured with the same sparse NNLS algorithm as in the previous application, and the same threshold for relative accuracy . ECSW generates in this case a reduced mesh with 88 elements (). Three multiscale computations are performed using: the aforementioned macroscale and microscale HDMs; and the proposed insitu, adaptive, microscale PMOR framework, with and without hyperreduction. The obtained performance results are reported in Table 2, where the relative global errors are computed with .
Figure 7 displays two sample timehistories computed for a displacement DOF and a velocity DOF at a same node of the macroscale FE model. It shows that both adaptive PROM and HPROM reproduce extremely accurately the predictions obtained using the macroscale and microscale HDMs. This impressive accuracy of the proposed adaptive PMOR framework is confirmed by the quantitative error results reported in Table 2. This table also shows that for the application considered herein, the insitu, adaptive, microscale framework for PMOR with and without hyperreduction accelerates the time to solution of the simulation based on the macroscale and microscale HDMs by a factor greater than 7. Again, this is an excellent performance result considering that it accounts for the computational time elapsed in the processing of the macroscale model, which is neither reduced nor hyperreduced, and accounts for all computational overhead associated with the PMOR process – including adaptation.
Model  Wallclock time (s)  Speedup factor  Error measure  PROM  HPROM 

HDM  N.A.  displacement (%)  
PROM  displacement (%)  
HPROM  velocity (%)  
yvelocity (%) 
6 Conclusions
An insitu, adaptive, microscale Projectionbased Model Order Reduction (PMOR) framework is proposed in this paper for accelerating the simulation of the nonlinear dynamics of multiscale structural and solid mechanics systems. It is designed to avoid by construction the weakness of extrapolation typically found in nonadaptive training methods. This framework is based on the parameterization of the deformation gradient for the purpose of PMOR at all but the coarsest scale, and on the concept of a database of local ReducedOrder Bases (ROBs) where locality is measured in the deformation gradient parameter space. It achieves accuracy by sampling new highdimensional information as needed, updating onthefly a constructed ROB, and approximating the solution along its trajectory using a handbook of mostappropriate local ROBs. It achieves computational efficiency by splitting an updated ROB that becomes larger than desired into two local ones, each capable of capturing only the local behavior of a nonlinear mesostructure or microstructure, and incorporating hyperreduction (albeit without adaptivity). If the macroscale computational model is not a parametric one, it is not reduced. Even in this case – that is, when the macroscale computational model is highdimensional and its computational cost is accounted for – and when all computational overhead associated with performing PMOR and its adaptation are also accounted for, the proposed insitu, adaptive, PMOR framework accelerates threedimensional, nonlinear, dynamic, multiscale simulations of the academic type by roughly an order of magnitude without compromising accuracy. Larger speedup factors can be expected for more realistic, largescale problems and when adaptivity is extended to hyperreduction.
While described in the context of nonlinear, multiscale, dynamic problems in solid mechanics and structural dynamics, the proposed insitu, adaptive, PMOR framework is sufficiently comprehensive to be tailorable to many other applications. Indeed, conventional offline training of ROBs in a predetermined region of a parameter space typically leads to parametric reducedorder models that are bound to perform some form of extrapolation, whenever they are exercised in a region of the parameter space that was not explored during training. The proposed framework is designed specifically to remedy this issue.
acknowledgments
The authors acknowledge partial support by the Air Force Office of Scientific Research under grant FA95501710182, and partial support by a research grant from the King Abdulaziz City for Science and Technology (KACST). This document does not necessarily reflect the position of these institutions, and no official endorsement should be inferred.