This paper proposes and applies the Recursive Projection Method (RPM) – a method to adaptively combine Newton and fixed-point iterations – to the problem of finding the effective mechanical response of a periodic heterogeneous linear elastic solid. Specifically, given the average strain or stress, the goal is to compute the stress and strain distribution within the unit cell. While the linear elastic setting is of interest in itself, it is also used extensively as part of the iterative solution process in nonlinear elastic or inelastic settings.
Early numerical works in the periodic setting exploited Fast Fourier Transforms (FFT)444Following common practice in this field, we denote also the class of algorithms that use Fast Fourier Transforms as FFT methods. to partially simplify the problem; however, due to the heterogeneity of the elastic properties, the problem is not completely decoupled in Fourier space. Therefore, fixed-point methods are used to complete the solution process [moulinec1994fast, moulinec1998numerical, michel1999effective]. This approach was used to compute the effective properties of heterogeneous materials that are linear, or nonlinear through iteration. However, while the method is extremely fast for small elastic contrasts, the convergence is slow or even lost for higher elastic contrasts. This rules out systems such as materials with voids, where the void is treated as a part of the body with zero stiffness, for instance. This led to the formulation of the Accelerated FFT Method in [michel2001computational], based on a scheme proposed in [eyre1999fast] for scalar conductivity problems, that modified the original method to handle high contrast composites. A further refinement, proposed in [monchiet2012polarization], is the Polarization FFT Method, which has been shown to be a special case of a general class of methods in [moulinec2014comparison]. A separate promising approach is [porta2013heterogeneity], based on ideas from energy minimization, though they do not test it on the case of large elastic contrasts.
These methods have been applied in a variety of settings, including in highly nonlinear problems where they are used in each step of the iteration. For instance, in the context of plasticity, we mention application at the macroscale [lebensohn2001n, lebensohn2020spectral], to continuous dislocation models [beyerlein2016understanding, peng20203d], as well as recent seminal work in coupling discrete dislocations with fast calculation of elastic interactions [graham2016fast]. Another important area of application is to phase-field models of microstructure formation in multifunctional materials, e.g. [chen2013phase]. The Fourier methods have also proven powerful in simulating experimental images of microstructure because of the ability to apply the method directly on image data without needing the identification of material boundaries and related challenges [piazolo2015effect].
The methods mentioned above – starting from the original FFT method to the Accelerated and Polarization FFT methods – are based on fixed-point methods or variations thereof; see [moulinec2014comparison] for a summary. Consequently, though some of these methods can converge for large elastic contrasts, the convergence gets slower as the contrast increases. In addition, an important part of these methods is the notion of a homogeneous reference medium (for details, see Section 2). The notion of a reference medium goes back to [eshelby1959elastic], and has proven to be a powerful concept in homogenization analysis [agoras2013iterated, lipton1993inequalities]. However, in the numerical setting, the convergence of the fixed-point methods can be very sensitive to the choice of the reference medium. These reasons motivate the proposed work in applying the RPM to this problem.
A classical result is that the solution of the elasticity problem can be posed as a minimization of the strain energy density under standard assumptions, e.g. [ciarlet1988three]. That is, the balance of linear momentum, represented by where is the stress and is the deformation, is equivalent to , where is the elastic energy density, is the domain, and . For simplicity, we have assumed displacement fixed on the entire boundary. The perspective that the solution of the equilibrium equation is equivalent to the minimization of strain energy leads to useful insights into the solution methods. In particular, consider the case of a heterogeneous elastic material with voids. Typically, the voids are not taken to be part of the body , and instead they are defined through interior surfaces that are traction-free. However, in the FFT method, one cannot easily treat the voids in this manner because the discretization must be uniform for efficiency; hence, the voids are taken to be part of , and treated as regions with zero stiffness. The zero stiffness regions imply that there is infinite elastic contrast, but also loss of uniqueness of the solution.
The uniqueness of solutions is an elementary calculation in linear elasticity, e.g. [ciarlet1988three]. Assuming a boundary value problem with 2 solutions and , it can be shown that the difference satisfies , where
is the linear elastic stiffness tensor, andis the small strain associated with . If is pointwise positive-definite, then , and we have uniqueness up to rigid motions; Dirichlet boundary conditions on a finite part of the boundary fixes the rigid motion.
From a heuristic perspective, the lack of uniqueness of the solution in a material with zero-stiffness regions is reasonable: the fictitious material points in the void region can have arbitrary displacement (while respecting the required smoothness), and this does not cost any elastic energy or affect the physics of the problem. Hence, we can expect soft or flat directions in energy landscape. When we have large elastic contrast due to regions with very low stiffness, the energy will be shallow though perhaps not completely flat. Convergence along these directions can be exceedingly slow, or even lost, in a fixed-point method.
We briefly note that for rigid inclusions, the strain within the rigid objects must vanish, while the stress can be arbitrary as long as smoothness and equilibrium are satisfied. The discussion above can be readily adapted to this setting using the complementary strain energy density.
Given the expectation that the energy landscape is flat or shallow in some directions, the use of Newton or similar iterations is likely to provide convergence within a relatively small number of iterations. However, each Newton iteration is far more expensive than a fixed-point iteration, and increasingly so as the dimension of the solution space increases. The RPM, proposed in [shroff1993stabilization], provides the possibility of a balance between the fixed-point and the Newton methods. The first key aspect of RPM is that it uses the fixed-point iterations to adaptively identify the unstable subspace. That is, starting from the assumption that fixed-point methods will converge when applied to the problem, RPM examines successive outcomes from the fixed-point iteration scheme to discern if, and how, the basis of the unstable subspace should be augmented. The second key aspect of RPM is that as the unstable subspace is built up, Newton iterations are performed on the unstable subspace. This enables a balance between the less expensive but non-converging fixed-point method and the more expensive but better converging Newton method.
We point out that while Newton iterations are performed only on the unstable subspace, the fixed-point iterations are performed on the entire space. The reason to perform fixed-point iterations not only on the stable subspace, but rather on the entire subspace which is slightly more expensive, is that it enables the use of existing fixed-point methods with minimal changes to existing code and algorithm. In fact, we exploit this feature to apply RPM to both the original FFT method [moulinec1998numerical] as well as to the Accelerated FFT method [moulinec2014comparison].
Briefly, the key outcomes of our comparisons of RPM-FFT with the existing methods is that the RPM-FFT method is faster than the other methods for large elastic contrasts. However, an even more important advantage appears to be that the convergence of RPM-FFT is very robust with respect to the choice of the reference homogeneous material, whereas the fixed-point methods are sensitive to this choice; this is particularly important since there are only some heuristic ideas about how to optimally choose the reference material.
For brevity, we refer below to the method proposed in this paper as RPM-FFT; the conventional algorithm from [moulinec1998numerical] as Classical FFT; the method proposed in [michel2001computational] as Accelerated FFT; and the method proposed in [monchiet2012polarization] as Polarization FFT.
2 Classical FFT Method Based on Fixed-Point Iteration
The original FFT method proposed by Moulinec and Suquet [moulinec1998numerical], and some subsequent improvements, are briefly summarized below in the setting of linear elasticity. The goal is to solve the linear elasticity problem on a unit cell in a periodic geometry with a heterogeneous medium. The average symmetric strain tensor is given.
We use respectively to denote the Fourier transform, Fourier space representation, and inverse Fourier transform.
The displacement field is decomposed into a linear part and a zero-mean fluctuating part: , ignoring a constant. This implies the strain decomposition , where is the strain.
We notice that , and while . Therefore, for conciseness after this section, we will use to represent both and , and to represent both and .
The unit cell problem can then be written as
where is the stress field, and is periodic.
We can then introduce, following [monchiet2012polarization, moulinec2014comparison, eyre1999fast] and others, a homogeneous linear elastic comparison medium with stiffness and the polarization . This enables us to rewrite the stress-strain relation as:
The solution to the unit cell problem can be written in terms of the periodic Greens function corresponding to :
The Fourier space representation of this relation is:
Typical FFT methods use the fact that taking the (fast) Fourier transform and then multiplying in Fourier space is much faster than convolution in real-space. However, we note that both the real and Fourier representations in (2.4, 2.5) are implicit, through the dependence of the effective forcing on the solution . Therefore, further work is required to solve this (linear) implicit equation.
From a linear algebra perspective, the Fourier transform takes us to the eigenbasis in a homogeneous problem; in a heterogeneous problem, the Fourier transform brings us close to the eigenbasis, and further calculations are required to completely diagonalize the operator. Roughly, one can make an analogy between the effort to completely diagonalize and the effort to solve the implicit equation in Fourier space.
The implicit equations are typically solved with fixed point methods, and variations of these. In [moulinec1998numerical], for instance, they use the iteration ; see Algorithm 1. Like all fixed point methods, convergence is slow or lost when the problem contains an unstable subspace. This is precisely the case when there is large contrast leading to flat energy landscapes and loss of uniqueness, as has been observed in practice [eyre1999fast].
3 The Proposed RPM-FFT Method
We first outline the key elements of the recursive projection method following the seminal work of Shroff and Keller [shroff1993stabilization], and then the application of RPM to the FFT approach.
When using fixed-point iterative procedures to solve nonlinear problems, slow convergence or even divergence can be caused by a small number of eigenvalues of the Jacobian matrix approaching or leaving the unit disk. The Recursive Projection Method (RPM) is a stabilization procedure that can recover the convergence.
Consider a problem in dimensions posed on . The method begins by dividing the space into , that we denote as the unstable eigenspace, and its orthogonal complement . and
correspond respectively to the eigenspace of those eigenvalues leaving the disk and those inside the disk. Then, the fixed-point iteration is performed only on the complementwhile Newton’s method – or, in practice, a variation such as quasi-Newton – is performed on the unstable eigenspace .
Writing the original problem to be solved in the fixed-point form as follows:
the fixed-point iteration can be written:
where indicates the iteration number. While the eigenvalues of the Jacobian matrix lie inside of the disk
and the initial value is sufficiently close to the solution, the fixed-point iteration will converge. A small positive constant is used to ensure that one is sufficiently far from the boundary between stable and unstable fixed point iterations.
Suppose now that eigenvalues lie outside the disk
Let be the orthonormal basis for . The projectors of the subspaces and are respectively and . For each , there is a unique decomposition:
This allows the introduction of and .
Applying Newton’s method on the unstable subspace :
where is the derivative of with respect to .
We continue to use fixed-point iteration on :
Efficiently identifying the unstable eigenspace is an essential feature of RPM. This corresponds to finding the basis ; or, in practice, an approximation of since the Jacobian matrix can be expensive to compute. In [shroff1993stabilization], this is accomplished directly by monitoring the convergence rate of , without computing the Jacobian matrix. This is an important feature that significantly reduces the computational cost, and we follow that procedure here.
The method begins by assuming that the unstable eigenspace is -dimensional and . We then iteratively build up the unstable eigenspace by adding to the eigenspace corresponding to those eigenvalues that approach the unit disk . The eigenspace that corresponds to the eigenvalues approaching the unit disk corresponds to the dominant eigenspace of .
If the fixed-point iteration does not converge within the specified limit of iterations , the eigenspace corresponding to the eigenvalues approaching the unit disk is identified as follows. In [shroff1993stabilization], it is shown that , i.e., a power iteration with applied to . From the properties of power iterations, identifies the eigenspace of that corresponds to the largest eigenvalue, assuming that has a nonzero component in this space.
Following [shroff1993stabilization], we approximate the dominant eigenspace of using QR factorization to write:
where is orthogonal and is upper triangular. When , we add the first column of to the basis , corresponding to one real eigenvalue approaching the unit disk. Else, we add both columns of to the basis , corresponding to a complex conjugate pair of eigenvalues approaching the unit disk. Alternate approximations are further discussed in [shroff1993stabilization].
To apply the RPM method to the specific context of the FFT fixed-point algorithm, we formally define the fixed-point operator from Algorithm 1 through the equation:
Define to represent in the basis . Then the stable / unstable decomposition can also be written as:
It is suggested in [shroff1993stabilization] that it is often sufficient for convergence to simply compute only once the quantity using finite differences. In our calculations in this paper, we have also found this to be sufficient for convergence, and of course very efficient.
Finally, integrating the FFT method into RPM, gives us Algorithm 2.
An important feature is that the fixed-point iteration is conducted on the entire solution space and not only on the stable subspace. After the fixed-point iteration, only that part of the outcome corresponding to the stable part is retained in constructing the next iterate of the approximate solution. While marginally more computationally expensive than using a fixed-point iteration only on the stable subspace, it has the important advantage that standard existing fixed-point codes can directly be used with minimal modification as the RPM “wraps around” the fixed-point method.
4 A Variational Perspective
We begin with noting an analogy between the problem of interest here and the energy-minimization formulation of electrostatic fields in matter. The field equations and constitutive response of electrostatic fields in matter are respectively:
where is the electric potential, is the dipole per unit volume induced by the electric field , and is the material-dependent response function that can be an explicit function of position. For simplicity, we have assumed Dirichlet boundary conditions on the boundary of the domain , but it is straightforward to use more general BCs. Note that we neglected various constants such as the permittivity of vacuum.
Under some physically-motivated conditions on , this can be reformulated as the minimization of electrostatic energy. For instance, we consider the case of a linear dielectric which is analogous to the linear elastic medium considered in this paper, in which case takes the form:
where is the second-order electrical susceptibility tensor.
Following [james1990frustration] for magnetostatics – see also [shu2001domain] and [yang2011completely] for electrostatics – we can write this as an energy minimization with a nonlocal constraint:
with BCs .
From the fundamental physics of electrostatics problems, related to the motion of charges under an electric field, we have is positive-definite pointwise. Therefore, from an elementary application of the direct method of the calculus of variations, a unique minimizer exists; loosely, the energy is bounded below and grows quadratically in all directions from the positive-definiteness of . The nonlocal constraint, while somewhat nonstandard, can be readily dealt with, e.g. [james1990frustration].
We turn now to linear inhomogeneous elasticity. We recall the balance of linear momentum:
posed on the periodic unit cell with periodic BCs. We decompose , where is the homogeneous reference medium. This gives:
Recall from Section 2 that we defined the polarization field . Following closely the electrostatic case, we can pose this as a nonlocally-constrained critical-point problem. Define the energy :
Use the variation and :
The variations and are not independent; they are constrained to satisfy:
On both sides of the equation above, do the following: (1) take the inner product with ; (2) integrate over ; (3) use integration-by-parts to move the derivatives. The result of these operations is:
where the boundary terms cancel out due to periodicity.
We highlight here that the tensor is generally not positive-definite pointwise, particularly in the important case of high elastic contrasts. For instance, consider a material with voids, where the non-voided material is stable, i.e. is positive-definite in the non-voided material, and in the void region. The reference medium is taken to be stable, i.e. is positive-definite to enable solution of the constraint equation. Then, in the voids, and is not positive-definite pointwise. Therefore, in general, we do not expect existence of minimizers for .
Loosely, the energy has unstable directions which are not bounded below. Therefore, gradient descent methods – which can be written as fixed-point methods as shown below for this energy formulation and by [schneider2017fft] – do not converge. On the other hand, using Newton iterations along the unstable directions – once these are identified – can greatly improve convergence, which is precisely the role of the Recursive Projection Method.
4.1 Fixed-Point Iterations as Gradient Descent
We consider a gradient / steepest descent approach to minimizing the energy :
The gradient direction in function space is given by , and a gradient flow in the standard norm is .
Consider an explicit update scheme where the constraint equation is also updated at each iteration. We obtain the following fixed-point algorithm:
where is related to the fictitious timestep and mobility, and the superscript indexes the iterations. Notice that the constraint update (step 3 above) requires solution of a homogeneous periodic linear elastic problem with a given right side. This can be done very efficiently using fast Fourier transforms. Each iteration in the loop above is therefore extremely quick, but the key question is whether the algorithm converges and, if so, how many iterations it takes. Given that may not possess minimizers, it follows that gradient descent may not converge.
The Classical FFT, summarized in [moulinec2014comparison], can be written in the fixed point form:
This can be considered as derived from the explicit update scheme above, except that the update for in step 2 goes directly to the minimum – for a given – using that the energy is quadratic. As in the explicit update scheme, this is very fast for each iteration, but whether it converges, and the rate of convergence, depend on the structure of the energy landscape.
The Polarization FFT proposed by [monchiet2012polarization] can be rewritten in the form:
While this form is not as transparent or convenient, it enables us to compare the methods in the context of gradient descent. We notice that in this form, the polarization method can be considered as a relaxation method that mixes in the value from the previous iteration, with and being the relaxation parameters.
5 Numerical Comparisons Between Classical FFT, Accelerated FFT, and RPM-FFT
As a canonical problem for many of the calculations below, we follow [moulinec1998numerical] in using a circular stiff fiber embedded in a compliant matrix, at the center of a square unit cell and periodic boundary conditions (Fig. 1). This example is chosen because of the availability of approximate closed-form solutions. We use the following notation: denotes the fiber radius; denotes the size of the square unit cell; and denote respectively the elastic modulus and the Poisson’s ratio of the fiber, noting that we assume isotropic elasticity; and denote respectively the elastic modulus and the Poisson’s ratio of the matrix, again assuming isotropic elasticity; and is the elastic contrast, where particular focus will be on large values of . The average strain is denoted by the components . For all tests in this section except Section 5.5, the reference medium is assumed to be isotropic and we choose the elastic modulus and the Poisson’s ratio .
As a first test, we simply confirm that RPM-FFT converges to the correct solution in Fig. 1. There is no discernible difference555The tolerance in all calculations reported in this paper is taken to be . in the solutions obtained from the Classical FFT and RPM-FFT methods. Further, Classical FFT takes less time666All times reported in this paper are wall-clock times. than RPM-FFT. This is unsurprising given that the elastic contrast is small (). As increases, the advantages of the RPM-FFT method over the Classical FFT method become clear. We examine this and other issues below, with a more detailed reporting of computational time.
5.1 Time to Convergence vs. Elastic Contrast
We next compare the time taken for convergence across the Classical FFT, Accelerated FFT, and RPM-FFT methods. We continue with the circular fiber in matrix setting, with going from to , and . We use a discretization, and apply an average shear strain: , .
Fig. 2 shows the now well-known dramatic improvement between the Classical FFT and the Accelerated FFT methods; [michel2001computational] show that the rate of convergence in the Classical FFT method goes as the contrast , while for the Accelerated FFT method the rate of convergence goes as the square root of the contrast .
Fig. 2 also compares the Accelerated FFT method and RPM-FFT method. When the contrast is below , the Accelerated FFT method converges faster. However, as the contrast increases, RPM-FFT is increasingly competitive, and surpasses Accelerated FFT. The rate of convergence of RPM goes as for this particular class of examples. The exponent appears to be roughly the same in a few other problems that we tested.
We note that the comparisons for all of these methods was conducted by implementing the codes in Matlab with no additional optimization of any of the methods to ensure fair comparisons.
5.2 Balancing Newton Iterations against Fixed Point
As can be expected, the speed of RPM-FFT depends strongly on the size of the unstable subspace on which Newton iterations must be performed. While the Newton’s method has better convergence properties, each Newton iteration is also far more expensive than the fixed point iteration, particularly as the dimension of the unstable subspace increases. On the other hand, the fixed point method converges slowly, or not at all, on the unstable subspace, which also increases the expense as it requires a large number of iterations.
In this section, the term “unstable subspace” refers to the subspace that the RPM-FFT method identifies; this is not necessarily the true unstable subspace whose corresponding eigenvalues are outside the unit disk.
An optimal method has to balance between the opposing issues identified above. Denoting by the number of fixed-point iterations before increasing the size of the unstable subspace, we notice that setting too low causes the dimensionality of the unstable subspace to increase quickly and makes each Newton iteration expensive; on the other hand, if is too high, there will be minimal progress from the large number of fixed-point iterations before switching to Newton iterations.
We examine this interplay numerically for the model problem described previously, for varying . We first fix , and then compare the times obtained for a large range of going from to . We then repeat over a range of . Of these, we denote the value of with the best (least) time as . Fig. 3 shows the times to convergence for a fixed value of , as well as the times to convergence for , as a function of . An immediate conclusion is that there is almost no difference at low , and not much difference at higher .
We next compare the effect of the size of the problem on , for a fixed value of . Fig. 4 shows for different problem sizes, here identified with the number of grid points in each direction. An immediate conclusion is that increases with the problem size.
5.3 Effect of Volume Fraction and Problem Size
In the model problem of a stiff fiber in a soft matrix, the volume fraction of the stiff fiber has been taken to be small; for , the stiff fiber has a volume fraction . Note that we use the term volume fraction though we are working in the 2D setting.
We expect that the size of the unstable subspace is related to the size of the fiber based on the heuristics discussed in Sections 1, 3 and 4. As the volume fraction increases, we therefore expect a a larger unstable subspace and consequently more expensive Newton iterations.
We examine this issue through numerical experiments, and find that RPM-FFT works well even at the combination of large elastic contrast and volume fraction of fiber . For instance, even with or higher, and , RPM-FFT is easily able to converge. Fig. 5 shows the time to convergence as a function of . While larger elastic contrasts require more time to converge – in line with our observations above – we notice that the time to converge is almost insensitive to for volume fractions larger than .
We next consider the case of fixed but with changing problem size with a fixed geometry. That is, we compare discretizations of varying coarseness. The number of grid points within the fiber, as well as overall, increases as the discretization is made finer. We consequently expect the unstable subspace identified by RPM to also increase. Fig. 6 shows the relation between the size of the unstable subspace as the problem size increases from to , for different values of elastic contrast . The volume fraction is held fixed . Roughly, we observe a logarithmic scaling between the size of the unstable subspace and the number of grid points within the fiber.
5.4 Interaction Between Stiff Fibers
Motivated by recent advances in functional composites [Dhriti, roy2017review], we examine the setting of two circular carbon fibers embedded in a polyurethane matrix. It is a significant challenge to characterize the effective properties of heterogeneous materials such as nanofiber- and nanoparticle- reinforced nanocomposites starting from the scale of resolving the entire microstructure. This makes it challenging to develop physics-based high-fidelity model that predict the performance of such nanocomposite systems. The FFT methods can help to make stronger direct connections between mechanics modeling and experimental data, which in turn can have a significant impact on understanding the structure-property relations of these complex heterogeneous nanocomposite systems.
We consider the geometry of Fig. 7 with two fibers that are either close to each other – the separation is smaller than the smaller fiber diameter – and when they are further apart. The moduli are assumed to be GPa for the carbon fiber and GPa for the polyurethane, giving an elastic contrast of . Two fibers, radius and , are located in a rectangular unit cell of size , with . We examine two cases: when the fibers are far apart with distance between the centers of the fibers, and when the fibers are closer with distance , see Fig. 7. We apply a uniaxial average strain: .
Fig. 7 shows strain and strain energy density fields for the two configurations. In all of these configurations, the strain and strain energy density fields are essentially within the stiff fibers, and also show good convergence properties with no obvious spurious oscillations near the transition from matrix to fiber. We notice that in the case where the fibers are further apart, the energy density and strain fields suggest that the fibers can be essentially considered dilute and non-interacting. However, when the fibers are closer together – as is typical of experimental specimens [Dhriti, roy2017review] – there is a clear interaction between the fields, and the energy density and strain fields take their largest values in the region between the fibers.
5.5 RPM Applied to the Accelerated FFT Method
Two important advantages of the RPM approach are, first, that it uses the fixed-point iterations to identify the unstable subspace; and, second, once identified, it performs Newton iterations on the unstable subspace with fixed-point iterations on the entire space. This makes it easy to apply the RPM strategy to existing fixed-point methods with minimal changes to algorithm / code. We therefore exploit this to apply the RPM approach to the Accelerated FFT method which is also based on a fixed-point framework. Given that the Accelerated FFT method has much superior performance compared to the Classical FFT method, we might expect that a RPM-Accelerated FFT method likely performs even better. We examine this issue below.
In [moulinec2014comparison], it is shown that the Accelerated FFT method is a special case of the general polarization-based fixed-point method developed by [monchiet2012polarization]. Algorithm 3 is the general polarization-based method, where and can take any value within a range. For the choice , the polarization-based method reduces to the Accelerated FFT method. In [moulinec2014comparison], they also discuss and compare the Accelerated FFT method with other polarization-based methods, corresponding to the choices and .
All of these methods are fixed-point based methods, and therefore require that the eigenvalues of the Jacobian are within the unit disk for convergence. The Accelerated FFT method has the lowest upper bound [moulinec2014comparison], and we therefore combine this with the RPM method.
In all of these methods, an important part of the method is the additive decomposition of the heterogeneous stiffness into a homogeneous reference medium and a fluctuation; see Section 4. The convergence rate is typically very sensitive to the choice of reference medium.
It has been observed that the choice of the reference medium affects the rate of convergence in many fixed-point based schemes. In the classical FFT method [moulinec1998numerical], the best choice was found to be the average of the supremum and infimum of the elastic moduli. In Accelerated FFT [michel2001computational] and Polarization FFT [monchiet2012polarization] the optimum is shown to be when computing with a very high precision ().
However, as also discussed in Section 4, we expect the RPM-based approach to be insensitive to this choice. Therefore, we apply the RPM approach to the Accelerated FFT method, which is readily obtained by simply replacing the fixed-point operation in algorithm 2 with the iteration procedure in algorithm 3.
Table 1 compares the number of iterations and the time for Accelerated FFT and RPM-Accelerated FFT for a wide range of contrasts. The enhancement provided by RPM is increasingly large as the elastic contrast increases.
|Number of Iterations||Time (seconds)|
We next examine the sensitivity to the choice of reference medium for Accelerated FFT and RPM-Accelerated FFT. Fig. 8 compares the time to convergence and number of iterations – for a wide range of – for the usual Accelerated FFT method against the RPM-Accelerated FFT method, for various choices of reference medium.
When K is small, the performance of the Accelerated FFT follows what has been observed in [moulinec2014comparison]: namely, the best choice is at when . But this is not the case for . Meanwhile, RPM-Accelerated FFT is much less affected by the different choices of the reference over all of the s being tested. As pointed out in [moulinec2014comparison], the choice is valid when the precision is high (), which explains Fig. 8 (e,f,g,h), as all tests here are computed with precision which is more in line with standard practice.
We notice, particularly at high contrast, that the use of RPM in conjunction with Accelerated FFT makes the method much more robust with regard to the choice of reference medium.
6 Concluding Remarks
We have applied RPM [shroff1993stabilization] to the problem of determining the stress and strain in the unit cell of a periodic linear elastic material. Existing methods for this problem, e.g. Classical FFT [moulinec1998numerical] and Accelerated FFT [michel2001computational] methods, are fixed-point methods, and can have difficulty converging when the eigenvalues of the Jacobian lie outside the unit disk. While Newton methods can recover the convergence, they are very expensive on the high-dimensional problems of interest. RPM [shroff1993stabilization] provides an elegant and efficient balance between fixed-point and Newton methods: it uses the fixed-point iterations to adaptively identify the unstable subspace that requires Newton, and performs Newton only on that subspace. Fixed-point iterations can be performed on the complementary stable subspace. For practical reasons of not changing existing algorithms and code, fixed-point is performed on the entire space, but is then projected to the complementary stable subspace.
A variational perspective, using analogies from electromagnetism, provides insight into the reason that the RPM decomposition can work well. In particular, we expect that there are flat or unstable directions in the energy landscape in the context of certain formulations of heterogeneous linear elasticity, particularly when the stiffness tensor can vanish, as in the case of voids in materials. The RPM-FFT method proposed in this paper exploits this structure to perform Newton iterations in directions along which the fixed-point would be unstable.
Our results show that the RPM-FFT is more efficient than the fixed-point methods as the contrast is increased. Further, all of the fixed-point methods mentioned above require the use of a homogeneous reference medium as an important part of the method, but there is typically no systematic approach to selecting the reference medium. We see that while the fixed-point methods can be sensitive to the choice of reference medium, the RPM-FFT method is extremely robust in terms of convergence for a wide range of choices for the reference medium.
We also note the feature of RPM that it can be easily used to “wrap-around” any fixed-point algorithm. Consequently, while we have used it here for two specific instances of FFT-based methods, it can be easily combined with other formulations if they are found to be more efficient.
While we have applied RPM-FFT to the context of mechanical response, our overall approach is applicable to numerous linear problems with the same structure in homogenization [milton2002theory]. Further, while we have not examined nonlinear problems here, it is relatively straightforward to do this. Briefly, most methods for nonlinear problems using the FFT consist of the repeated use of linear solves [lebensohn2020spectral], and the method presented can readily the other approaches in a modular way.
Finally, we note that while we have used Newton iterations for the space that is unstable under fixed-point iterations, it is possible to replace or combine Newton iterations with a conjugate gradient or other solver that could provide better performance [vondvrejc2014fft, gelebart2013non, kabel2014efficient]. Potential improvement can be achieved by combining the model order reduction techniques for FFT solvers as proposed in [kochmann2019simple] with RPM. This provides a direction for future exploration. In this context, we particularly highlight the recent interesting approach proposed in [schneider2017fft], where a different variational principle is shown to provide promising results.
We thank Timothy Breitzman, Anthony Rollett, and Yi-Chung Shu for useful discussions. We acknowledge financial support from ARO Numerical Analysis (W911NF-17-1-0084), ONR Applied and Computational Analysis (N00014-18-1-2528), NSF DMREF Program (1628994), NSF Mechanics of Materials and Structures (1635407), and ARO MURI Program (W911NF-19-1-0245). We acknowledge the National Science Foundation for computing resources through the XSEDE program (TG-DMR120046) provided by Pittsburgh Supercomputing Center. Kaushik Dayal acknowledges an appointment to the National Energy Technology Laboratory Faculty Research Participation Program sponsored by the U.S. Department of Energy and administered by the Oak Ridge Institute for Science and Education.
Availability of Codes
The code developed and used for the calculations in this paper is available at