1. Introduction
The equations of phasefield models of brittle fracture present a number of challenges to the designers of numerical solution algorithms [1]
. Even in the smallstrain case the equations are nonlinear, due to the multiplicative coupling of the mechanical stresses to the degradation function. At the same time the nonhealing property introduces an inequality constraint. Finally, eigenvaluebased splittings of the energy density as in
[21] make the equations nondifferentiable.In this paper we focus on the spatial problems of smallstrain brittlefracture phasefield models obtained by a suitable timediscretization. The standard approach to solving these spatial problems is based on operator splitting. Algorithms based on this approach, also known as staggered schemes, alternate between solving a displacement problem with fixed damage and a damage problem with fixed displacement. Both subproblems are elliptic and wellunderstood, and such methods are therefore straightforward to implement. The method can be interpreted as a nonlinear Gauß–Seidel method [8], which provides a natural framework for convergence proofs. Applications of the operator splitting scheme and its extensions appear, e.g., in [4, 6, 20].
In contrast, other works propose monolithic solution schemes based on Newton’s method [9, 8, 32, 31]. For the unmodified Newton method only local convergence can be shown, and failure to converge for large load steps is readily observed in practice [34]. Therefore, various authors have proposed extensions or modifications of the Newton idea to stabilize the method. In [9] a line search strategy is applied to enlarge the domain of convergence of Newton’s method. Wu et al. [34] propose to use the BFGS quasiNewton algorithm, claiming that it is more stable than Newton’s method and more efficient than operator splitting. Heister et al. [16] proposed a modified Newton scheme which was later improved by Wick [32] with an adaptive transition from Newton’s method to the modified Newton scheme. Finally, in [27, 19], the authors suggest an arclength method based on the fracture surface, and an adaptive time stepping scheme to enhance the robustness.
In summary, while monolithic Newtontype methods are reported to be faster than operatorsplitting algorithms, the latter ones are more robust [1, 34].
Various approaches are used by these methods to deal with the damage irreversibility. A natural approach is to regularize the constraint, as investigated in [21, 30]. This leads to an additional parameter, and to illconditioned tangent matrices [10]. A second approach considers the thermodynamic driving force of the fracture phasefield as a global unknown yielding a threefield formulation which results in a saddlepoint principle [21]. In a third formulation one considers the Karush–Kuhn–Tucker conditions and shifts the thermodynamic driving force of the fracture phasefield into a local history field representing the maximum over time of the elastic energy [20]. The nondifferentiable maximum function is then discretized explicitly in an otherwise implicit time discretization (Section Appendix: Reformulation with a history field). Unfortunately, this approach spoils the variational structure of the spatial problems.
Augmented Lagrangian solvers as in [30] introduce extra variables. Closest in spirit to the present manuscript is the use of boundconstrained optimization solvers, used, e.g., in [2, 8, 33]. None of these approaches are fully satisfactory.
The effect of the nondifferentiable terms caused by anisotropic splits of the mechanical energy density as in [21] is rarely discussed in the literature. Hybrid formulations like the one proposed in [1] try to overcome the additional computational cost of these anisotropic models, again at the cost of sacrificing the variation structure.
All these approaches are slow in the sense that they have to solve global partial differential equations at each Newton or operatorsplitting iteration. When these methods use direct sparse solvers for the linear tangent problems, memory consumption can become problematic, too. At the same time, the problem of smallstrain phasefield brittlefracture has a lot of elegant variational structure; in particular, it fits directly into the rateindependent framework of
Mielke and Roubíček [22]. As a consequence, implicit time discretization leads to a sequence of coercive minimization problems for the deformation and damage fields together. These problems are not convex, but they are biconvex, i.e., convex (even strongly convex) in each variable separately. Pointwise inequality restrictions to handle the irreversibility of the fracture process as proposed in [21] reduce the smoothness of the objective functional, but do not otherwise influence its convexity or coercivity properties. The same holds for anisotropic energy splits based on linear quantities or the eigenvalues of the mechanical strain.In recent years, nonsmooth multigrid methods have shown to be able to solve nonsmooth problems from mechanics efficiently without the need for solving global problems [14, 26, 15]. This makes them vastly more efficient than operatorsplitting or Newtonbased methods. As there are no sparse matrix factorizations, memory consumption remains low. In addition, these methods can be shown to converge globally, i.e., from any initial iterate and for any load step. This works by exploiting the abovementioned variational structure, together with certain separability properties. As one such method, TNNMG can treat the pointwise constraints of the increment problems directly, i.e., without artificial penalization or simplifications by specific time discretizations. The idea is that TNNMG only needs to handle these constraints in a series of lowdimensional subproblems, each of which is easy to solve by itself. As a consequence, solving the problems with constraints is not appreciably slower than solving the corresponding unconstrained problem.
In this paper we show how the TNNMG method can be used to solve smallstrain brittle fracture problems. This involves in particular verifying that the increment functionals have the required convexity and smoothness properties. We show this for a range of different degradation functions and local crack surface densities (including both, the standard Ambrosio–Tortorelli2 functional and the more involved Ambrosio–Tortorelli1 functional). We also show the results for isotropic elastic energies and for energies with a spectral split, using the theory of spectral functions.
This paper is organized as follows: Chapter 2 discusses a framework of smallstrain phasefield models for brittle fracture, and shows the range of applicability of the TNNMG solver. Chapter 3 introduces the natural fully implicit time discretization, and proves existence of solutions for the spatial problems. In both chapters we pay particular attention to the mathematical properties of the energy functionals. In Chapter 4, finally, we introduce the TNNMG method. We explain its construction, discuss various algorithmic options, and prove that it converges globally to stationary points of the increment energy functional.
2. Phasefield models of brittle fracture
This chapter presents a range of phasefield models for brittle fracture, and discusses its smoothness and convexity properties.
Consider a deformable dimensional object represented by a domain . The deformation of such an object is characterized by a displacement field
. The object is supposed to exhibit smallstrain deformations and elastic material behavior only, and we therefore introduce the linearized strain tensor
. Following [21], we model the fracturing by a scalar damage field , where signifies intact material, and a fully broken one. Dirichlet boundary conditions can be posed both for the displacement and for the damage field. For this we select two not necessarily equal subsets of the domain boundary, and requirewhere and are two given functions.
Displacement and damage field evolve together, governed by a system of coupled nonsmooth partial differential equations. Disregarding inertia effects, we obtain a rateindependent system in the sense of Mielke and Roubíček [22]. Such a system can be written using the Biot equation
(1) 
where means the Gâteaux derivative with respect to the second and third arguments of , and is the convex subdifferential with respect to the second argument of the dissipation potential .
In this equation, is a potential energy, which we assume to be of the form
(2) 
The term is a degraded elastic energy density, and will be discussed in detail in Section 2.1. The term models the local crack surface density, and will be discussed in Section 2.2. The number is Griffith’s critical energy release rate, a material parameter. represents timedependent volume and surface forces, which drive the evolution. We assume that is linear and continuous in , and differentiable in with bounded time derivative.
The last term of (2) implements the restriction that the damage field can only assume values between and . For a set we define the indicator functional
For a closed, convex, nonempty set , the functional is convex, lower semicontinuous, and proper. Adding the constraint explicitly is not always necessary, as some fracture models lead to evolutions that satisfy the constraints implicitly. However, as pointwise bounds come with practically no cost when using the TNNMG solver, we do include them to extend our range of models.
To make the potential energy well defined, we will in general consider it on the firstorder Sobolev space . Incorporating the boundary conditions leads to the affine subspace
The second term of the Biot equation (1) is , where is the dissipation potential
(3) 
It implements the pointwise nonhealing condition as proposed by [21]. Note that is convex and lower semicontinuous, and . The fact that is positively 1homogeneous implies the rateindependence of the system.
Remark 2.1.
In the engineering literature, the same problem is frequently formulated as
with the rate potential
(4) 
This formulation is equivalent to the Biot equation (1) if the problem is sufficiently smooth.
We now in turn discuss the potential energy and the dissipation potential.
2.1. Degraded elastic energy density
We consider models that behave linearly elastic and isotropic if the material is in an undamaged state. That is, for the undamaged stored energy density we use the St. Venant–Kirchhoff material law, whose energy density is given by
with Lamé parameters and .
The undamaged energy density is split additively into a part that produces damage and another part that does not. The damageproducing part is then scaled by a socalled degradation function
and the energy density takes the form
(5) 
where is the space of symmetric matrices. The residual stiffness guarantees a wellposed problem in case of fracture.
Various different degradation functions have appeared in the literature [28, 17]. While the details vary, there appears to be agreement on the following properties:
Assumption 2.1.
The degradation function is differentiable, monotone decreasing, and fulfills and .
Note that several authors require in order to ensure that the evolution does not lead to values of larger than 1. We do not need this assumption here, because the pointwise constraint is enforced explicitly by the energy term (2).
The following specific degradation functions all fulfill Assumption 2.1:
Note that the functions and are strictly convex, but and are not even convex. For the rest of the paper we will restrict our considerations to convex twice continuously differentiable degradation functions .
Various splittings of have been proposed in the literature. We cover four common strainbased splittings taking the form (5).^{1}^{1}1The stressbased splitting of Steinke and Kaliske [28] is left for future work. All those splitting have the property that , and we will show that all have the following essential properties:

for all and for all , i.e., is differentiable with locally Lipschitz continuous derivative.

The gradient is semismooth for all .

The gradient is globally Lipschitz continuous uniformly in , i.e., there exists independent of such that for all matrices we have

is strongly convex uniformly in , i.e., there exists independent of such that for all matrices we have

is coercive uniformly in in the sense that there exists independent of such that .
We remind that the gradient is called semismooth if the limit
exists for any point and any direction . The set denotes Clarke’s generalized Jacobian of the locally Lipschitz continuous map at (cf. [25]). Notice that the strong convexity d implies strong monotonicity of , i.e.,
For the splittings considered in the following we will only prove a and b directly, and show that the simplified assumptions of the following lemma hold true. This then implies c, d, and e.
Proof.
Let and be the Lipschitz constants of and , respectively. Then is Lipschitz continuous with uniform Lipschitz constant , because .
To show strong convexity, we first note that is strongly convex on with a modulus independent of . Now consider the function
for . Since this is a weighted sum of two convex functions and with nonnegative weights and , it is itself convex. Thus, as a sum of this convex function and the strongly convex functions , the function is itself strongly convex and inherits the convexity modulus of . Finally, we note that with the same and we have
Despite those strong properties of we note that is not even convex in and together for any of the splittings considered below.
2.1.1. Isotropic splitting
In this model, any strain will lead to damage. The splitting is therefore
(6) 
Without proof, we note the following simple properties of the energy density defined by (5) and this splitting:
2.1.2. Volumetric decompositions
The isotropic model is unphysical, because it produces fracturing for all kinds of strain. In [18], Lancioni and RoyerCarfagni obtained better results by letting only the deviatoric strain contribute to the degradation. They introduced the split
with the deviatoric–volumetric strain splitting
With these definition, the energies are
(7) 
Lemma 3.
Proof.
The decomposition of Lancioni and RoyerCarfagni is still isotropic. Amor et al. [2] proposed to only degrade the expansive part of the volumetric strain. Using the ramp functions
that provide the decompositions and , they proposed the energy split
(8) 
where only the tensile volumetric strain contributes to damage.
Lemma 4.
Proof.
We first note that the squared ramp functions are convex, with derivatives having a global Lipschitz constant , and piecewise (in the sense of [29, Definition 2.19]). Hence the functions are also with globally Lipschitz gradients and piecewise , which shows a and (using Lemma 1) c. Being piecewise implies semismoothness b of [29, Proposition 2.26]. Noting that , convexity of the squared ramp functions furthermore implies that the functions are also convex and nonnegative, which by Lemma 1 provides d and e.
For the functional is quadratic and thus . In the case , if would be , then the function would also be . However, this function takes the form
and is thus piecewise quadratic but not in . ∎
2.1.3. Spectral decomposition
A more elaborate nonlinear splitting separating the tensile and compressive parts of the elastic energy was introduced in [21]. To define this splitting it is convenient to introduce the ordered eigenvalue function on the space of symmetric matrices, mapping any symmetric matrix
to the vector
containing its eigenvalues in ascending order. Using the ramp functions the tensile and compressive energies and are then defined as(9) 
Note that this indeed defines a splitting . For this splitting we will make the additional assumption that .
To quantify the properties of with respect to the strain tensor we use the theory of spectral functions. To this end we note that we can write as
with
(10) 
The functions are symmetric in the sense that does not depend on the order of the entries of . Having this form we can infer properties of the functions from properties of the symmetric functions .
Lemma 5.
Proof.
We will first show a–e. An essential ingredient is that the squared ramp functions are nonnegative, piecewise quadratic, and convex.
a The squared ramp functions and thus are . Now [24, Proposition 4.3] shows that the spectral functions are also . Hence the same applies to .
b The squared ramp functions are piecewise functions. Hence the gradients are piecewise functions (in the sense of [29, Definition 2.19]) and thus semismooth [29, Proposition 2.26]. Now [24, Proposition 4.5] provides semismoothness of and thus of .
c Since the functions are piecewise quadratic and the gradients are globally Lipschitz continuous. Now Corollary 43 of [3] provides global Lipschitz continuity of the gradients of the spectral functions in the more general context of Euclidean Jordan algebras (which includes the special case of symmetric matrices). In fact, the Lipschitz constant of equals the one for if is equipped with the Frobenius norm. Using Lemma 1 this implies uniform Lipschitz continuity of .
d,e Since the functions are weighted sums of convex, nonnegative squared ramp functions with nonnegative weights, they are convex and nonnegative themselves. Convexity of the functions then follows from [3, Theorem 41] while nonnegativity of those functions is trivial. Now Lemma 1 provides d and e.
To characterize second order differentiability of we first consider . Then coincides with the quadratic function and is thus . In the case , if would be , then the function for the fixed matrix with would also be . However, this function takes the form
and is thus piecewise quadratic but not in . ∎
Remark 2.2.
One can show that decomposes into finitely many disjoint subsets such that is twice continuously differentiable in the interior in each of these sets. A matrix is in the intersection of several if it either has an eigenvalue or if . While is not differentiable at those points, there are still generalized secondorder derivatives. For example, the generalized Jacobian in the sense of Clarke contains the derivatives of with respect to all the adjacent sets . Semismoothness essentially means that such generalized derivatives provide an approximation that can be exploited in a generalized Newton method.
Remark 2.3.
The additional assumption is essential for convexity of . To see this we consider for the line segment
Then, along this line segment, is quadratic and takes the form
which is strictly concave for and sufficiently small .
2.2. Crack surface density
The crack surface density function per unit volume of the solid is typically of the form [23]
with parameters , , and , and a parameter function . The internal length scale parameter controls the size of the diffusive zone between a completely intact and a completely damaged material. For the regularized crack surface yields a sharp crack topology in the sense of convergence. For a given function , the normalization constants and must be chosen such that the integral of over the fractured domain converges to the surface measure of the crack set as .
The function models the local fracture energy. Two types of local crack density functions appear in the literature. Double well potentials (as briefly reviewed in [1]), provide an energy barrier between broken and unbroken state, but will be disregarded here. Instead, we focus on the two widely used functionals
(11) 
and
(12) 
They are referred to in the literature as Ambrosio–Tortorelli (AT) functionals of type 1 and 2, respectively.
Some authors like [17] prefer because it has a local minimizer at . Thus, in the absence of mechanical strain, the unfractured solution is a minimizer of the total energy. As a result, no additional constraints need to be applied to ensure that . However, this argument becomes void when solver technology is available that can handle the explicit constraints . In contrast, for the AT1 functional we have in the intact state . Together with the constraint this leads to a threshold, i.e., a minimum load required to cause damage [23].
Kuhn et al. [17] proposed to regard the Ambrosio–Tortorelli functionals as special instances of the general family defined by
(13) 
with . The Ambrosio–Tortorelli functionals are obtained by setting for AT1 and for AT2. Further choices of are proposed in [23], which also do a detailed stability analysis for onedimensional problems.
We note the following properties of the functional in (13):
Lemma 6.
The function given in (13) has the following properties:

It fulfills and .

It is strictly monotone increasing on for all .

It is convex for all , and strictly convex for all .
For the rest of the paper we will assume the takes the form (13) with such that is guaranteed to be convex and quadratic.
3. Discretization and the algebraic increment potential
We use a fully implicit discretization in time, and Lagrange finite elements for discretization in space. Note that timediscretizations using a local energy history field [20] are typically only semiimplicit (cf. Section Appendix: Reformulation with a history field). By using a fully implicit timediscretization we retain the variational structure of the problem. Most of this chapter is spent investigating the properties of the increment functional.
3.1. Time discretization
It is shown in [22] that there is a natural time discretization for (1) that consists of sequences of minimization problems. Let the time interval be subdivided by time points , . We obtain a timediscrete formulation by integrating the rate potential given in (4) along paths on . Given initial values at , the continuous solution on this interval minimizes in the set of all paths of sufficient smoothness with [22]. We then set to be . Defining as the set of suitably smooth paths in from to , and using that , this can be written as
The last integral term is the length of with respect to the Finsler norm given by . Minimizing over all paths gives the distance
Then one step of the time discretization scheme is given by
In our particular case (3), does not explicitly depend on . The dissipation distance is then easily computed as [22, Example 3.2.5]
We obtain the minimization problem
(14) 
with the increment potential
(15) 
Note that the time step size does not appear in this functional, because the model is rateindependent. Note also that the increment potential depends on the previous time step only through the indicator functional.
Lemma 7.
Assume that is nontrivial in the sense that its dimensional Hausdorffmeasure is positive. Then the functional is coercive on .
Proof.
Using the uniform coercivity e of , , and for we get
for some constant . Using Korn’s inequality for , the Poincaré inequality for , and the fact that grows at most linearly we get for another constant
where we have used that the constraint implies in the second inequality. ∎
Lemma 8.
The functional is weakly lower semicontinuous on .
Proof.
Since weak lower semicontinuity of the other terms in follows from convexity and lower semicontinuity of the integrands, we only need to consider the nonconvex term
(16) 
To this end we note that (16) can be written as for
and the density given by
Since is a Carathéodory function, nonnegative (and thus uniformly bounded from below), and convex in for all , it satisfies the assumptions of Theorem 3.4 in [7].
Now let be a weakly convergent sequence in . Then, by the compactness of the embedding into we get
Furthermore, the weak convergence of implies weak convergence of
because is in for each . Now Theorem 3.4 of [7] provides
As a direct consequence of coercivity and weal lower semicontinuity we get existence of a minimizer of the increment functional:
Theorem 3.1.
There is a solution to the minimization problem (14), i.e., there exists a global minimizer of .
3.2. Finite element discretization
The increment problem (14) of the previous section is posed on the pair of spaces for the displacements and for the damage variable. Let be a conforming finite element grid for . We discretize the function spaces by standard firstorder Lagrangian finite elements. In order to derive an algebraic form of the discretized increment functional we make use of the standard scalar nodal basis associated to the grid nodes . Identifying the valued and scalar finite element functions and with their coefficient vectors and , respectively, we write
where and . For the integration we use two kinds of quadrature rules: Integrals of smooth nonlinear terms over a grid element are approximated using a higherorder quadrature rule , while the integral over the nonsmooth term is approximated using the grid nodes as quadrature point, which is ofter referred to as lumping. Using these approximations we obtain the algebraic increment functional given by
(17) 