1. Introduction
On a real Hilbert space with inner product and induced norm , we consider a nonlinear operator , where denotes the dual space of . We aim to generate optimally converging (with respect to the number of degrees of freedom and the total computational time) finite element approximations of the following nonlinear operator equation:
(1) 
In weak form, this problem reads:
(2) 
here, signifies the duality pairing on . For the purpose of this work, we suppose that satisfies the following conditions:

Lipschitz continuity: There exists a constant such that

Strong monotonicity: There exists a constant such that

The operator possesses a potential, i.e., there exists a Gâteaux differentiable (energy) functional such that .
Assuming (F1) and (F2), the main theorem of strongly monotone operators guarantees that (1) (or equivalently (2)) has a unique solution ; see, e.g., [Neč86, §3.3] or [Zei90, Thm. 25.B].
1.1. Iterative linearization
Following the recent approach [HW20a], we apply a general fixed point iteration scheme for (1). For given , we consider a linear and invertible preconditioning operator . Then, the nonlinear problem (1) is equivalent to . For any suitable initial guess , this leads to the following iterative scheme:
(3) 
Note that the above iteration is a linear equation for , thereby rendering (3) an iterative linearization scheme for (1). Given , the weak form of (3) is based on the bilinear form , for , and the solution of (3) can be obtained from
(4) 
Throughout, for any , we suppose that the bilinear form is uniformly coercive and bounded, i.e., there are two constants independent of such that
(5) 
and
(6) 
For any given , owing to the Lax–Milgram theorem, these two properties imply that the linear equation (4) admits a unique solution .
1.2. Iterative linearized Galerkin approach (ILG)
In order to cast (4) into a computational framework, we consider closed (and mainly finite dimensional) subspaces , endowed with the inner product and norm on . Then, ILG [HW20a] is based on restricting the weak iteration scheme (4) to . Specifically, for a prescribed initial guess , define a sequence inductively by
(7) 
Note that (7) has a unique solution, since the conditions (5) and (6) above remain valid for the restriction to . For the purpose of the convergence results in this paper, we introduce a monotonicity condition on the functional from (F3).

There exists a (uniform) constant such that for any closed subspace , the sequence defined by (7) fulfils the bound
(8) where is the potential of (restricted to ) introduced in (F3).
If satisfies (F1)–(F3) and if the ILG bilinear form from (4) is coercive (5) with coercivity constant , then (F4) is fulfilled; see [HW20b, Prop. 2.1]. Moreover, upon imposing alternative conditions, we may still be able to satisfy (8) even when . For instance, [HW20a, Rem. 2.8] proposed an a posteriori step size strategy that guarantees the bound (8) in the context of the damped Newton method. This argument can be generalized to other methods containing a damping parameter. Moreover, the Kačanov linearization approach (see (22) below) satisfies (8) with without requiring ; see, e.g., [HW20a, Eq. (18)].
1.3. Adaptive iterative linearized finite element method (AILFEM)
We restrict ILG (7) to a sequence of Galerkin subspaces with corresponding sequences for given by
(9) 
The spaces will be conforming finite element spaces associated to admissible triangulations of an underlying bounded Lipschitz domain , with . AILFEM (Algorithm 1 below) exploits an interplay of adaptive mesh refinements and ILG (9) and chooses the initial guesses and for all and appropriate indices .
1.4. Goal of the paper
The work [GHPS18] has recently analyzed the AILFEM for the solution of (1) in the specific context of a Zarantonello linearization approach originally proposed in [CW17]. The results of [GHPS18] have been extended to contractive iterative solvers in [GHPS20]. In particular, it has been shown that an optimal convergence rate with respect to the number of degrees of freedom as well as with respect to the overall computational cost can be achieved. The purpose of the present note is to show that these optimality results from [GHPS20] also apply to AILFEM beyond the Zarantonello linearization approach (cf. [HW20b] for some convergence results in abstract spaces). Especially, we will exploit (F4) to establish an energy contraction property for the fixed point iteration (3), which constitutes the crucial ingredient in the analysis of [GHPS20]. In particular, we will revisit a few example AILFEM schemes from [HW20a], and verify our theoretical findings numerically.
2. Energy contraction
In this section, we will establish an energy contraction result for ILG (7). To this end, let be a closed linear subspace of as in §1.2. Moreover, denote by the unique solution of the equation
(10) 
where existence and uniqueness of follows again from the main theorem of strongly monotone operators.
Proposition 2.1.
Consider the sequence generated by the iteration (7). If (F1)–(F4) are satisfied and fulfils (5)–(6) for any , then there holds the energy contraction property
(11) 
with a contraction constant
(12) 
independent of the subspace and of the iteration number . In addition, the sequence converges to the unique solution of (10).
Proof.
With (F2) and since is the (unique) solution of (10), for , we first observe that
Recalling that , (7) and (6) prove that
Altogether, we derive the a posteriori
error estimate
(13) 
Next, we exploit the structural assumptions (F1)–(F3) to obtain the wellknown inequalities
(14) 
see, e.g., [HW20b, Lem. 2.3] or [GHPS18, Lem. 5.1]. For any , we thus infer that
This proves, in particular, that . Finally, iterating this inequality, we obtain that
In particular, we conclude that in as . ∎
3. Adaptive finite element discretization
In this section, we thoroughly formulate AILFEM. For each , the space in (9) will be a conforming finite element space that is associated to an admissible triangulation of an underlying bounded Lipschitz domain , with . Following the recent approach [GHPS20], we will present an adaptive iterative linearization finite element algorithm that exploits the interplay of adaptive mesh refinements and the iterative scheme (9). Throughout this section, we will always assume that satisfies (F1)–(F4) and that (5) and (6) hold true.
3.1. Mesh refinements
We adopt the framework from [GHPS20, §2.2–2.4], with slightly modified notation. Consider a shaperegular mesh refinement strategy such as, e.g., newest vertex bisection [Mit91]. For a subset of marked elements in a regular triangulation , let be the coarsest regular refinement of such that all elements have been refined. Specifically, we write for the set of all possible meshes that can be generated from by (repeated) use of . For a mesh , we assume the nestedness of the corresponding finite element spaces and , respectively, i.e., . In the sequel, starting from a given initial triangulation of , we let be the set of all possible refinements of .
With regards to the optimal convergence rate of the algorithm with respect to the overall computational cost (see the next section §4), a few assumptions on the mesh refinement strategy are required, cf. [GHPS20, §2.8]. These are satisfied, in particular, for newest vertex bisection.

Splitting property: Each refined element is split into at least two and at most many subelements, where is a generic constant. In particular, for all triangulations , and for any , the (onelevel) refinement satisfies
Here, is the set of all elements in which have been refined in , and comprises all unrefined elements.

Overlay estimate: For all meshes , and refinements there exists a common refinement denoted by
which satisfies .

Meshclosure estimate: There exists a constant such that, for each sequence of successively refined meshes, i.e., for some , it holds that
3.2. Error estimators
For a mesh associated to a discrete space , suppose that there exists a computable local refinement indicator , with for all and . Then, for any and , let
(15) 
We recall the following axioms of adaptivity from [CFPP14] for the refinement indicators: There are fixed constants and such that, for all and , with associated refinement indicators and , the following properties hold, cf. [GHPS20, §2.8]:

Stability: , for all and all .

Reduction: , for all .

Discrete reliability: , where is the solution of (10) on the discrete space associated with .
3.3. AILFEM algorithm
We recall the adaptive algorithm from [GHPS18] (and its generalization [GHPS20]), which was studied in the specific context of finite element discretizations of the Zarantonello iteration. These algorithms are closely related to the general adaptive ILG approach in [HW20a]. The key idea is the same in all algorithms: On a given discrete space, we iterate the linearization scheme (9) as long as the linearization error estimator dominates. Once the ratio of the linearization error estimator and the a posteriori error estimator falls below a prescribed tolerance, the discrete space is refined appropriately.
Determine a marking set with minimal cardinality (up to the multiplicative constant ) such that , and set . 
Remark 3.1.
Under the conditions (A1) and (A3), we notice some facts about Algorithm 1 from [GHPS20, Prop. 3] and [GHPS18, Prop. 4.5 & 4.6]: For any , there holds the a posteriori error estimate
(16) 
where depends only on , and . In particular, if the repeat loop terminates with for some , then , i.e., the exact solution is obtained. Moreover, in the nongeneric case that the repeat loop in Algorithm 1 does not terminate after finitely many steps, for some , the generated sequence converges to (in particular, the solution is discrete).
4. Optimal convergence
We are now ready to outline the linear convergence of the sequence of approximations from Algorithm 1 to the unique solution of (1). In addition, the rate optimality with respect to the overall computational costs will be discussed.
4.1. Step counting
Following [GHPS20], we introduce an ordered index set
where counts the number of steps in the repeat loop for each . We exclude the pair from , since either and or even if the loop does not terminate after finitely many steps; see also Remark 3.1. Observing that Algorithm 1 is sequential, the index set is naturally ordered: For , we write if and only if appears earlier in Algorithm 1 than . With this order, we can define the total step counter
which provides the total number of solver steps up to the computation of . Finally, we introduce the notation .
4.2. Linear convergence
Based on the notation used in Algorithm 1, let us introduce the quasierror
(17) 
and suppose that the estimator satisfies (A1)–(A3). Then, appealing to the energy contraction property from Proposition 2.1, we make the crucial observation that ILG (7) satisfies the contraction property (C1) from [GHPS20]. Consequently, [GHPS20, Thm. 4] directly applies to our setting:
4.3. Optimal convergence rate and computational work
Furthermore, we address the optimal convergence rate of the quasierror (17) with respect to the degrees of freedom as well as the computational work. As before, we can directly apply a result from [GHPS20] owing to the energy contraction from Proposition 2.1. For its statement, we need further notation: First, for , let be the set of all refinements of with . Next, for , define
where is the discrete solution (10) on the finite element space related to an optimal (in terms of the above infimum) mesh . For , we note that if and only if the quasierror converges at least with rate along a sequence of optimal meshes.
Theorem 4.2 ([Ghps20, Thm. 7]).
Suppose (R1)–(R3) and (A1)–(A4), and define
Let and such that
Then, for any , there exist positive constants such that
(18) 
Here, the constant depends only on , and , and additionally on and , respectively, if or for some , respectively; moreover, the constant depends only on , and .
The significance of (18) is that the quasierror from (17) decays at rate (with respect to the number of elements or, equivalently, the number of degrees of freedom) if and only if rate is achievable for the discrete solutions on optimal meshes (with respect to the number of elements). If, in addition, all of the (single) steps in Algorithm 1 can be performed at linear cost, , then the quasierror even decays with rate with respect to the overall computational cost if and only if rate is attainable with respect to the number of elements. We note that the total computational cost is proportional to the total computational time, which is therefore monitored in the subsequent numerical experiment.
5. Numerical experiment
In this section, we test Algorithm 1 with a numerical example.
5.1. Model problem
On an open, bounded, and polygonal domain with Lipschitz boundary , we consider the quasilinear secondorder elliptic boundary value problem:
(19) 
Here, is the standard Sobolev space of functions on with zero trace along . For , the inner product and norm on are defined by and , respectively. We suppose that in (19) is given, and the diffusion parameter fulfils the monotonicity property
(20) 
with constants . Under this condition, the nonlinear operator from (19) satisfies (F1) and (F2) with and ; see [Zei90, Prop. 25.26]. Moreover, has a potential given by
where , . The weak form of the boundary value problem (19) reads:
(21) 
For the nonlinear boundary value problem (19), [HW20a, §5.1] proves the convergence of a few particular iterative linearization schemes, which can be cast into the general approach (4). These include the following:

Newton iteration, for a damping parameter :
Here, for , the Gâteaux derivative of is given through
5.2. Discretization and local refinement indicator
AILFEM for (21) is based on regular triangulations that partition the domain into open and disjoint triangles such that . Moreover, we consider associated finite element spaces , where we signify by the space of all affine functions on . The mesh refinement strategy in Algorithm 1 is given by newest vertex bisection [Mit91]. Moreover, for any and any , we define the local refinement indicator respectively the global error indicator from (15) by
(23) 
where signifies the normal jump across element faces, and is equivalent to the diameter of . This error estimator satisfies the assumptions (A1)–(A4) for the problem under consideration; see, e.g.,[GMZ12, §3.2] or [CFPP14, §10.1].
5.3. Computational example
Consider the Lshaped domain , and the nonlinear diffusion parameter , which satisfies (20) with and . Moreover, we choose such that the analytical solution of (19) is given by
where and are polar coordinates; this is the prototype singularity for (linear) secondorder elliptic problems with homogeneous Dirichlet boundary conditions in the Lshaped domain; in particular, we note that the gradient of is unbounded at the origin.
In all our experiments below we set the adaptive mesh refinement parameters to , and . The computations employ an initial mesh consisting of 192 uniform triangles, and with the starting guess . Then, the procedure is run until the number of elements exceeds . We always choose the damping parameter for the Newton iteration, and vary the damping parameter for the Zarantonello iteration, as well as the adaptivity parameter , cf. line 6 in Algorithm 1, throughout the experiments. Our implementation is based on the Matlab package [FPW11] with the necessary modifications.
In general, for the Newton scheme, we note that choosing the damping parameter to be (potentially resulting in quadratic convergence of the iterative linearization close to the solution) might lead to a divergent iteration for the given boundary value problem (cf. [AW14]). Our numerical computations illustrate, however, that this is not of concern in the current experiments. Indeed, for , the bound (8) from (F4) remains satisfied in each iteration. Otherwise, a prediction and correction strategy which obeys the bound (8), could be employed (see [HW20a, Rem. 2.8]). This would guarantee the convergence of the (damped) Newton method.

and : In Figure 1, we display the performance of Algorithm 1 with respect to both the number of elements and the total computational time. We clearly see a convergence rate of for the Kačanov and Newton method, which is optimal for linear finite elements. Moreover, the Zarantonello iteration has a preasymptotic phase of reduced convergence, which becomes optimal for finer meshes. In Figure 2 (left) we observe that the energy contraction factor given by
(24) is inferior for the Zarantonello iteration in the initial phase (compared to the Kačanov and Newton methods), which might explain the reduced convergence. This contraction factor becomes better for an increased number of iterations, see Figure 2 (right), thereby leading to the asymptotically optimal convergence rate for the Zarantonello iteration. Finally, in Figure 3 (left), we display the quotient
(25) which experimentally verifies the assumption (F4).
Figure 1. and : Performance plot for adaptively refined meshes with respect to the number of elements (left) and the total computational time (right). The solid and dashed lines correspond to the estimator and the error, respectively. The dashed lines without any markers indicate the optimal convergence order of for linear finite elements. Figure 2. and : The contraction factor (left, see (24)) and the number of iterations (right) on each finite element space. Figure 3. The quotient from (25) on each finite element space for and (left) and and (right), respectively. 
and : As before, in Figure 4, we display the performance of Algorithm 1 with respect to both the number of elements and the total computational time. We clearly observe the optimal convergence rate of for all of the three iteration schemes presented above from the initial mesh onwards. In contrast to the experiment before, the energy contraction factor from (24) is now of comparable size for all the iteration schemes, as we can see from Figure 5 (left). Moreover, the number of iterations does not significantly differ for the three iterative methods. Again, we plot in Figure 3 (right) the quotient (25) for the numerical evidence of the assumption (F4).
Figure 4. and : Performance plot for adaptively refined meshes with respect to the number of elements (left) and the total computational time (right). The solid and dashed lines correspond to the estimator and the error, respectively. The dashed lines without any markers indicate the optimal convergence order of for linear finite elements. Figure 5. and : The contraction factor (left, see (24)) and the number of iterations (right) on each finite element space. 
and : Once more, we observe optimal convergence rate for all our three iteration schemes with respect to both the number of elements and the total computational time, see Figure 6. The total computational times obtained, however, differ noticeably, see Figure 6 (right), as a consequence of the varying number of iterative linearization steps of the three methods, see Figure 7 (right). In contrast, the energy contraction factor from (24) almost coincides for the different iteration schemes, see Figure 7 (left), which is due to the small adaptivity parameter .
Figure 6.
Comments
There are no comments yet.