Adaptive local minimax Galerkin methods for variational problems

by Pascal Heid, et al.
Universität Bern

In many applications of practical interest, solutions of partial differential equation models arise as critical points of an underlying (energy) functional. If such solutions are saddle points, rather than being maxima or minima, then the theoretical framework is non-standard, and the development of suitable numerical approximation procedures turns out to be highly challenging. In this paper, our aim is to present an iterative discretization methodology for the numerical solution of nonlinear variational problems with multiple (saddle point) solutions. In contrast to traditional numerical approximation schemes, which typically fail in such situations, the key idea of the current work is to employ a simultaneous interplay of a previously developed local minimax approach and adaptive Galerkin discretizations. We thereby derive an adaptive local minimax Galerkin (LMMG) method, which combines the search for saddle point solutions and their approximation in finite-dimensional spaces in a highly effective way. Under certain assumptions, we will prove that the generated sequence of approximate solutions converges to the solution set of the variational problem. This general framework will be applied to the specific context of finite element discretizations of (singularly perturbed) semilinear elliptic boundary value problems, and a series of numerical experiments will be presented.




1. Introduction

Consider a (real) Hilbert space $X$, equipped with an inner product $(\cdot,\cdot)_X$ and an induced norm $\|\cdot\|_X$. Given a nonlinear operator $\mathsf{F}: X \to X^\star$, where $X^\star$ signifies the dual space of $X$, we focus on solutions of the equation

$$\mathsf{F}(u) = 0 \quad \text{in } X^\star. \tag{1}$$

This problem is variational if there exists an underlying functional $E: X \to \mathbb{R}$ such that solutions of (1) arise as critical points of $E$, i.e. if they satisfy the Euler-Lagrange equation

$$E'(u) = 0 \quad \text{in } X^\star, \tag{2}$$

with $E'$ denoting the Fréchet derivative of $E$. A solution to the Euler-Lagrange equation (2) is called a critical point, and the value of the functional $E$ at a critical point is termed a critical value.

The purpose of this paper is to provide a new adaptive algorithm for the numerical solution of (2), which exploits the theoretical framework of the mountain pass critical point theory, in combination with adaptive Galerkin discretizations. The key idea is to exploit an automatic and simultaneous interplay of these two approaches in order to design a highly effective numerical approximation procedure. This will be illustrated in the specific context of classical adaptive finite element discretizations of singularly perturbed semilinear partial differential equations (PDE). Such problems have wide-ranging applications in practice (including, e.g., nonlinear reaction-diffusion in ecology and chemical models [8, 14, 17, 28, 29], economics [5], or classical and quantum physics [6, 31]); they are, however, notoriously challenging to solve numerically.

Minimax theory

Basic critical point theory in the calculus of variations pays attention to critical points that are either local minima or maxima of a given functional $E$. Many solutions to nonlinear variational problems of practical relevance, however, occur as unstable critical points, i.e. they are neither a (local) maximum nor a (local) minimum. Such unstable critical points are called saddle points. More precisely, a saddle point of a functional $E$ is an element $u^\star \in X$ such that $E'(u^\star) = 0$, and for any (open) neighbourhood $N$ of $u^\star$ there are $v, w \in N$ with

$$E(v) < E(u^\star) < E(w).$$

In the minimax theory of Ambrosetti-Rabinowitz [2], see also [30, §2], saddle points appear as solutions to a two-level optimization problem of the form

$$\min_{A \in \mathcal{A}} \max_{u \in A} E(u),$$

where $\mathcal{A}$ is a collection of subsets of $X$. In this context, a central result for the existence of (multiple) critical points, especially of saddle points, is the mountain pass theorem. It is based on the so-called Palais-Smale compactness condition of the functional $E$:

  (PS) Any sequence $\{u_n\}_n \subset X$ for which $\{E(u_n)\}_n$ is a bounded sequence in $\mathbb{R}$, and $E'(u_n) \to 0$ in $X^\star$, as $n \to \infty$, possesses a convergent subsequence.

Theorem 1.1 (Mountain pass theorem).

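Let $X$ be a (real) Hilbert space and let $E \in C^1(X;\mathbb{R})$ satisfy the Palais-Smale condition (PS). Suppose that

  1. $E(0) = 0$;

  2. there are constants $\rho, \alpha > 0$ such that $E(u) \ge \alpha$ for all $u \in X$ with $\|u\|_X = \rho$;

  3. there exists an element $e \in X$ with $\|e\|_X > \rho$ such that $E(e) \le 0$.

Then, $E$ possesses a critical value $c \ge \alpha$, which can be expressed by

$$c = \inf_{g \in \Gamma} \max_{t \in [0,1]} E(g(t)),$$

where $\Gamma = \{ g \in C([0,1]; X) : g(0) = 0,\ g(1) = e \}$.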

Mountain pass type numerical algorithms

The importance of saddle points appearing in natural science applications has raised a high demand for non-standard numerical approaches for nonlinear variational problems. Although this endeavour turns out to be tremendously challenging in practice, it seems natural to apply the framework provided by the minimax theory, and, in particular, to exploit the analytical foundation of the mountain pass theorem for the purpose of designing numerical solution algorithms for (2). This route was first pursued by Choi and McKenna in their pathbreaking paper [9]. In conjunction with its theoretical counterpart, their scheme is widely known as the Mountain Pass Algorithm (MPA). The core of the method is an iterative steepest descent procedure which takes care of finding a minimum along a local mountain range of . This part of the algorithm is expressed in terms of a linear equation; in the context of nonlinear PDE, for example, this problem can be solved by traditional numerical discretization schemes such as the finite element method (FEM).

Although the work of Choi and McKenna can certainly be considered a milestone in the development of numerical solution schemes for nonlinear variational problems, many issues have remained open. For instance, their paper [9] does not contain any error or convergence analysis of the proposed MPA. Moreover, the MPA will typically find critical points of Morse index 0 or 1 only. Several subsequent papers have used the MPA approach in order to achieve further progress on the topic: In the article [12], for example, a numerical algorithm to compute sign-changing solutions of the semilinear elliptic PDE


subject to Dirichlet boundary conditions, was proposed. The main idea is to construct a local link from a known critical point to a new critical point. The resulting high-linking algorithm (HLA) presumes that a mountain pass solution is already available, and the MPA [9] is applied as part of the scheme. This work showed that the HLA is able to generate sign-changing solutions for non-symmetric domains and odd nonlinearities, which could not be found by the original MPA. Moreover, in the special case of symmetric domains, the paper [11] has focused on sign-changing solutions of (3). The approach is based on a modified Mountain Pass Algorithm (MMPA), which applies a restriction of the underlying functional to the fixed point set of certain compact topological groups representing the symmetry of the domain. Still, no convergence analysis was available at that point.

Local minimax approach

Further progress in the development of numerical schemes for nonlinear variational problems with multiple solutions was made by Li and Zhou in their paper [24]. Beginning with a set of already known critical points, the idea is to use solution-submanifolds of so-called peak selection mappings, whose local minima occur as new unstable critical points of the underlying functional. One of the crucial advantages of the local minimax (LMM) algorithm proposed by Li and Zhou is that the generated sequence of approximated solutions exhibits a decay of the associated energy functional. Moreover, the LMM algorithm may find critical points of Morse index greater than 1. In their subsequent paper [25], Li and Zhou have introduced a new step-size rule for the LMM procedure leading to the modified local minimax algorithm. Moreover, a first convergence analysis has been derived; remarkably, under certain assumptions, the LMM algorithm is able to generate approximation sequences which contain converging subsequences to a critical point of the energy functional. In addition, for initial guesses sufficiently close to an isolated critical point, the authors have proved that the iteration converges precisely to that point. This result was improved further by Zhou in [37, Thm. 2.4]; we note that the proof of that result can be adapted in such a way that it yields the convergence of the generated sequence to the set of critical points.

The original local minimax approach by Li and Zhou [24, 25] has been studied, modified, generalized, and applied in various subsequent papers by Zhou and other authors. We mention the work [34], where a modified LMM method to find multiple solutions of singularly perturbed semilinear PDE with Neumann boundary conditions has been presented. In that paper, the authors have proposed an ad hoc computational idea on how local mesh refinements in the framework of finite element Galerkin spaces, with a special emphasis on the resolution of spike layers, may be applied. Specifically, after a (fixed) number of steps in the LMM, mesh refinements are performed whenever the residual is not sufficiently small; the refinements, in turn, are performed simply by subdividing any elements where the numerical solution tends to form a spike. A further article [35] has paid special attention to the convergence analysis of the LMM scheme on a finite-dimensional Galerkin space. Within this setting, it has been proved that the generated discrete sequence possesses convergent subsequences. Furthermore, for finite element discretizations of semilinear elliptic equations, as the mesh size tends to zero, it has been shown that the subset of solutions which can be approximated by the LMM algorithm converges to a subset of solutions of the original problem. This does not, however, imply the convergence of the generated sequence to a solution of the problem. Finally, we point to the work [33], which offers a modification of the LMM method based on applying a projection onto subspaces with certain symmetry properties; this, in turn, allows one to find saddle points with corresponding symmetric features (including critical points of higher Morse index).


The aim of the current work is to provide a numerical approximation procedure for saddle points of a functional $E$, which is based on a simultaneous interplay of the LMM approach proposed in [24] and adaptive Galerkin space enrichments. This idea follows the recent developments on the (adaptive) iterative linearized Galerkin (ILG) methodology [21, 22, 10, 3, 4, 23], whereby adaptive discretizations and iterative linearization solvers are combined in an intertwined way; we also refer to the closely related works [18, 16, 15, 7, 19, 20].

A key building block of the numerical scheme to be presented in this paper concerns the decision of whether hierarchical Galerkin space enrichments or LMM iterations on the current discrete space should be given preference. This is accomplished by estimating the residual on a given Galerkin space in terms of a computable indicator. Once the residual is found to be sufficiently small, we conclude that any further LMM iteration will not significantly reduce the residual on the present discrete space. Consequently, by making use of local residual indicators, we will hierarchically enrich the Galerkin space. On the theoretical side, under certain assumptions, we prove that the sequence generated by the adaptive LMM Galerkin (LMMG) algorithm converges to the set of critical points of $E$.


Special attention will be given to (singularly perturbed) semilinear elliptic PDE in the context of standard finite element discretizations. We will apply the approach presented in [32], which has been developed for linear elliptic problems, in order to derive a posteriori residual bounds for the LMMG algorithm that are robust with respect to the singular perturbation parameter; see also [4] for related results in the context of Newton-type linearizations of semilinear singularly perturbed PDE. Our numerical tests display optimal convergence rates with respect to the number of elements in the mesh.

Outline of the paper

We will briefly recall the main concepts of the local minimax algorithm from Li and Zhou in §2, and present some summarized results from [24, 25, 35]. Furthermore, the focus of §3 is on the new (abstract) adaptive LMMG algorithm that exploits an interplay between the classical LMM method and adaptively enriched general Galerkin discretizations. In addition, under some reasonable assumptions, we will prove that the approximated discrete solution sequence generated by the proposed LMMG procedure converges to the set of solutions of the original problem. Then, in §4, we will show that our general theory applies to a class of singularly perturbed semilinear elliptic boundary value problems. A series of numerical experiments will be presented in §5.

2. The local minimax approach by Li and Zhou

In this section, we revisit the local minimax method introduced in [24, 25]. For the convenience of the reader, we will recall the relevant definitions, and point to some existing results. In the sequel, let $E \in C^1(X;\mathbb{R})$ be a given functional that satisfies the Palais-Smale compactness condition (PS).

2.1. Peak selection

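Let $L \subset X$ be any closed subspace, and denote by $L^\perp$ its orthogonal complement with respect to the $X$-inner product: $X = L \oplus L^\perp$. Then, for any $v$ in the sphere $S_{L^\perp} := \{ v \in L^\perp : \|v\|_X = 1 \}$, we define the closed half space

$$[L, v] := \{ t v + w : w \in L,\ t \ge 0 \}.$$

For given $v \in S_{L^\perp}$, a point $u \in [L,v]$ is called a local maximum of $E$ in $[L,v]$ if there exists $\delta > 0$ such that $E(w) \le E(u)$ for all $w \in [L,v]$ with $\|w - u\|_X < \delta$; the set of all local maxima of $E$ in $[L,v]$ is signified by $P(v)$.

A mapping $p: S_{L^\perp} \to X$ with $p(v) \in P(v)$ for all $v \in S_{L^\perp}$ is called a peak selection of $E$ (with respect to $L$). Furthermore, for a point $v \in S_{L^\perp}$, we say that $E$ has a local peak selection at $v$ (with respect to $L$) if there exists $\delta > 0$ and a mapping $p: \{ w \in S_{L^\perp} : \|w - v\|_X < \delta \} \to X$ with $p(w) \in P(w)$ for all $w$ in the domain of the local mapping $p$.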

The following observation about local peak selections is instrumental in view of an algorithmic development of the mountain pass theory.

Proposition 2.1 (Theorem 2.1 of [24]).

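Suppose that $E$ has a local peak selection $p$ at some point $v \in S_{L^\perp}$ (with respect to $L$). Furthermore, let the following conditions hold true:

  (a) $p$ is continuous at $v$;

  (b) it holds $\operatorname{dist}(p(v), L) \ge \alpha$ for some constant $\alpha > 0$;

  (c) $v$ is a local minimum point of $E \circ p$ on $S_{L^\perp}$.

Then, $p(v)$ is a critical point of $E$.

For a given peak selection $p$ of $E$ (with respect to $L$), we consider the image of $p$ given by

$$\mathcal{M} := \{ p(v) : v \in S_{L^\perp} \}.$$

Then, under the assumptions (a)–(c), the above result shows that a local minimizer of $E$ in $\mathcal{M}$ is a critical point of $E$ in $X$.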

2.2. Local minimax scheme

We will briefly outline the main ideas of the minimax algorithm from [25, §2]. To this end, for , consider previously known critical points ordered in such a way that


and define the linear subspace

We aim to find a new critical point by pursuing the following procedure:

  (i) For a given , we suppose that there is such that


    for some .

  (ii) Now we find a local minimum of in a vicinity of . This is accomplished by moving along the steepest descent direction, , of at , given by

    where signifies the dual product. Note that this is a linear problem. Then, for a suitable step size , we replace by a new direction


    where, for , we define


    Referring to [25, Lem. 1.2], under the assumptions (a) and (b) in the above Proposition 2.1, if is not a critical point of (i.e. ), then, for any with , there is such that it holds the descent property

    Hence, selecting the step size in (6) appropriately, and letting


    for some and , we obtain .

Upon repeating step (ii), we obtain a sequence such that is strictly monotonically decreasing with respect to . The following result, which summarizes [25, Thm. 3.1 and 3.2], attends to the convergence of the above process.

Proposition 2.2.

Let be a peak selection of with respect to , and suppose that satisfies the Palais-Smale condition (PS). If

  (a) is continuous,

  (b) for all , for some constant , and

  (c) ,

then has a converging subsequence. Moreover, any such convergent subsequence tends to a critical point of .

For the above iterative scheme, this theorem asserts that we can find large enough such that is sufficiently small. We then let , and restart the search for a new critical point.
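The iteration of §2.2 can be illustrated on a finite-dimensional toy problem. The following is a minimal sketch under stated assumptions, not the algorithm of [24, 25] itself: the energy on $\mathbb{R}^2$, the choice $L = \{0\}$, the closed-form peak selection, and the backtracking constant `0.25` are all chosen here for illustration, and the names `energy`, `peak_selection`, and `local_minimax` are hypothetical.

```python
import numpy as np

# Toy energy on X = R^2 (an assumption for illustration, not from the paper):
#   E(u) = 1/2 (u_1^2 + 2 u_2^2) - 1/4 (u_1^4 + u_2^4).
# It has a local minimum at 0 and Morse index 1 saddle points at (+-1, 0).
A = np.array([1.0, 2.0])

def energy(u):
    return 0.5 * np.sum(A * u**2) - 0.25 * np.sum(u**4)

def gradient(u):
    return A * u - u**3

def peak_selection(v):
    """With L = {0}: maximise t -> E(t v) over t > 0 (closed form for this E)."""
    t = np.sqrt(np.sum(A * v**2) / np.sum(v**4))
    return t * v

def local_minimax(v0, lam=1.0, tol=1e-6, max_iter=200):
    v = v0 / np.linalg.norm(v0)
    u = peak_selection(v)
    for _ in range(max_iter):
        d = -gradient(u)                  # steepest descent direction at p(v)
        if np.linalg.norm(d) < tol:
            break
        s = lam
        while s > 1e-12:                  # backtracking on the step size
            w = v + s * d
            v_new = w / np.linalg.norm(w)     # project back onto the unit sphere
            u_new = peak_selection(v_new)
            # assumed sufficient-decrease test for the energy at the peak
            if energy(u_new) <= energy(u) - 0.25 * s * (d @ d):
                v, u = v_new, u_new
                break
            s *= 0.5
        else:
            break                         # no admissible step size found
    return v, u

v, u = local_minimax(np.array([1.0, 1.0]))
```

Starting from the direction $(1,1)$, the energy at the peak decreases in every accepted step, and the iterates approach the saddle point $(1,0)$ with critical value $1/4$.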

Remark 2.3.

It is sensible to choose in (5) to be an ascent direction of at , i.e.


for any sufficiently small. Indeed, moving along an ascent direction works in favour of assumption (b) in Proposition 2.2 (see also Theorem 3.2 below).

3. An adaptive local minimax Galerkin method

The purpose of this section is to provide a practical scheme for the approximation of the nonlinear equation (1). To this end, we employ a sequence of finite-dimensional Galerkin subspaces , , with the hierarchical property . In order to deal with the nonlinearity of , based on a suitable a posteriori error analysis, we introduce an adaptive interplay between the minimax scheme from §2.2 and the Galerkin discretizations. We thereby obtain an iterative minimax Galerkin method.

3.1. Galerkin discretization

For , the Galerkin discretization of (2) is to find discrete approximations such that

where, for , we define the discrete residual by


Denoting by the sequence generated by the local minimax procedure from §2.2 on a given Galerkin space , it holds that


for ; see [35, Thm. 4.1]. Hence, if we consider a given positive function with

then, for any , there is such that for all .

3.2. Adaptive local minimax Galerkin procedure

The main idea of Algorithm 1 in the current work is to provide an adaptive interplay between the following two strategies:

  (I) Mountain pass approximation: On a given Galerkin space , for , we run the local minimax procedure from §2.2 on until the resulting approximations, , , are sufficiently close to a zero of . Let us denote by the final approximation on the present Galerkin space . From the point of view of computational complexity, we note that the core part of the minimax Galerkin discretization is the solution of the linear problem (10), together with the application of the peak selection in each iteration step.

  (II) Adaptive Galerkin discretization: Once the norm of the residual obtained from step (I), i.e. , is small enough, we enrich the Galerkin space appropriately. This is based on the assumption that we have at our disposal a computable error indicator such that


    for some constant (independent of ); here, the dual norm is defined by


    Furthermore, we suppose that consists of local error contributions which allow the Galerkin space to be refined effectively.

This iterative Galerkin scheme is outlined in Algorithm 1.

Remark 3.1.

We comment on two aspects of the discrete minimax scheme in step (I).

  1. The stopping criterion for the iteration is expressed in terms of the inequality


    where is a prescribed method parameter. Without loss of generality, for any , we may assume that there is indeed a finite such that (14) is satisfied. If not, then, due to (12) and (11), we conclude that

    for any . This, in turn, implies that the sequence of discrete approximations on converges to the set of critical points of , given by


    in the sense that ; this can be shown along the lines of Step 2 of the proof of Theorem 3.2 below.

  2. Following [25, p. 870], the step size for the update of the ascent direction from (7), applied on the Galerkin space , is defined by


    where is a step size control parameter, and is the minimal integer such that



    Here, and are the steepest descent direction and updated ascent direction from (18) and (19), respectively.

1:Input previously found critical points of as in (4).
2:Prescribe a steering parameter , a step size control , a tolerance .
3:Start with an initial Galerkin space , and set .
4:Let , with suitable approximations of in , respectively.
5:Choose to be an (ascent) direction at , cf. (9).
6:repeat
7:      Set .
8:      Let , where signifies a suitable approximation of in , for .
9:      Use a peak selection on , and determine .
10:     while  or  do
11:          Compute the steepest descent direction of at the point , i.e. solve the linear discrete problem (10) to define
12:          Set
13:          Compute from (16).
14:          Set , and determine , for unique and , cf. (8).
15:          Update .
16:     end while
17:      Enrich the Galerkin space appropriately using the local error indicator .
18:      Define and by inclusion .
19:      Update .
20:until .
Algorithm 1 Adaptive local minimax Galerkin (LMMG) algorithm
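The nesting of the two loops in Algorithm 1 can be sketched as follows. This is only a control-flow skeleton under stated assumptions: the inner loop is assumed to stop once the discrete residual is at most `gamma` times the error indicator, any further condition in Line 10 is omitted, and all callbacks (`lmm_step`, `residual_norm`, `indicator`, `enrich`) are hypothetical placeholders. The toy stand-ins at the end have no PDE meaning and merely make the skeleton executable.

```python
def adaptive_lmmg(space, u, lmm_step, residual_norm, indicator, enrich,
                  gamma=0.5, tol=1e-2, max_enrich=50, max_inner=1000):
    """Alternate LMM iterations on a fixed space with space enrichments."""
    history = []
    for _ in range(max_enrich):
        # (I) iterate on the current space until the residual is small
        #     relative to the error indicator (assumed stopping criterion)
        for _ in range(max_inner):
            if residual_norm(space, u) <= gamma * indicator(space, u):
                break
            u = lmm_step(space, u)
        eta = indicator(space, u)
        history.append((space, eta))
        if eta <= tol:          # outer stopping criterion
            break
        # (II) enrich the space and transfer the iterate by inclusion
        space, u = enrich(space, u)
    return space, u, history

# Toy stand-ins: the "space" is just its dimension N, the "iterate" a number.
space, u, history = adaptive_lmmg(
    space=4, u=1.0,
    lmm_step=lambda N, u: 0.5 * u,        # each step halves the residual
    residual_norm=lambda N, u: abs(u),
    indicator=lambda N, u: 1.0 / N,       # indicator decays with the dimension
    enrich=lambda N, u: (2 * N, u),       # doubling refinement
)
```

Passing the building blocks as callbacks keeps the skeleton independent of the concrete discretization, which mirrors the abstract formulation of this section.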
Theorem 3.2.

Let be the sequence of finite dimensional subspaces of generated by Algorithm 1, and , for , be the closed subspace from Line 8 in Algorithm 1. Suppose that the functional satisfies the Palais-Smale condition (PS), and, for each , the peak selection of with respect to fulfils the following properties:

  (a) is continuous,

  (b) for all , and

  (c) .

We further assume that


and that is bounded, where is the sequence obtained by Algorithm 1, i.e. there exists such that


Then, the sequence converges to the set of critical points of , cf. (15), i.e.


as .


Proof. Without loss of generality, due to the assumptions (a)–(c), we may suppose that, for all , the while loop in Algorithm 1 terminates after finitely many steps, cf. Remark 3.1; in particular, the sequence is well-defined. For any , let , where is again the final approximation on the Galerkin space . We proceed in two steps.

Step 1. We first show that in for . Assume to the contrary that there exists and a subsequence such that for all subindices . Then, by definition of the operator norm, for all , there is with and . Moreover, due to the approximation condition (20), for any , there exists an index such that

In particular, for large enough, we can find a discrete element such that

Invoking the triangle inequality, since , this leads to


From the definition of and the boundedness (21), we infer that for all . Moreover, by construction of the algorithm, we have

We specify , and select sufficiently large such that . For these choices, the right-hand side in (23) is bounded by , which yields the desired contradiction.

Step 2. We are now prepared to show that , cf. (22). As before, we apply a contradiction argument. To this end, suppose that there exists and a subsequence with for all . From the first step we know that in for , and is bounded by assumption (c) and because the sequence is decreasing, cf. (17). Then, since satisfies the Palais-Smale compactness condition (PS), this yields the existence of a convergent subsequence in . By continuity of , it holds that in , i.e.  is a critical point of . This contradicts the assumption on the sequence . ∎

Corollary 3.3.

Suppose that the assumptions of Theorem 3.2 hold. Then, the following statements hold true:

  (a) The sequence has a convergent subsequence, and any convergent subsequence tends to some critical point of in .

  (b) In particular, if there is a constant such that for all , and , , for , then any such limit point is a new critical point of , i.e. for all .


Proof. The convergence property (a) follows immediately from the proof of Theorem 3.2. For (b) we assume to the contrary that there is a subsequence limit with for some . Then, for large enough, it holds that

and . Hence, applying the triangle inequality leads to

Recalling that uniformly, this contradicts assumption (b) of Theorem 3.2. ∎

4. Application to semilinear elliptic PDE

We apply the adaptive LMMG Algorithm 1 in the context of finite element discretizations of semilinear elliptic Dirichlet boundary value problems of the form


here, we assume that is a bounded open domain with sufficiently smooth boundary , and is a singular perturbation parameter. Furthermore, satisfies the following condition: there are two constants and , which do not depend on , such that


cp. [32, §4.4, (A3)]. Moreover, the right-hand side function satisfies the following standard conditions (see, e.g., [30, p. 9]):

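  (f1) $f \in C(\overline{\Omega} \times \mathbb{R}; \mathbb{R})$;

  (f2) there are constants $a_1, a_2 \ge 0$ such that

    $$|f(x,t)| \le a_1 + a_2 |t|^s,$$

    where $0 \le s < (d+2)/(d-2)$ if $d > 2$, and

    $$|f(x,t)| \le a_1 \exp(\varphi(t)),$$

    where $\varphi(t)\, t^{-2} \to 0$ as $|t| \to \infty$, if $d = 2$.

In the one-dimensional case, $d = 1$, we note that (f2) can be dropped.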

4.1. Existence of weak solutions

Under the above conditions it can be shown that critical points of the functional , defined by




the anti-derivative of , are weak solutions of (24a)–(24b) in the standard Sobolev space ; see, e.g., [30, Prop. B.10.]. Moreover, the functional is well-defined, and an elementary calculation reveals that


For our purpose, we endow the space with the inner product defined by


and the induced norm


where is the constant from (25). We note that this norm is equivalent to the standard -norm (with equivalence constants depending on  and ); in particular the space equipped with the above inner product is a Hilbert space.

If we state some additional conditions on , then the functional from (26) satisfies the Palais-Smale compactness condition (PS) on . Specifically, we assume that

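  (f3) $f(x,t) = o(|t|)$ as $t \to 0$, and

  (f4) there are constants $\mu > 2$ and $r \ge 0$ such that

    $$0 < \mu F(x,t) \le t f(x,t) \qquad \text{for all } |t| \ge r,$$

    where $F(x,t) := \int_0^t f(x,s)\,\mathrm{d}s$ is the anti-derivative of $f$.

We remark that integrating (f4) yields the existence of constants $a_3, a_4 > 0$ such that

$$F(x,t) \ge a_3 |t|^{\mu} - a_4 \qquad \text{for all } x \in \overline{\Omega},\ t \in \mathbb{R},$$
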
cf. [30, Rem. 2.13]. If satisfies (f1)–(f4), then from (26) does indeed fulfil the PS-condition; we refer to [30, p. 11] for a detailed analysis. Moreover, invoking the mountain pass theorem, Theorem 1.1, these assumptions yield the existence of a nontrivial weak solution to (24a)–(24b); see [30, Thm. 2.15]. Furthermore, we may obtain a nontrivial classical solution, provided that (f1) is replaced by the following stronger condition:

  (f1') $f$ is locally Lipschitz continuous in $\overline{\Omega} \times \mathbb{R}$.

Within this setting, i.e. assuming (f1’) and (f2), it follows that any weak solution of (24a)–(24b) is in fact classical; see, e.g. [1]. Moreover, supposing that (f1’) and (f2)–(f4) hold true, then there exists a positive and negative classical solution; we refer to [30, Cor. 2.23].

4.2. Galerkin discretization

We consider a sequence of hierarchically enriched conforming finite-dimensional subspaces . In the specific context of the semilinear boundary value problem (24) the steepest descent direction from (18) with respect to the inner product defined in (29) satisfies the following linear Galerkin formulation: Given , find such that


for all . This is the key part in the LMMG Algorithm 1 (in addition to the optimization process associated to the peak selection). It underlines that the discrete solution of (24) splits into a sequence of linear discrete problems which is obtained iteratively on each of the Galerkin spaces . Incidentally, from (32) we immediately see that if and only if

i.e. if and only if is the Galerkin solution of (24) in .
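To make the structure of this linear problem concrete, the following sketch computes a discrete steepest descent direction for a one-dimensional model problem. Everything specific here is an assumption for illustration and is not taken from the paper: the model equation $-\varepsilon u'' + u = u^3$ on $(0,1)$ with homogeneous Dirichlet conditions, the inner product $\int_0^1 (\varepsilon u'v' + uv)\,\mathrm{d}x$, P1 elements on a uniform mesh, and a lumped quadrature for the nonlinear term. The function names are hypothetical.

```python
import numpy as np

def assemble(n, eps):
    """P1 Gram matrix of the assumed inner product on a uniform mesh of (0,1)
    with n interior nodes (homogeneous Dirichlet boundary conditions)."""
    h = 1.0 / (n + 1)
    I = np.eye(n)
    off = np.eye(n, k=1) + np.eye(n, k=-1)
    K = (2.0 * I - off) / h            # stiffness matrix
    M = h * (4.0 * I + off) / 6.0      # consistent mass matrix
    return eps * K + M, h

def descent_direction(u, eps):
    """Solve (d, v)_X = -<E'(u), v> for all hat functions v."""
    A, h = assemble(len(u), eps)
    # discrete residual E'(u); the cubic term uses lumped quadrature (assumption)
    residual = A @ u - h * u**3
    d = np.linalg.solve(A, -residual)
    return d, residual, A

n, eps = 63, 1e-2
x = np.linspace(0.0, 1.0, n + 2)[1:-1]     # interior nodes
u = np.sin(np.pi * x)
d, residual, A = descent_direction(u, eps)
```

By construction, the residual applied to `d` equals minus the squared discrete norm of `d`, so `d` is a descent direction whenever the residual does not vanish; for `u = 0` the residual and hence `d` are zero.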

4.3. Convergence of the adaptive LMMG algorithm in the context of semilinear elliptic PDE

We aim to solve the semilinear elliptic problem (24a)–(24b) by applying the adaptive LMMG Algorithm 1 to the functional from (26). In order for the assumptions of Theorem 3.2 to hold, we introduce two additional properties of :

  (f5) For any given , the function

    is strictly increasing in ,

  (f6) For any , the function is continuously differentiable with respect to .

We focus on the case . Then, and . Recall that the energy functional satisfies the Palais-Smale condition (PS) provided that features the properties (f1)–(f4). It thus remains to verify the conditions (a)–(c) in Theorem 3.2, and the boundedness of in , cf. (21). The following proposition is a summary of the results in [24, §4]; its proof is presented in Appendix A.

Proposition 4.1.

Let be a peak selection of from (26) with respect to . If satisfies (f1)–(f6), then

  1. the peak selection is uniquely defined and continuous,

  2. for some ,

  3. for some and for all ,

  4. for some and for all .

In particular, the assumptions of Proposition 2.2 are satisfied.

If, in addition to the assumptions in the above Proposition 4.1, we suppose that (20) holds true, then the prerequisites of Theorem 3.2 are satisfied as well. This yields the ensuing result.

Theorem 4.2.

Let be the energy functional from (26) with satisfying (f1)–(f6), and let denote the unique peak selection of with respect to . If the sequence of Galerkin spaces generated by the adaptive LMMG Algorithm 1 satisfies (20), then the resulting sequence converges to the set of critical points of in in the sense of (22).

Remark 4.3.

In the context of standard finite element methods, we emphasize that (20) is satisfied whenever the mesh size tends to zero.

Remark 4.4.

For the special case as in the above Theorem 4.2, the application of the peak selection, cf. Lines 9 and 14 of the LMMG Algorithm 1, amounts to a one-dimensional optimization problem. More precisely, we need to minimize the mapping on , for . Applying differentiation, this can be expressed in terms of a scalar nonlinear equation, viz.

cf. (35) in Appendix A. This yields a unique minimizer , and, thereby, the evaluation of the peak selection . Equivalently, in the singularly perturbed case , a numerically more stable approach is to first compute the unique minimizer of the scaled mapping on , and then to determine .
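The scalar equation for the scaling factor can be solved by any one-dimensional root finder. The sketch below uses bracketing and bisection under stated assumptions: the energy along a ray is taken to be $t \mapsto \tfrac{t^2}{2}\|v\|^2 - \int F(tv)\,\mathrm{d}x$, so that its stationary point satisfies $t\|v\|^2 = \int f(tv)\,v\,\mathrm{d}x$; the nonlinearity $f(u) = u^3$, the one-dimensional domain $(0,1)$, and the helper names are chosen for illustration only. The routine assumes $v \neq 0$ and a superlinear $f$, otherwise the bracketing loops need not terminate.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def peak_scaling(v, x, f, norm_sq, tol=1e-12):
    """Find t > 0 with t * ||v||^2 = int f(t v) v dx by bracketing and bisection."""
    phi = lambda t: t * norm_sq - trapezoid(f(t * v) * v, x)
    lo, hi = 1.0, 1.0
    while phi(lo) <= 0.0:      # shrink until the ray derivative is positive
        lo *= 0.5
    while phi(hi) >= 0.0:      # grow until the ray derivative is negative
        hi *= 2.0
    while hi - lo > tol * hi:  # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = 1e-2
x = np.linspace(0.0, 1.0, 2001)
v = np.sin(np.pi * x)
norm_sq = 0.5 * (eps * np.pi**2 + 1.0)     # exact value of int (eps v'^2 + v^2) dx
t_num = peak_scaling(v, x, lambda u: u**3, norm_sq)
t_exact = np.sqrt(norm_sq / 0.375)         # closed form, since int sin^4(pi x) dx = 3/8
```

For the cubic nonlinearity the equation has the closed-form solution $t = (\|v\|^2 / \int v^4\,\mathrm{d}x)^{1/2}$, which serves as a check of the root finder; for general $f$ only the numerical route is available.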

5. Numerical experiments

The aim of this section is to test our adaptive LMMG Algorithm 1 in the context of the singularly perturbed semilinear elliptic boundary value problem (24). This requires solving (32) on a suitable family of Galerkin spaces. In the sequel, standard low-order finite element discretizations will be applied.

5.1. Finite element discretization

We consider regular and shape-regular meshes that partition the domain