Gradient Flow Based Discretized Kohn-Sham Density Functional Theory

07/15/2019 ∙ by Xiaoying Dai, et al. ∙ Chinese Academy of Science 0

In this paper, we propose and analyze a gradient flow based Kohn-Sham density functional theory. First, we prove that the critical point of the gradient flow based model can be a local minimizer of the Kohn-Sham total energy. Then we apply a midpoint scheme to carry out the temporal discretization. It is shown that the critical point of the Kohn-Sham energy can be well-approximated by the scheme. In particular, based on the midpoint scheme, we design an orthogonality preserving iteration scheme to minimize the Kohn-Sham energy and show that the orthogonality preserving iteration scheme produces approximations that are orthogonal and convergent to a local minimizer under reasonable assumptions. Finally, we report numerical experiments that support our theory.



There are no comments yet.


page 31

page 32

page 34

page 35

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Kohn-Sham density functional theory (DFT) is the most widely used model in electronic structure calculations [19]. We see that to solve the Kohn-Sham equation, which is a nonlinear eigenvalue problem, some self-consistent field (SCF) iterations are demanded [7, 9, 22]. However, the convergence of SCF iterations is not guaranteed, especially for large scale systems with small band gaps, for which the performance of the SCF iterations is unpredictable [8, 32]. It has been shown by numerical experiments that the SCF iterations usually converge for systems with larger gap between the occupied orbitals and the remainder [31]. We understand that there are a number of works trying to illustrate this phenomenon and see that SCF iterations do converge if the gap is uniformly large enough locally or globally [4, 16, 17, 31].

In order to obtain approximations of the Kohn-Sham DFT that are convergent, in recent two decades, people pay much attention to study the direct energy minimization model. Instead of solving the Kohn-Sham equation, people minimize the Kohn-Sham total energy under an orthogonality constraint [8, 12, 16, 17, 24, 25, 26, 27, 28, 32, 33]. It is shown in [23] that a monotonic optimization approach may produce a locally convergent approximations. We observe that the iterations based on the optimization should be carefully carried out due to the orthogonality constraint, for which the existing methods are indeed either retraction (see, e.g., [8, 32]) or manifold path optimization approaches (see, e.g., [8, 28, 32]). We see that some backtracking should be applied in a monotonic optimization method, due to not only theory but also practice.

In this paper, we introduce and analyze a gradient flow based discretized Kohn-Sham DFT for electronic structure calculations. First, we prove that our gradient flow based discretized Kohn-Sham DFT preserves orthogonality and models the ground state well. We then propose a midpoint scheme to carry out the temporal discretization, which is of orthogonality preserving, too. We mention that our numerical scheme avoids a retraction process and does not need any backtracking. Based on the midpoint scheme, finally, we design and analyze an orthogonality preserving iteration scheme for solving the discretized Kohn-Sham energy. It is shown by theory and numerics that our orthogonality preserving iteration scheme is convergent provided some reasonable assumption.

For illustration, we provide Figure 1 to show the differences among the three approaches. In the midpoint scheme of the gradient flow based model (blue dashed line with square symbol endpoint), the auxiliary point of midpoint scheme of the dynamical system is inside the manifold. In the manifold path method (black solid line with circle symbol endpoint), the path is on the manifold and the energy is decreasing when the iteration is moving along the path. In the retraction method (red solid line with triangle symbol endpoint), the auxiliary point is in the tangent space and outside the manifold.

Gradient Flow
Based Model
Yes(retraction method),
no(manifold path method)
Outside manifold
(retraction method),
on manifold
(manifold path method)
Yes Yes May not
Eigenvalue gap
large enough
(depending on
Table 1: Comparison of three models for Kohn-Sham DFT
Figure 1: Comparison among gradient flow scheme, manifold path method and retraction method: blue dashed line with square symbol endpoint – midpoint scheme of gradient flow model, black solid line with circle symbol endpoint – manifold path method, red solid line with triangle symbol endpoint– retraction method. Diamond symbol – auxiliary point of each method.

We observe that there are some existing works on the gradient flow methods of eigenvalue problems. We refer to [6, 14, 29] and references cited therein for linear eigenvalue problems and [2]

for the ground state of Bose-Einstein condensate (which requires the smallest eigenvalue and its associated eigenfunction only). We point out that our gradient flow based model is different from the gradient flow model proposed in

[23] for the Kohn-Sham DFT, in which the numerical scheme is either the retraction approach or the manifold path approach.

We organize the rest of the paper as follows. In section 2, we introduce some necessary notation and the Kohn-Sham DFT models. Then we come up with our gradient flow based model and prove its local convergence and convergence rate of the asymptotic behaviours in section 3. In section 4, we propose a midpoint scheme to realize temporal discretization of the gradient flow based model and investigate the relevant properties including preserving orthogonality automatically, updating inside the manifold as well as the local convergence. Based on the midpoint scheme, in section 5, we design and analyze an orthogonality preserving iteration scheme for solving the discretized Kohn-Sham energy. In section 6, we provide numerical experiments that support our theory. Finally, we present some concluding remarks.

2 Preliminaries

In this section, we introduce some basic notation and the Kohn-Sham models.

2.1 Basic notation

We apply the standard -inner product , which is defined as


denote -norm by , and -norm by


We define -norm as

and use Sobolev space

Let    and    . Define product matrix

and inner product matrix

For , we set


We then introduce the Stiefel manifold defined as

For and any matrix , we denote

We see that


We define an equivalent relation “” on as

and get a Grassmann manifold, which is a quotient of

We introduce an equivalent class of by

an inner product as

together with an associated norm

on .

Give a finite-dimensional space spanned by . We denote . We see that for any , there exists such that


We define a closed -neighborhood of by


and for introduce a closed -neighborhood of on by


For simplicity, we use notation


where and denote operators on :

for any .




is skew-symmetric.

2.2 Kohn-Sham models

The energy based Kohn-Sham DFT model for a system of electron orbitals with external potential contributed by M nuclei of charges is the following constrained optimization problem on the Stiefel manifold


where is the Kohn-Sham energy


Here are Kohn-Sham orbitals,


is the associated electron density with being the occupation number of the -th orbital and . is the external potential generated by the nuclei: for full potential calculations,

and are the nuclei charge and position of the -th nuclei respectively; while for pseudo potential approximations, the formula for the energy is still (8) (see, e.g., [8]). The fourth term in (8) is the exchange-correlation energy, to which some approximations, such as LDA(Local Density Approximation), GGA(General Gradient Approximation) and so on[18, 20], should be applied. We assume that is bounded from below with orthogonality constraint of , which is of physics. For simplicity, we consider the case of .

We see that for any and all , there hold



Instead we consider an optimization problem on


and define level set

To introduce the gradient on , we suppose

and assume that the exchange-correlation energy is differentiable and the exchange-correlation potential

We may write the gradient of as

where is defined by




We see from [10] that the gradient on Grassmann manifold of at is


To propose a gradient flow based model preserving orthogonality, we need to extend the domain of from to . We then define extended gradient as follows


Note that (15) is consistent with (14) for since .

We see from [8, 10] that the tangent space on is


and the Hessian of on is


If , then we may view in the sense of isomorphism and


As a result, and we may write



3 Gradient flow based model

In this section, we propose and analyze a gradient flow based model.

3.1 The model

Different from the Kohn-Sham equation and the Kohn-Sham energy minimization model, we propose a gradient flow based model of Kohn-Sham DFT as follows:


where and . We see that (20) is different from the standard gradient flow model presented in [23], which applies the in (14) rather than (19). We point out that whether the solution of (14) keeps on the Stiefel manifold is unclear. However, we see from Proposition 3.1 that our new defined by (19) guarantees that the solution keeps on the Stiefel manifold. Namely, (20) is an orthogonality preserving model whenever the initial is orthogonal.

If and


We see that

which indicates


The solution of (20) satisfies . Moreover, there holds


A direct calculation shows that

which indicates

due to . Consequently, we see from Lemma 3.1 that


As a result,


3.2 Critical points

We denote Lagrange function of (11)


for and , then the corresponding first-order necessary condition is as follows


We call a critical point of (11) if

Obviously, for such a critical point, we have

which suggests

Thus we see that such a critical point may be a local minimizer.

As , we know that energy decreases monotonically, thus exists provided that is bounded from below. The following statement tells us the asymptotical behavior of the extended gradient flow(c.f. [13]). If is a solution of (20), then


We see from Proposition 3.1 that

Since is nonnegative function, we have

Suppose that the local minimizer is the unique critical point of (11) in . For a fixed constant , we define

Here and hereafter, we assume that as an operator from to , is continuous in .

If the initial value satisfies


We obtain from Theorem 3.2 that there exists a sequence so that and . The uniqueness of critical point in implies . Due to we have

where is the level set. Since set

is compact, there exist a subsequence and that . Since is continuous, then is also continuous, so . By the uniqueness of critical point in again, we get and

We claim that . Otherwise, there exists a subsequence that for some fixed , . Since is compact, there exist a subsequence and that . Therefore

and , which contradicts the assumption .

Clearly, there exists that



Indeed, we may have some convergence rate. If and


for some and , then there exists such that

for all .

We see that

which together with Lemma 3.1 leads to

Note that Theorem 3.2 implies that there exists such that


Hence, we obtain from (29) that


Using Grönwall’s inequality we arrive at


Therefore, for all , there hold




We understand that (29) has been already applied in [8, 23]. We observe that in (29) is related to the gap between the -th eigenvalue and the -th eigenvalue of the Kohn-Sham equation.

4 Temporal discretization

We may apply various temporal discretization approaches to solve (20). In this section, we propose and analyze a midpoint point scheme. Our analysis shows that the midpoint point scheme is quite efficient and recommended.

4.1 A midpoint scheme

Let be discrete points such that


and . Set


and consider a midpoint scheme as follows


where . Equivalently


Our midpoint scheme is an implicit method and we will propose and analyze a practical scheme to solve (38) in the next section.

First, we investigate the existence of the solution of (38) in a neighborhood of , which requires that is Lipschitz continuous locally

which is true for LDA when . However, it is still open whether [11]. There exist such and a unique function which satisfies


for some and . We define on by


Obviously, and exists. Since


we see from implicit function theory that there exists a unique function which satisfies for some . Thus we complete the proof.

Due to Lemma 4.1, we see that is the solution of (37). Then we arrive at the following Algorithm 1 and refer to Theorem 4.2 for the choice of .

1 Given , , initial data , calculate gradient , let ;
2 while <