Kohn-Sham density functional theory (DFT) is the most widely used model in electronic structure calculations . We see that to solve the Kohn-Sham equation, which is a nonlinear eigenvalue problem, some self-consistent field (SCF) iterations are demanded [7, 9, 22]. However, the convergence of SCF iterations is not guaranteed, especially for large scale systems with small band gaps, for which the performance of the SCF iterations is unpredictable [8, 32]. It has been shown by numerical experiments that the SCF iterations usually converge for systems with larger gap between the occupied orbitals and the remainder . We understand that there are a number of works trying to illustrate this phenomenon and see that SCF iterations do converge if the gap is uniformly large enough locally or globally [4, 16, 17, 31].
In order to obtain approximations of the Kohn-Sham DFT that are convergent, in recent two decades, people pay much attention to study the direct energy minimization model. Instead of solving the Kohn-Sham equation, people minimize the Kohn-Sham total energy under an orthogonality constraint [8, 12, 16, 17, 24, 25, 26, 27, 28, 32, 33]. It is shown in  that a monotonic optimization approach may produce a locally convergent approximations. We observe that the iterations based on the optimization should be carefully carried out due to the orthogonality constraint, for which the existing methods are indeed either retraction (see, e.g., [8, 32]) or manifold path optimization approaches (see, e.g., [8, 28, 32]). We see that some backtracking should be applied in a monotonic optimization method, due to not only theory but also practice.
In this paper, we introduce and analyze a gradient flow based discretized Kohn-Sham DFT for electronic structure calculations. First, we prove that our gradient flow based discretized Kohn-Sham DFT preserves orthogonality and models the ground state well. We then propose a midpoint scheme to carry out the temporal discretization, which is of orthogonality preserving, too. We mention that our numerical scheme avoids a retraction process and does not need any backtracking. Based on the midpoint scheme, finally, we design and analyze an orthogonality preserving iteration scheme for solving the discretized Kohn-Sham energy. It is shown by theory and numerics that our orthogonality preserving iteration scheme is convergent provided some reasonable assumption.
For illustration, we provide Figure 1 to show the differences among the three approaches. In the midpoint scheme of the gradient flow based model (blue dashed line with square symbol endpoint), the auxiliary point of midpoint scheme of the dynamical system is inside the manifold. In the manifold path method (black solid line with circle symbol endpoint), the path is on the manifold and the energy is decreasing when the iteration is moving along the path. In the retraction method (red solid line with triangle symbol endpoint), the auxiliary point is in the tangent space and outside the manifold.
for the ground state of Bose-Einstein condensate (which requires the smallest eigenvalue and its associated eigenfunction only). We point out that our gradient flow based model is different from the gradient flow model proposed in for the Kohn-Sham DFT, in which the numerical scheme is either the retraction approach or the manifold path approach.
We organize the rest of the paper as follows. In section 2, we introduce some necessary notation and the Kohn-Sham DFT models. Then we come up with our gradient flow based model and prove its local convergence and convergence rate of the asymptotic behaviours in section 3. In section 4, we propose a midpoint scheme to realize temporal discretization of the gradient flow based model and investigate the relevant properties including preserving orthogonality automatically, updating inside the manifold as well as the local convergence. Based on the midpoint scheme, in section 5, we design and analyze an orthogonality preserving iteration scheme for solving the discretized Kohn-Sham energy. In section 6, we provide numerical experiments that support our theory. Finally, we present some concluding remarks.
In this section, we introduce some basic notation and the Kohn-Sham models.
2.1 Basic notation
We apply the standard -inner product , which is defined as
denote -norm by , and -norm by
We define -norm as
and use Sobolev space
Let and . Define product matrix
and inner product matrix
For , we set
We then introduce the Stiefel manifold defined as
For and any matrix , we denote
We see that
We define an equivalent relation “” on as
and get a Grassmann manifold, which is a quotient of
We introduce an equivalent class of by
an inner product as
together with an associated norm
Give a finite-dimensional space spanned by . We denote . We see that for any , there exists such that
We define a closed -neighborhood of by
and for introduce a closed -neighborhood of on by
For simplicity, we use notation
where and denote operators on :
for any .
2.2 Kohn-Sham models
The energy based Kohn-Sham DFT model for a system of electron orbitals with external potential contributed by M nuclei of charges is the following constrained optimization problem on the Stiefel manifold
where is the Kohn-Sham energy
Here are Kohn-Sham orbitals,
is the associated electron density with being the occupation number of the -th orbital and . is the external potential generated by the nuclei: for full potential calculations,
and are the nuclei charge and position of the -th nuclei respectively; while for pseudo potential approximations, the formula for the energy is still (8) (see, e.g., ). The fourth term in (8) is the exchange-correlation energy, to which some approximations, such as LDA(Local Density Approximation), GGA(General Gradient Approximation) and so on[18, 20], should be applied. We assume that is bounded from below with orthogonality constraint of , which is of physics. For simplicity, we consider the case of .
We see that for any and all , there hold
Instead we consider an optimization problem on
and define level set
To introduce the gradient on , we suppose
and assume that the exchange-correlation energy is differentiable and the exchange-correlation potential
We may write the gradient of as
where is defined by
We see from  that the gradient on Grassmann manifold of at is
To propose a gradient flow based model preserving orthogonality, we need to extend the domain of from to . We then define extended gradient as follows
If , then we may view in the sense of isomorphism and
As a result, and we may write
3 Gradient flow based model
In this section, we propose and analyze a gradient flow based model.
3.1 The model
Different from the Kohn-Sham equation and the Kohn-Sham energy minimization model, we propose a gradient flow based model of Kohn-Sham DFT as follows:
where and . We see that (20) is different from the standard gradient flow model presented in , which applies the in (14) rather than (19). We point out that whether the solution of (14) keeps on the Stiefel manifold is unclear. However, we see from Proposition 3.1 that our new defined by (19) guarantees that the solution keeps on the Stiefel manifold. Namely, (20) is an orthogonality preserving model whenever the initial is orthogonal.
We see that
The solution of (20) satisfies . Moreover, there holds
A direct calculation shows that
due to . Consequently, we see from Lemma 3.1 that
As a result,
3.2 Critical points
We denote Lagrange function of (11)
for and , then the corresponding first-order necessary condition is as follows
We call a critical point of (11) if
Obviously, for such a critical point, we have
Thus we see that such a critical point may be a local minimizer.
As , we know that energy decreases monotonically, thus exists provided that is bounded from below. The following statement tells us the asymptotical behavior of the extended gradient flow(c.f. ). If is a solution of (20), then
We see from Proposition 3.1 that
Since is nonnegative function, we have
Suppose that the local minimizer is the unique critical point of (11) in . For a fixed constant , we define
Here and hereafter, we assume that as an operator from to , is continuous in .
If the initial value satisfies
We obtain from Theorem 3.2 that there exists a sequence so that and . The uniqueness of critical point in implies . Due to we have
where is the level set. Since set
is compact, there exist a subsequence and that . Since is continuous, then is also continuous, so . By the uniqueness of critical point in again, we get and
We claim that . Otherwise, there exists a subsequence that for some fixed , . Since is compact, there exist a subsequence and that . Therefore
and , which contradicts the assumption .
Clearly, there exists that
Indeed, we may have some convergence rate. If and
for some and , then there exists such that
for all .
We see that
which together with Lemma 3.1 leads to
Note that Theorem 3.2 implies that there exists such that
Hence, we obtain from (29) that
Using Grönwall’s inequality we arrive at
Therefore, for all , there hold
4 Temporal discretization
We may apply various temporal discretization approaches to solve (20). In this section, we propose and analyze a midpoint point scheme. Our analysis shows that the midpoint point scheme is quite efficient and recommended.
4.1 A midpoint scheme
Let be discrete points such that
and . Set
and consider a midpoint scheme as follows
where . Equivalently
Our midpoint scheme is an implicit method and we will propose and analyze a practical scheme to solve (38) in the next section.
First, we investigate the existence of the solution of (38) in a neighborhood of , which requires that is Lipschitz continuous locally
which is true for LDA when . However, it is still open whether . There exist such and a unique function which satisfies
for some and . We define on by
Obviously, and exists. Since
we see from implicit function theory that there exists a unique function which satisfies for some . Thus we complete the proof.