# A posteriori error estimation for the non-self-consistent Kohn-Sham equations

We address the problem of bounding rigorously the errors in the numerical solution of the Kohn-Sham equations due to (i) the finiteness of the basis set, (ii) the convergence thresholds in iterative procedures, (iii) the propagation of rounding errors in floating-point arithmetic. In this contribution, we compute fully-guaranteed bounds on the solution of the non-self-consistent equations in the pseudopotential approximation in a plane-wave basis set. We demonstrate our methodology by providing band structure diagrams of silicon annotated with error bars indicating the combined error.

## Authors

• 1 publication
• 2 publications
• 3 publications
• ### Correct Approximation of IEEE 754 Floating-Point Arithmetic for Program Verification

Verification of programs using floating-point arithmetic is challenging ...
03/11/2019 ∙ by Roberto Bagnara, et al. ∙ 0

• ### Exact arithmetic as a tool for convergence assessment of the IRM-CG method

Using exact computer arithmetic, it is possible to determine the (exact)...
06/14/2019 ∙ by J. Dvornik, et al. ∙ 0

• ### On the relation of truncation and approximation errors for the set of solutions obtained by different numerical methods

The truncation and approximation errors for the set of numerical solutio...
05/12/2020 ∙ by A. K. Alekseev, et al. ∙ 0

• ### Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations

Although double-precision floating-point arithmetic currently dominates ...
04/25/2019 ∙ by Michael Hopkins, et al. ∙ 0

• ### Numerical verification of the microscopic time reversibility of Newton's equations of motion: Fighting exponential divergence

Numerical solutions to Newton's equations of motion for chaotic self gra...
02/03/2018 ∙ by Simon Portegies Zwart, et al. ∙ 0

• ### Sound Approximation of Programs with Elementary Functions

Elementary function calls are a common feature in numerical programs. Wh...
11/26/2018 ∙ by Eva Darulova, et al. ∙ 0

• ### FPDetect: Efficient Reasoning About Stencil Programs Using Selective Direct Evaluation

We present FPDetect, a low overhead approach for detecting logical error...
04/09/2020 ∙ by Arnab Das, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## I Introduction

Experimental results are provided with error bars in most scientific fields. In an ideal world, this should also be the case for results obtained by numerical simulation. Complementing simulation results with error bars is becoming mandatory in some branches of engineering, such as aeronautics or car industry, in which simulation has partially, or even totally, replaced experiment (e.g. virtual wind tunnels Mo et al. (2013) or crash simulators Spethmann et al. (2009)). Obviously, uncontrolled errors in the numerical simulations used to design and test an aircraft are likely to have dramatic consequences for passengers and crews.

error bars are often displayed in molecular dynamics (MD) and quantum Monte Carlo (QMC), where stochastic processes (e.g. Langevin equation for MD, drift-diffusion stochastic differential equations for QMC) are at the core of the simulation. However, these error bars are purely statistical in nature and are not guaranteed: they only reflect the finiteness of the statistical sample and the actual solution of the model has a positive probability to lay outside the confidence interval. In addition, they only take into account one of the components of the error between the exact value of the quantity of interest (q.o.i.) and its numerical approximation. The other components of the error are deterministic in nature and are also present in simulations of purely deterministic models, such as those based on density-functional theory (DFT) which are dealt with in the article.

Consider as an example of q.o.i. the lattice constant of Silicon at 0 K. In order to compute a numerical approximation of this q.o.i., we first have to select a model. We have at our disposal an outstanding model: the many-body Schrödinger equation with relativistic corrections. However, solving this equation directly is completely out of reach. The first approximation is to replace this reference model with a cruder, but tractable, reduced model. We consider as a reduced model the one obtained by successively using the Born-Oppenheimer approximation to decouple as much as possible nuclear and electronic degrees of freedom, and the Local Density Approximation of the Kohn-Sham density-functional theory (KS-LDA), together with a pseudopotential model to avoid representing core electrons and the singularities of the orbitals near atoms. This gives rise to a well-defined mathematical model, hopefully with a unique solution. We refer to the difference between this approximation and the physical reality as the

model error.

The reduced model, though simpler than the reference model, still has an infinite number of degrees of freedom, and must be discretized to be simulated. For the crystalline phase, a typical (simplified) workflow is as follows Martin and Martin (2004):

1. The infinite computational domain is truncated to a finite supercell with periodic boundary conditions, numerically handled through Brillouin zone sampling;

2. In this supercell, crystalline orbitals are discretized using a finite basis set of plane waves;

3. The self-consistent Kohn-Sham equations are solved by a self-consistent field (SCF) algorithm;

4. At each step of the SCF algorithm, a linear eigenvalue equation is solved with an iterative eigensolver;

5. All computations are performed with finite precision.

We refer to the error in steps 1 and 2 as the discretization error, to the error in steps 3 and 4 as the algorithmic error, and to the error in step 5 as the arithmetic error. To these must be added programming errors, not to be neglected in codebases consisting of millions of lines of code, and hardware errors, which are expected to become significant for exascale architectures Snir et al. (2014). These two latter kinds of errors we will not treat.

Note that all steps mentioned above are systematically improvable: by increasing parameters, such as Brillouin zone sampling, basis set cutoff, convergence thresholds, or using higher-precision arithmetic, results get more accurate at the price of longer computation times. The usual procedure for controlling these errors is to perform convergence studies: on the system of interest or a related system with similar characteristics, the parameter is increased until the variation of the quantity of interest is below a specified tolerance. Over time, knowledge of “good” parameter values solidifies into rules of thumb that are automated in codes or suggested to users in manuals. When used appropriately, this results in acceptable errors that are below the target accuracy (for instance, for pseudopotential methods, the error compared to all-electron models) Lejaeghere et al. (2014, 2016).

This empirical process is however still problematic. First, it might result in suboptimal performance by excess of caution. Second, it still requires a degree of hand-tuning and is thus problematic for fully automated computations, which are useful for in silico design of novel compounds and for building databases of material or chemical properties. Third, the rules of thumb can fail for unusual systems with unexpected behavior, i.e. exactly those where accurate simulations are important to aid understanding.

The purpose of a posteriori error analysis is to automate and rationalize this process by providing accurate, computable and guaranteed bounds on the error of each step. The purpose of this is twofold. First, bounds of the total error on the q.o.i. can be obtained by simply summing up the different components of the error, allowing one to bound the accuracy of the final result. Second, the computer resources necessary to reach a given accuracy on the final result can be optimized by error balancing techniques. For instance, convergence thresholds should be adapted to the discretization: if the discretization is coarse, it is not necessary to use extremely tight convergence criteria, since the final accuracy will be limited by the discretization error anyway. For the same reason, it might suffice to perform most operations in single-precision arithmetic to save computational time without losing much on the accuracy of the final result. Error balancing based on a posteriori error bounds allows one to turn these common sense remarks into black-box numerical strategies. The user provides the target accuracy and the software automatically chooses, in an adaptive way along the iterative process, the reduced models, discretization bases, convergence thresholds, and data structures to obtain the desired accuracy at a quasi-optimal computational cost.

Applying this methodology to electronic structure calculations is a challenge due to the complexity of the equations. Despite this, significant progress has been made in the past decade towards rigorous error control, which makes this perspective realistic in the medium term. Let us mention in particular recent works on a priori and a posteriori discretization error bounds for DFT Cancès et al. (2012); Chen et al. (2013); Chen and Schneider (2015a, b); Kaye et al. (2015); Cancès et al. (2016, ressa), including -point sampling Cancès et al. (ressb), and on the numerical analysis of SCF algorithms Chupin et al. (2020); Dai et al. (2017); Rohwedder and Schneider (2011); Zhao et al. (2015).

In this contribution, we provide for the first time a combined analysis of the errors in step 2 (basis set truncation), 4 (inexact solution of the eigenvalue problem) and 5 (arithmetic error). The q.o.i. is the band diagram of silicon, or more precisely the energies of the 3 highest occupied and 4 lowest unoccupied crystalline orbitals for specific -points. The models under consideration are the Cohen-Bergstresser model Cohen and Bergstresser (1966) on the one-hand, and the non-self-consistent (electron-electron interaction completely neglected) periodic KS-LDA model with GTH pseudopotentials Goedecker et al. (1996); Hartwigsen et al. (1998) on the other hand, for which we will give fully guaranteed error bounds.

Our analysis is general and can be extended to other quantities of interest and models. It is, however, limited in its present state due to the neglect of the errors in steps 1 (Brillouin zone sampling) and 3 (self-consistency). We hope to address both these challenging aspects in future publications.

Let us finally point out that the model error arising from the choice of density-functional theory is not easily systematically improvable. For wavefunctions methods (e.g. coupled-cluster methods Rohwedder and Schneider (2013)) guaranteed a posteriori

error bounds on the model error — with respect to the many-body Schrödinger equation (MBSE) — can in principle be derived, since a residual of the MBSE can be computed from the approximate wavefunction and energy. For DFT, on the other hand, this is not easily the case such that guaranteed model errors are probably out of reach and going beyond machine-learned confidence intervals obtained from big data sets of reference calculations seems very difficult. Regarding pseudopotentials and related approaches, the Projector Augmented Wave

Blöchl (1994) (PAW) method should be amenable to the derivation of guaranteed error bounds - with respect to reference full-electron DFT calculations - since approximate all-electron Kohn-Sham orbitals can be reconstructed from PAW pseudo-orbitals allowing the construction of a residual.

## Ii Kohn-Sham density-functional theory in the pseudopotential approximation

As already mentioned in the introduction, the q.o.i. under consideration are the energies of crystalline orbitals at specific -points, in a non-self-consistent pseudopotential model. In the following we denote by the periodic lattice, and assume that the Bloch wavevector is fixed in the Brillouin zone. The equation we solve is

 Hu=εu,∫Ω|u(r)|2dr=1, (1)

with an -periodic function and the unit cell. The periodic Hamiltonian is given by

 H=12(−i∇+k)2+V,

where the (possibly nonlocal) effective potential is assumed to be known.

The problem (1) is discretized in a Fourier basis: any -periodic function can be expanded on the orthonormal plane-wave basis

 eG(r)=1√|Ω|eiG⋅r

where is the volume of the unit cell, and belongs to the reciprocal lattice . The complete set is truncated to obtain a finite approximation space

 X=Span{eG,12|G+k|2≤Ecut}

with dimension , for some finite cutoff energy . We will use the convenient abuse of notation consisting in writing to denote a such that . The linear eigenvalue equation (1) is then discretized by invoking the variational principle: the complex Hermitian matrix

 ⟨eG|H|eG′⟩=12|G+k|2δGG′+⟨eG|V|eG′⟩,G,G′∈X,

is diagonalized using an iterative eigensolver Saad (2011); Kresse and Furthmüller (1996)

, employing fast Fourier transform to perform matrix-vector products efficiently.

The potential can be a purely local potential, or have a nonlocal component in the case of norm-conserving pseudopotentials. We will consider two cases here, chosen for their simple analytic forms.

First we consider the Cohen-Bergstresser pseudopotentials for semiconductors in the diamond and zinc-blende structure Cohen and Bergstresser (1966). These are extremely purely local -periodic potentials with a small number of non-zero Fourier coefficients. More precisely

 ⟨eG|V|eG′⟩=1|Ω|ˆvCB(G−G′)(Cohen-Bergstresser), (2)

where is only nonzero for in a small finite set (the first five shells of reciprocal vectors). The coefficients were adjusted to reproduce spectroscopic data.

We next consider more realistic Goedecker–Teter–Hutter (GTH) norm-conserving potentials Goedecker et al. (1996); Hartwigsen et al. (1998). These potentials are composed of a local and a non-local part:

 V=Vloc+Vnl(Goedecker--Teter--Hutter,% GTH),

where

 ⟨eG|Vloc|eG′⟩ =1|Ω|ˆvloc(G−G′), ⟨eG|Vnl|eG′⟩ =∑a∑lm∑ijdalmijpalmi(k+G)¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯palmj(k+G′). (3)

Here runs over all the atoms in the unit cell, the range of angular momentum depends on the chemical element considered, and the projection indices are summed over a small number of integers (two in the case of silicon). The coefficients are finite sums of products of three terms: a structure phase factor depending on the position of the atoms in the unit cell, a radial Gaussian envelope and a radial rational function. The are the product of a structure phase factor, a spherical harmonics , a radial Gaussian envelope and a radial polynomial of degree . Among the many types of pseudopotentials available, we chose this type because their analytic form (rather than tabulated data) lends itself well to rigorous error analysis.

All computations presented in this paper have been performed using the density-functional toolkit (DFTK) Herbst and Levitt , a recent Julia Bezanson et al. (2017) implementation for Kohn-Sham DFT and related methods using plane-wave basis sets. Our implementation of the discussed estimates and the code producing the figures of this paper is available on Github Herbst et al. .

## Iii From residuals to errors

Assume for simplicity that is an isolated eigenvalue of . Given a finite , the result of the iterative procedure is a normalized vector and a Riesz approximation of the eigenvalue , such that the algebraic residual , where is the orthogonal projector on , is small. Note that the algebraic residual vanishes for an exact matrix eigensolver, in contrast with the global residual , which is always nonzero due to the finite basis discretization. The question we are interested in is: can we bound rigorously the error between and ?

Several a posteriori error bounds for elliptic eigenvalue problems are available in the literature Goerisch and He (1990); Babuška and Osborn (1991); Heuveline and Rannacher (2001); Mehrmann and Miedlar (2011); Kuznetsov and Repin (2013); Bank et al. (2013); Carstensen and Gedicke (2014); Liu (2015); Cancès et al. (2018), the most accurate of them requiring lower bounds on the distance between the eigenvalue or cluster of eigenvaluesGallistl (2014, 2015); Dai et al. (2015); Bonito and Demlow (2016); Boffi et al. (2017); Cancès et al. (ressa) of interest and the rest of the spectrum. We focus here on two basic bounds for the sake of simplicity: the implementation of more accurate ones is work in progress.

###### Theorem 1 (Bauer-Fike).

There exists an eigenvalue of such that

 |˜ε−ε|≤∥˜r∥.

This bound is very easy to handle: it only requires an upper bound on the -norm of the residual. It is however not sharp. In particular, it is well-known that the error on isolated eigenvalues of hermitian matrices behaves as the square of the residual. An a posteriori error bound having this property is the following.

###### Theorem 2 (Kato-Temple).

Let be the eigenvalue of closest to . Assume that is an isolated point of the spectrum of with a distance to the rest of the spectrum. Then

 |˜ε−ε|≤∥˜r∥2δ. (4)

The proof of both these theorems can be found in standard textbooks Saad (2011).

To apply these bounds, we need three ingredients: (i) construct an upper bound on the residual , (ii) construct a lower bound on the gap , (iii) implement these bounds in the presence of roundoff errors. We address these problems in sequence.

## Iv Computing the residual

Assuming that the variational numerical method provides an approximate normalized eigenfunction

 ˜u=∑G∈X˜u(G)eG∈X,

with

 ˜u(G):=⟨eG|˜u⟩and∥˜u∥2=∑G∈X|˜u(G)|2=1,

and a Riesz approximation of the associated eigenvalue, then the square norm of the residual

 ˜r=H˜u−˜ε˜u

can be decomposed as

 ∥˜r∥2 =∥PX˜r∥2+∥PX⊥˜r∥2 (5) =∥PX(H˜u−˜ε˜u)∥2+∥PX⊥V˜u∥2

where and are the orthogonal projectors on and respectively. Here we have used that the kinetic energy operator is diagonal in reciprocal space, so that . The first term, which is easily computed in the Fourier basis of , is the square of the norm of the in-space, algebraic residual . It is driven to zero by the iterative eigensolver but does not vanish because the iterations stop when the convergence thresholds are reached. This is the origin of the algorithmic error, and is easily computed explicitly. The second term is the out-of-space residual and is the source of the discretization error. This term is more difficult to compute, and most often in practice only an upper bound can be obtained at a reasonable computational cost.

### iv.1 Cohen-Bergstresser model

In this model, is given by (2), and only a small number of terms are non-zero. In this case

 ∥PX⊥V˜u∥2=1|Ω|2∑G∈X⊥∣∣ ∣∣∑G′∈XˆvCB(G−G′)˜u(G′)∣∣ ∣∣2.

This computation extends over a finite range of . Denoting by the norm of the largest non-zero Fourier mode of , these can be identified to be of the form for and . It follows that belongs to the finite-dimensional space

 Y=Span{eG,G∈R∗,12|k+G|2≤E(2)cut} (6)

with . We therefore extend to this new basis

in this new basis, resulting in an exact computation of the residual. The very quick decay of the residual with increasing is shown in Figure 1 for the first eigenvalue at the point.

### iv.2 Goedecker–Teter–Hutter (GTH) pseudopotentials

With the GTH pseudopotentials, extends on all wave vectors in , and therefore we cannot compute explicitly. Rather, as before, we compute , where the larger approximation subspace , still defined by (6) is determined by a chosen . We then bound the remaining term

 ∥PY⊥V˜u∥2=∑G∈Y⊥∣∣ ∣∣∑G′∈X⟨eG|V|eG′⟩˜u(G′)∣∣ ∣∣2.

We do this by exploiting the fact that the matrix elements connecting small wave vectors in to large wave vectors in are small. We obtain an explicit bound using our knowledge of and the decay properties of Gaussians, see Appendix A for details.

Our rigorous bound of the residual is plotted Figure 2 along with its different components as a function of , for an initial obtained with .

As can be seen clearly, the residual is essentially in as soon as , our estimate, however, is only accurate for larger values of . This is since our bounds for are still rather crude (see Appendix A). Improving these is work in progress.

In theory, one could choose

dynamically based on the size of the estimated residual. In the following, we use the simple heuristic

, deduced from and Figure 2. As can be seen in Figure 3, represented the worst case for our heuristic. With this choice of the component on of the residual is negligible for all values of and our bound is nearly optimal.

## V Estimating the gap

The Kato-Temple bound (4) requires a lower bound on the gap between the (unknown) exact eigenvalue and the rest of the (unknown) spectrum of . Let be the index of the target eigenvalue. Given an operator with eigenvalues , how can we obtain a lower bound on both in the one direction and on in the other? In the rest of this section we focus on a lower bound for , the other one being obtained similarly.

Assume that we have computed the discretization of the Hamiltonian on a plane-wave basis set with finite cutoff energy , and let

be its eigenvalues with corresponding orthonormal eigenvectors

. Note that for the purposes of this section we assume that all computations inside the basis set are exact: we form explicitly the matrix representation of and diagonalize it using a dense eigensolver, and do not consider any arithmetic error.

From the variational principle, . We now need to obtain a rigorous lower bound on , which is is much more complex. Simply taking the difference may lead to an overestimation of the gap, see for example in Figure 4. To compute a proper lower bound on we express the operator on and its orthogonal complement as

 H=(HXXVXX⊥VX⊥XHX⊥X⊥).

In this we have used that the kinetic energy term is diagonal in a plane-wave basis and does not appear in the off-diagonal blocks. We then use the Haynsworth inertia additivity formula Haynsworth (1968): for any not in the spectrum of , the number of eigenvalues of below some threshold is equal to the number of negative eigenvalues of the block plus the number of negative eigenvalues of the Schur complement

 Sμ=(HX⊥X⊥−μ)−VX⊥X(HXX−μ)−1VXX⊥.

If we manage to prove that is positive for some , we therefore obtain that .

Assume that we have computed eigenvectors up until index . Then the Schur complement can be bounded from below by

 Sμ ≥Ecut−∥VX⊥X⊥∥op−μ (7) −∥∥ ∥∥M−1∑n=N+1(VX⊥X˜un,X)(VX⊥X˜un,X)†˜εn,X−μ∥∥ ∥∥op=Bμ−∥VXX⊥∥2op˜εM,X−μ,

where is the operator norm on the space of bounded linear operators on . Computing the term requires knowing the potential on all of the complement , which is not feasible for GTH pseudopotentials. Similarly as with the residuals, we assume that we are able to compute on a superset and additionally able to bound it on . With this in place we split into contributions inside and outside of :

 Bμ≥ −∥∥(VX⊥∩Y,X˜U)(˜Λ−μ)−1(VX⊥∩Y,X˜U)†∥∥op (8) −2∥∥(VX⊥∩Y,X˜U)(˜Λ−μ)−1∥∥op∥∥VX,Y⊥∥∥op% −∥∥VX,Y⊥∥∥2op˜εN,X−μ,

where is the diagonal matrix of eigenvalues and

the orthogonal matrix of corresponding eigenvectors (column-wise).

To compute the first term we use the fact that is a computable, long-and-thin matrix (more rows than columns). We can employ the decomposition to factorize it into the product of an orthogonal matrix and a (small) triangular matrix . The first term in the right-hand side of (8) can then be explicitly computed as the largest eigenvalue of the (small) Hermitian matrix . The second term can be treated similarly. Details on our bounds on the operator norms , and for the Cohen-Bergstresser and the GTH pseudopotential models are given in Appendix A.

For a fixed , is positive for all large enough. For a given and , we find the best lower bound to , denoted by , as the maximum , for which we can ensure that is positive. This is done using a bisection algorithm on our bound of .

Figures 4 and 5 show gaps obtained using this lower bound on for different values of and for the first eigenvalue at the point of the two pseudopotential models we consider. For the simple Cohen-Bergstresser model and using our lower bound for the gap is accurate already at . For the GTH pseudopotentials one needs to use larger values of and .

## Vi Estimating the arithmetic error

To rigorously bound the arithmetic error, we use interval arithmetic, as specified in the IEEE standard 1788-2015 IEE (2018). The main idea of interval arithmetic is to use not one but two floating-point numbers to represent a given quantity, forming an interval which contains the exact number. Every operation is then performed on the limits of the interval utilizing rounding modes of CPUs to ensure that the upper limit is only rounded upwards and the lower limit only rounded downwards if rounding is needed to represent the outcome of the operation. This ensures that the exact answer always lies between the limits of the final interval. This simplistic approach generally overestimates floating-point error, since it considers all floating-point operations independent from each other. The bound obtained in this way is thus too large making an application of interval arithmetic to, e.g. a complete iterative diagonalization algorithm, impractical. Fortunately if we obtain an approximate eigenpair inside a discretization basis using ordinary floating-point arithmetic, we only need to re-evaluate the residual norm in interval arithmetic to obtain a rigorous bound. The upper bound of the resulting interval gives access to the sum of both floating-point error and algorithmic error due to the eigensolver.

In Julia IEEE interval arithmetic is available in the IntervalArithmetic.jl Sanders et al. (2020) package as a custom floating-point type. This type can be directly used with any native Julia code, including GenericLinearAlgebra.jl Noack et al. and FourierTransforms.jl Johnson et al. , which provide interval-arithmetic equivalents to classical linear algebra and FFT algorithms. These packages allowed us to use the routines of DFTK to also bound the arithmetic error for all our double-precision calculations presented in this paper. Example values are shown in Figure 6 indicating that the obtained arithmetic error is several orders of magnitude smaller than the discretization error until around . Increasing the accuracy of the obtained solution to the Cohen-Bergstresser model this point would not only requires to increase the basis, but also to switch to a more accurate floating-point arithmetic beyond IEEE double precision. In a type-generic Julia code like DFTK this is, however, completely seamless and has been used to obtain Figure 7, demonstrating errors below double precision.

## Vii Tracking errors in band computations

With the discussed methodologies we are able to

• compute the norm of the in-space residual norm (leading to the algorithmic error);

• compute a guaranteed upper bound of the out-of-space residual norm (leading to the discretization error);

• compute a guaranteed lower bound of the spectral gap involved in the Kato-Temple inequality (4);

• compute a posteriori errors bounds on the quantities of interest (crystalline orbital energies) using either Bauer-Fike or Kato-Temple inequalities;

• estimate the impact of floating-point errors via interval arithmetic (arithmetic error).

A summary of the overall numerical error on the quantities of interest bound by Bauer-Fike and Kato-Temple versus the “true” error estimated using the largest basis employed ( and respectively) is given in Figures 7 and 8 for the Cohen-Bergstresser and GTH models of silicon. Using our methods we were able to compute band structures for both models with fully guaranteed bounds on the numerical error, see Figures 9 and 10, respectively. Notice that neither Kato-Temple nor Bauer-Fike universally provide the best bound on the error, with the Kato-Temple bound being inapplicable for degenerate eigenvalues in particular. In our plots we alternate between both bounds, only showing the one providing the smallest error. The design and use of error bounds robust to degenerate eigenvalues is left to future work.

## Viii Conclusions and outlook

We have discussed the computation of fully guaranteed bounds to the error in the numerical solution of non-self-consistent Kohn-Sham equations. While many other sources of error exist, our treatment focused exclusively on the discretization error, due to finite basis sets, algorithm error, due to non-zero convergence thresholds, and arithmetic error, due to finite-precision floating-point arithmetic. As quantities of interest we considered the band energies around the Fermi level of the Cohen-Bergstresser and GTH pseudopotential models and demonstrated the feasibility of our approach by providing, for each model, band structure diagrams of silicon, in which the total numerical error was annotated in the form of error bars.

For this two key results have been employed. Firstly we bounded the part of the residual of an eigenvector residing outside the discretization basis by making use of the favorable Fourier-decay properties of the Cohen-Bergstresser and GTH pseudopotentials. Using these we were able to provide computable upper bounds to this out-of-space residual by partially computing the residual explicitly on a slightly larger basis set and bounding the remainder. Secondly, we discussed a method to obtain a guaranteed lower bound on the gaps between eigenvalues, which was based on controlling the number of negative eigenvalues of a shifted Hamiltonian with a Schur complement. This allowed us to rely on the Kato-Temple error bound for isolated eigenvalues.

Our presented approach is general and could be applied to other non-self-consistent models. A drawback of the presented method is, however, that the nonlinear Hartree and exchange-correlation terms of common density-functional-theory methods cannot yet be treated. This limitation we wish to address in future work. Furthermore we hope to use our bounds to facilitate a fully black-box modeling, where accuracy parameters such as the kinetic energy cutoff, the convergence thresholds of the iterative solver or employed floating-point precision are chosen automatically by the code. The hope is to be able to dynamically adjust such accuracy parameters during a simulation to do as little work as necessary to reach the accuracy desired by the user. For such purpose we expect the presented Kato-Temple bounds to be not tight enough, so that better bounds have to be constructed, extending what has been done for the linear setting Cancès et al. (ressa).

## Conflicts of interest

There are no conflicts of interest to declare.

## Acknowledgements

This project has received funding from ISCD (Sorbonne Université) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 810367).

## Appendix A: Bounds on potentials

In this section we fix the two plane-wave spaces and defined by energy cutoffs and . Our error bounds for the GTH pseudopotentials require explicit bounds on the quantities for a given (to bound the residuals), and on the operators , , and (to obtain bounds on the gap).

We will bound these quantities using our explicit knowledge of the tails of the functions involved: given a non-negative continuous function , we define

 SR∗(f,q)=∑G∈R∗,|G|≥qf(|G|).

Bounds on are obtained in Appendix B.

Note that there is a large freedom in designing bounds, and many possible improvements. The ones we present result from a compromise between accuracy and simplicity.

### Local potential

In the following we assume a single atom centered at the origin; the case of multiple atoms is obtained simply by adding a contribution for each atom. The function is then radial.

To bound for a given , we use the fact that has small components on high wavevectors:

 |Ω|2∥PY⊥Vloc˜u∥2 =∑G∈Y⊥∣∣ ∣∣∑G′∈Xˆvloc(G−G′)˜u(G′)∣∣ ∣∣2 ≤dim(X)∑G∈Y⊥∑G′∈X∣∣ˆvloc(G−G′)˜u(G′)∣∣2 =dim(X)∑G′∈X|˜u(G′)|2∑G∈Y⊥∣∣ˆvloc(G−G′)∣∣2 ≤dim(X)∑G′∈X|˜u(G′)|2∑ΔG∈R∗|ΔG|>√2E(2)cut−|G′||ˆvloc(|ΔG|)|2 =dim(X)∑G′∈X|˜u(G′)|2SR∗(|ˆvloc|2,√2E(2)cut−|G′|)

We bound the operators , and in operator norm simply by the operator norm of . We use the fact that is a multiplication operator in real space by the periodic function

 vloc(r)=1|Ω|∑G∈R∗ˆvloc(G)eiG⋅r

and therefore

 ∥Vloc∥op ≤1Ω⎛⎝∥∥ ∥∥∑G∈Yˆvloc(G)eiG⋅r∥∥ ∥∥∞+∑G∈Y⊥|ˆvloc|⎞⎠ ≤1Ω(∥∥ ∥∥∑G∈Yˆvloc(G)eiG⋅r∥∥ ∥∥∞+SR∗(|ˆvloc|,√2Ecut)),

where . To bound the first term, we use the fact that for a regular grid of the unit cell, we have

 ∥∥ ∥∥∑G∈Yˆvloc(G)eiG⋅r∥∥ ∥∥∞≤maxr∈G∣∣ ∣∣∑G∈Yˆvloc(G)eiG⋅r∣∣ ∣∣+δ∑G∈Y|G||ˆvloc(G)|

where is the diameter of the grid. We compute the first term using a fast Fourier transform and the second explicitly.

### Nonlocal potential

The nonlocal potential is a sum of separable terms Goedecker et al. (1996); Hartwigsen et al. (1998), over atoms, angular momentum , magnetic quantum number and projector channels . We focus on just one of the terms in (3), of the form

 ⟨eG|vnl|eG′⟩=p(1)(k+G)¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯p(2)(k+G′),

where both are of the form (). A bound on the total potential can be obtained naively by the triangular inequality. A better bound can be obtained by exploiting orthogonality between the different quantum numbers and using Pythagoras theorem as well as using Unsöld’s theorem to simplify the sums over . We do not detail these technicalities here.

For a given , let . Then

 ∥PY⊥vnl˜u∥2 =|c˜u|2∑G∈Y⊥|p(1)(k+G)|2 ≤|c˜u|2∥Ylm∥2∞∑G∈Y⊥|R(1)(|G|−|k|)|2 =|c˜u|2∥Ylm∥2∞SR∗(|R(1)(⋅−|k|)|2,√2E(2)cut),

where and we have assumed that is non-increasing on . We compute explicitly and bound the remainder as before. For the operator norm of and , we use the above computation together with the fact that, for any normalized

 |c˜u|2≤∑G∈X|p(2)(k+G)|2.

Finally,

 ∥PX⊥VnlPX⊥∥2op≤∥Ylm∥4∞ SR∗(|R(1)(⋅−|k|)|2,√2Ecut) SR∗(|R(2)(⋅−|k|)|2,√2Ecut).

## Appendix B: Bounds on tail sums

Our bounds involve quantities of the form

 SR∗(f,q):=∑G∈R∗,|G|≥qf(|G|).

where is a known smooth non-negative function non-increasing on the internal for some , and more specifically a Gaussian times a polynomial or rational function. We seek to bound by an explicitly computable quantity for large enough. For a one dimensional lattice , , this can easily be done using a sum-integral comparison. We have for all , , so that

 SbZ(f,q)≤2b−1∫∞q−bf(q′)dq′,

where the right-hand side can be computed explicitly using integrals of Gaussian functions.

For the multidimensional case we use a similar idea: we bound the value of by its mean over a unit cell for which is the vertex of furthest away from the origin (see Figure 11). A technical difficulty lays in the fact that this is not always uniquely defined and that the ’s do not form a proper tiling. For instance, for the two-dimensional square lattice, we have

 SbZ2(f,q) =4∑G∈b(N∗×N∗),|G|≥qf(|G|)+2∑G∈b(Z×{0}),|G|≥qf(|G|) ≤4∑G∈b(N∗×N∗),|G|≥q∫CGf(G′)dG′+2SbZ(f,q). ≤2πb−2∫+∞q−√2bq′f(q′)dq′+2SbZ(f,q).

with a similar bound for the cubic lattice. For a non-cubic lattice , a partitioning in octants has to be done instead based on the signs of the quantities , in the spirit of what can be seen on Figure 11 for a two-dimensional non-square case.

Unit cells then overlap in the vicinity of the three planes and . A relatively straightforward but tedious bound on these extra overlaps yields the following:

 SR∗(f,q) ≤4π|C|∫+∞q−δq′2f(q′)dq′+2π(3∑j=1nR∗,j|Cj|)∫+∞q−δq′f(q′)dq′ +2(3∑j=1nR∗,j|˜Cj|)∫+∞q−δf(q′)dq′, (9)

with the unit cell diameter and

 |C| =|(b1×b2)⋅b3| |Cj| =∣∣ ∣∣(bj+1−(bj⋅bj+1)|bj|2bj)×(bj+2−(bj⋅bj+2)|bj|2bj)∣∣ ∣∣ |˜Cj| =∣∣ ∣∣bj−∑k≠j(bk⋅bj)|bk|2bk∣∣ ∣∣ nR∗,j =∏k≠j⌈2+|bj⋅bk||bj|2⌉.

with the convention that . Details of this computation will be published in an upcoming paper.

The bound above is numerically observed to be very pessimistic because it uses values of for , which for a rapidly decaying function are much larger than . In order to obtain a better error, we instead apply this bound to the function .