# Adaptive First-Order System Least-Squares Finite Element Methods for Second Order Elliptic Equations in Non-Divergence Form

This paper studies adaptive first-order least-squares finite element methods for second-order elliptic partial differential equations in non-divergence form. Unlike the classical finite element method, which relies on weak formulations of PDEs that are not applicable to the non-divergence equation, the first-order least-squares formulations naturally have stable weak forms without using integration by parts, allow simple finite element approximation spaces, and have built-in a posteriori error estimators for adaptive mesh refinement. The non-divergence equation is first written as a system of first-order equations by introducing the gradient as a new variable. Two versions of least-squares finite element methods using simple C^0 finite elements are then developed: the L^2-LSFEM, which uses linear elements, and the weighted-LSFEM, which uses a mesh-dependent weight to ensure optimal convergence. Under a very mild assumption that the PDE has a unique solution, optimal a priori and a posteriori error estimates are proved. With an extra assumption on the operator regularity, which is weaker than traditionally assumed, convergence in standard norms for the weighted-LSFEM is also discussed. L^2-error estimates are derived for both formulations. We perform extensive numerical experiments for smooth, non-smooth, and even degenerate coefficients on smooth and singular solutions to test the accuracy and efficiency of the proposed methods.


## 1 Introduction

In this paper we consider finite element approximations of the following elliptic PDE in non-divergence form:

$$
\begin{aligned}
-A : D^2 u &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \partial\Omega.
\end{aligned}
\tag{1.1}
$$

Here, the domain $\Omega \subset \mathbb{R}^d$ is an open and bounded polytope, and the coefficient matrix $A$ is a positive definite matrix with eigenvalues bounded below and above on $\Omega$, but not necessarily differentiable. The right-hand side $f$ is assumed to be in $L^2(\Omega)$.

Note that when $A$ is smooth enough, we have the following equation in divergence form, where $\nabla\cdot A$ is taken row-wise:

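$$
\begin{aligned}
-\nabla\cdot(A\nabla u) + (\nabla\cdot A)\cdot\nabla u &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \partial\Omega.
\end{aligned}
\tag{1.2}
$$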
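The passage from (1.1) to (1.2) is the product rule. The following worked computation is a sketch under two assumptions that are not stated at this point of the text: $A = (a_{ij})$ is differentiable and symmetric, so that the row-wise and column-wise divergences of $A$ coincide. The entry notation $a_{ij}$ is introduced here only for illustration.

```latex
\begin{aligned}
\nabla\cdot(A\nabla u)
  &= \sum_{i,j=1}^{d} \partial_i\bigl(a_{ij}\,\partial_j u\bigr)
   = \sum_{i,j=1}^{d} a_{ij}\,\partial_i\partial_j u
     + \sum_{j=1}^{d}\Bigl(\sum_{i=1}^{d}\partial_i a_{ij}\Bigr)\,\partial_j u \\
  &= A : D^2 u + (\nabla\cdot A)\cdot\nabla u ,
\qquad\text{hence}\qquad
-A : D^2 u = -\nabla\cdot(A\nabla u) + (\nabla\cdot A)\cdot\nabla u .
\end{aligned}
```

When $A$ is not differentiable the term $\nabla\cdot A$ is not available, which is why the divergence form cannot be used in general.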

The elliptic PDE in non-divergence form arises in the linearization of fully nonlinear PDEs, for example, in stochastic control problems, nonlinear elasticity, and mathematical finance. In many such cases the matrix $A$ is not smooth, nor even continuous. For example, when a fully nonlinear PDE is solved by finite element methods, the coefficient matrix of its linearization is possibly only element-wise smooth if the coefficients contain derivatives of the numerical solution.

Since the matrix $A$ is not differentiable, the standard weak notion of an elliptic equation is not applicable. The existence and uniqueness theory for equations in non-divergence form is often based on the classical or strong senses of solutions, see discussions in [30, 29]. These PDE theories often assume that the domain is convex, that the boundary is sufficiently smooth, or some other restrictive conditions on the smoothness of $A$. For a discontinuous $A$, the solution may be non-unique, see an example given in . It is worth mentioning that these available theoretical PDE results all provide only sufficient conditions. For example, since the Poisson equation is also an example of the non-divergence equation with $A = I$, the dependence of the existence and uniqueness conditions for equation (1.1) on the domain can be very weak.

Several numerical methods are available for the problem in non-divergence form. Based on discrete Calderón-Zygmund estimates, Feng, Neilan, and co-authors developed finite element methods for problems with a continuous coefficient matrix in [18, 19, 28]. For equations with discontinuous coefficients satisfying the Cordes condition, a discontinuous Galerkin method , a mixed method , and a non-symmetric method  have been developed. A weak Galerkin method was developed by Wang and Wang in . The analysis in these papers mostly assumes the full regularity of the operator and studies the -error estimates of the approximations. In some sense, these methods keep the non-divergence operator second order and borrow techniques from variational fourth-order problems. Nochetto and Zhang  studied a two-scale method, which is based on the integro-differential approach and focuses on error estimates.

Traditionally, the finite element method is based on the variational formulation of an elliptic equation, where integration by parts plays an essential role. Integration by parts shifts a derivative from the trial variable to the test variable and thus reduces the differential order of the operator. For (1.1), integration by parts is not available. Fortunately, there is another natural way to reduce the differential order of a PDE operator: introducing an auxiliary variable. With the new auxiliary variable, the second-order equation becomes a system of first-order equations. For a first-order system there are normally two approaches. One is the mixed method, which also involves integration by parts and has difficulties in ensuring stability. The other is the least-squares finite element method (LSFEM). The first-order system least-squares principle first rewrites the PDE as a first-order system and then defines an artificial, externally defined energy-type principle. The energy functional can be defined as a sum of weighted residuals of the system. With the first-order least-squares functionals, the corresponding LSFEMs can be defined. No integration by parts is needed to define the least-squares principle or the LSFEM, so the first-order system LSFEM is ideal for the second-order elliptic equation in non-divergence form.

Besides the obvious advantage of not requiring integration by parts, the LSFEM has other advantages. First, the least-squares weak formulation and its associated LSFEM using conforming finite element spaces are automatically coercive as long as the first-order system is well-posed. This is a significant advantage over other numerical methods, since the well-posedness theory of the equation in non-divergence form in general provides only sufficient conditions. On the other hand, even without a rigorous mathematical proof, the elliptic equation in non-divergence form is often the result of some physical process for which we are sure that a unique solution exists. Thus, in the LSFEMs developed in this paper, we can reduce the condition on the PDE to a simple well-posedness assumption without specifying the condition explicitly.

Other advantages of LSFEMs are that conforming discretizations lead to stable and, ultimately, optimally accurate methods, and that the resulting algebraic problems are symmetric and positive definite and can be solved by standard and robust iterative methods, including multigrid methods.

The last important advantage of LSFEMs is that they have a built-in a posteriori error estimator. The solution is probably singular due to the geometry of the domain or the coefficient matrix. Also, for problems like reaction-diffusion equations, interior or boundary layers appear. To solve these problems efficiently, an a posteriori error estimator and an adaptive mesh refinement algorithm are necessary.

In this paper, by introducing the gradient as an auxiliary variable, we first write the equation in non-divergence form as a system of first-order equations, and then develop two least-squares minimization principles and two corresponding LSFEMs: one is based on an $L^2$-norm square sum of the residuals and the other is based on a mesh-size weighted $L^2$-norm square sum of the residuals. The two methods are called the $L^2$-LSFEM and the weighted-LSFEM, respectively. For the $L^2$-LSFEM, the simplest linear $C^0$-finite elements are used to approximate both the solution and the gradient. For the weighted-LSFEM, the $C^0$-finite element of degree $k$ is used to approximate the solution, while the degree $k-1$ $C^0$-finite element is used to approximate the gradient. Under the very weak assumption that the coefficient and the domain are good enough to guarantee the existence and uniqueness of a solution, we show that both continuous least-squares weak forms and their corresponding discrete problems are well-posed. A priori and a posteriori error estimates with respect to the least-squares norms are then discussed.

Numerical methods for non-divergence equation often use the following operator regularity assumption

$$
\|u\|_2 \le C\,\|A : D^2 u\|_0
$$

to derive stability and error estimates. Unlike these papers, for the weighted-LSFEM, the error estimates in the -norm and the discrete broken -norm are investigated under a weaker assumption, see our discussion in section 4.2. Under stronger regularity assumptions, we show that the convergence order of the $L^2$-norm of the error of the solution is one order higher than that of the least-squares norm of the error, provided the approximation degree for the solution is at least three. For the $L^2$-LSFEM, we show the optimal and -error estimates with a solution regularity assumption. We perform extensive numerical experiments for smooth, non-smooth, and even degenerate coefficients on smooth and singular solutions to test the accuracy and efficiency of the proposed methods. With uniform refinements, we show that the convergence orders match the theory. With adaptive mesh refinements, optimal convergence results are obtained for singular solutions.

The LSFEM is well developed for elliptic equations in divergence form, see for example, [9, 10, 5, 3, 8, 14]. A posteriori error estimates and adaptivity algorithms based on LSFEMs can be found in [2, 13]. Compared with the LSFEMs for the elliptic equation in divergence form, the non-divergence equation has many differences in the stability analysis and in the choice of the finite element subspaces, due to the non-divergence structure. We remark on these differences in various places of the paper as comparisons.

In summary, the LSFEMs developed in this paper have several advantages compared to existing numerical methods: they are automatically stable under a very mild assumption; they are easy to program because only simple Lagrange finite elements without jump terms are used; adaptive algorithms with the built-in a posteriori error estimators can handle problems with singular solutions or layers; and, under a condition on the operator regularity which is weaker than traditionally assumed, error estimates in standard norms are proved.

There are two least-squares finite element methods available for the non-divergence equation. Neither of them uses a first-order reformulation. The paper  uses a second order least-squares formulation with -finite element approximations. The simple method developed by Ye and Mu in  uses -finite element spaces with orders higher than two and penalizes the continuity of the solution and the normal component of the flux.

The remaining parts of this article are organized as follows: section 2 defines the first-order system least-squares weak problems and discusses their stability; section 3 presents the corresponding LSFEMs and their a priori and a posteriori error estimates in least-squares norms. Error estimates in other norms are discussed in sections 4 and 5 for the weighted and $L^2$ versions of the methods, respectively. Numerical experiments are presented in section 6.

Standard notation on function spaces applies throughout this article. Norms of functions in Lebesgue and Sobolev space () are denoted by . The subscript is omitted when . The inner product of real-valued matrices is denoted by . We use to denote the Hessian of .
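As a concrete reading of the notation $A : D^2u$ in (1.1), the matrix inner product is the entrywise (Frobenius) product summed over all entries. The sketch below is only an illustration: the function name `frobenius`, the quadratic test function, and the sample matrix `A` are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def frobenius(A, H):
    """Frobenius inner product A : H = sum_ij A_ij * H_ij."""
    return float(np.sum(np.asarray(A) * np.asarray(H)))

# Hessian of the hypothetical test function u(x, y) = x^2 + 3xy + 2y^2
# (u_xx = 2, u_xy = u_yx = 3, u_yy = 4; constant because u is quadratic).
H = np.array([[2.0, 3.0], [3.0, 4.0]])

# With A = I the non-divergence operator reduces to the Laplacian.
laplacian = frobenius(np.eye(2), H)

# A sample symmetric positive definite coefficient matrix (illustrative only).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
value = frobenius(A, H)

print(laplacian, value)  # 6.0 22.0
```

The first value confirms that the Poisson equation is the special case $A = I$ of the non-divergence equation.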

## 2 First-Order System Least-Squares Weak Problems

### 2.1 Existence and uniqueness assumption

Define the solution space of (1.1):

$$
V := \{ v \in H_0^1(\Omega) : A : D^2 v \in L^2(\Omega) \}.
\tag{2.1}
$$

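Notice that the space $V$ is weaker than $H^2(\Omega)\cap H_0^1(\Omega)$, since we can expect that even though an individual second derivative $\partial_{ij} v$ is not in $L^2(\Omega)$, due to cancellation or good properties of $A$, $A : D^2 v$ belongs to $L^2(\Omega)$. For example, let $A = I$ and let $u$ be the solution of the Poisson equation on an L-shaped domain $\Omega$; then $u \in V$ but clearly $u \notin H^2(\Omega)$.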

We first state the assumption of the existence and uniqueness of the solution.

###### Assumption

(Existence and uniqueness of the solution of the elliptic equation in non-divergence form) Assume that the coefficient matrix $A$ and the domain $\Omega$ are nice enough, such that the equation (1.1) has a unique solution $u \in V$ for any $f \in L^2(\Omega)$.

###### Remark

There are various theories to ensure the existence and uniqueness of the equation, for example:

1. (Classical solution ) A classical solution exists if is Hölder continuous and if is sufficiently smooth.

2. (Strong solution ) If , a vanishing mean oscillation matrix with a uniform VMO-modulus of continuity, and if is of class , then there exists a unique solution to the problem.

3. ( solution ) If the domain is convex and if satisfies the Cordes condition (4.24), then there exists a solution .

More detailed discussions can be found in the introduction of . These theories provide only sufficient conditions. Besides many examples of Poisson equations on non-convex domains, the Test 3 problem from  (see also our numerical test 6.4) is an example in which the matrix is not uniformly elliptic but a unique solution still exists.

### 2.2 Least-squares problems

Introducing a new gradient variable $\boldsymbol{\sigma} = \nabla u$, we have the following first-order system:

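$$
\begin{cases}
\boldsymbol{\sigma} - \nabla u = 0, & \text{in } \Omega, \\
-A : \nabla\boldsymbol{\sigma} = f, & \text{in } \Omega, \\
u = 0, & \text{on } \partial\Omega.
\end{cases}
\tag{2.2}
$$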

It is clear that . For the gradient , the appropriate solution space is:

$$
Q := \{ \boldsymbol{\tau} \in L^2(\Omega)^d : A : \nabla\boldsymbol{\tau} \in L^2(\Omega) \}.
$$

###### Remark

As a comparison, consider the equation in divergence form

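$$
\begin{aligned}
-\nabla\cdot(A\nabla u) &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \partial\Omega.
\end{aligned}
\tag{2.3}
$$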

It is well known that the flux $-A\nabla u \in H(\mathrm{div};\Omega)$ for $f \in L^2(\Omega)$. The space $H(\mathrm{div};\Omega)$ is well studied [22, 4]. One of its properties is that the normal component of its member vector functions is continuous across the interfaces in the weak sense. Also, the negative divergence operator is the dual operator of the gradient. These properties play important roles in the design of least-squares finite element methods for elliptic equations in divergence form. More importantly, $H(\mathrm{div})$-conforming finite element spaces with good approximation properties, such as the Raviart-Thomas (RT) finite element space, are also well known.

For the elliptic equation in non-divergence form (1.1), the properties of the space $Q$ are barely known. Similar to the space $V$, for a vector function $\boldsymbol{\tau} \in Q$, we can expect that even though an individual derivative $\partial_i \tau_j$ is not in $L^2(\Omega)$, due to cancellation or good properties of $A$, $A : \nabla\boldsymbol{\tau}$ may belong to $L^2(\Omega)$. But if we want to design a numerical method for a general non-divergence elliptic equation, we basically cannot assume or use any of this information, and we cannot design an $A$-intrinsic finite element subspace of $Q$ in the way $H(\mathrm{div})$-conforming finite elements are designed.

Let $\mathcal{T} = \{K\}$ be a triangulation of $\Omega$ using simplicial elements. The mesh $\mathcal{T}$ is assumed to be shape-regular, but it does not need to be quasi-uniform. Let $h_K$ be the diameter of the element $K$.

We introduce two versions of least-squares functionals:

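$$
J_h(v,\boldsymbol{\tau};f) := \sum_{K\in\mathcal{T}} h_K^2\,\|f + A:\nabla\boldsymbol{\tau}\|_{0,K}^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
\tag{2.4}
$$

$$
J_0(v,\boldsymbol{\tau};f) := \|f + A:\nabla\boldsymbol{\tau}\|_0^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q.
\tag{2.5}
$$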

The functionals $J_h$ and $J_0$ are called the weighted version and the $L^2$ version, respectively. We use the notation $J$ to denote both $J_h$ and $J_0$ when the two formulations can be presented in a unified framework and no confusion is caused.

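The least-squares minimization problem is: seek $(u,\boldsymbol{\sigma}) \in H_0^1(\Omega)\times Q$, such that

$$
J(u,\boldsymbol{\sigma};f) = \inf_{(v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q} J(v,\boldsymbol{\tau};f).
\tag{2.6}
$$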

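The corresponding Euler-Lagrange formulations are: seek $(u,\boldsymbol{\sigma}) \in H_0^1(\Omega)\times Q$, such that

$$
a_h((u,\boldsymbol{\sigma}),(v,\boldsymbol{\tau})) = -\sum_{K\in\mathcal{T}} h_K^2\,(f, A:\nabla\boldsymbol{\tau})_K, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
\tag{2.7}
$$

and find $(u,\boldsymbol{\sigma}) \in H_0^1(\Omega)\times Q$, such that

$$
a_0((u,\boldsymbol{\sigma}),(v,\boldsymbol{\tau})) = -(f, A:\nabla\boldsymbol{\tau}), \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
\tag{2.8}
$$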

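where, for all $(w,\boldsymbol{\rho})$ and $(v,\boldsymbol{\tau})$ in $H_0^1(\Omega)\times Q$, the bilinear forms are defined as

$$
\begin{aligned}
a_h((w,\boldsymbol{\rho}),(v,\boldsymbol{\tau})) &:= (\boldsymbol{\rho}-\nabla w, \boldsymbol{\tau}-\nabla v) + \sum_{K\in\mathcal{T}} h_K^2\,(A:\nabla\boldsymbol{\rho}, A:\nabla\boldsymbol{\tau})_K, \quad\text{and} \\
a_0((w,\boldsymbol{\rho}),(v,\boldsymbol{\tau})) &:= (\boldsymbol{\rho}-\nabla w, \boldsymbol{\tau}-\nabla v) + (A:\nabla\boldsymbol{\rho}, A:\nabla\boldsymbol{\tau}).
\end{aligned}
$$
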
###### Remark

The least-squares formulations can be easily extended to more general cases, for example, non-homogeneous Dirichlet boundary conditions and equations with convection and reaction terms.

For example, for the general elliptic equation

$$
-A : D^2 u + \boldsymbol{\beta}\cdot\nabla u + c\,u = f
\tag{2.9}
$$

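with a homogeneous boundary condition, let $\boldsymbol{\sigma} = \nabla u$; the least-squares functional can then be defined as:

$$
J_0(v,\boldsymbol{\tau};f) := \|f + A:\nabla\boldsymbol{\tau} - \boldsymbol{\beta}\cdot\boldsymbol{\tau} - c\,v\|_0^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q.
$$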

To have a better robustness with respect to the coefficients, coefficient-weighted versions can also be used. For example, define the least-squares functional as:

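$$
J_0(v,\boldsymbol{\tau};f) := \|\gamma\,(f + A:\nabla\boldsymbol{\tau} - \boldsymbol{\beta}\cdot\boldsymbol{\tau} - c\,v)\|_0^2 + \|A^{1/2}(\boldsymbol{\tau}-\nabla v)\|_0^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
$$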

where $\gamma$ is a weight defined as a function of the coefficients $A$, $\boldsymbol{\beta}$, and $c$.

The -weighted least-squares functional can be defined similarly.

###### Remark

For the weighted functional $J_h$, the $h_K$-weight is on the term $\|f + A:\nabla\boldsymbol{\tau}\|_{0,K}$; similarly, we can also use

$$
\tilde{J}_h(v,\boldsymbol{\tau};f) := \|f + A:\nabla\boldsymbol{\tau}\|_0^2 + \sum_{K\in\mathcal{T}} h_K^{-2}\,\|\boldsymbol{\tau}-\nabla v\|_{0,K}^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
$$

as the weighted least-squares functional. For a uniform mesh, $J_h$ and $\tilde{J}_h$ are equivalent. But for an adaptively refined mesh, $J_h$ behaves more like a minimization problem with respect to the $H^1$-norm of $u$, while $\tilde{J}_h$ is more like an optimization with respect to the $H^2$-norm. We prefer the $J_h$ version in this paper since the minimum requirement of $u$ is not $H^2$. Earlier discussion on the mesh-dependent least-squares methods can be found in .
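The equivalence on a uniform mesh can be made explicit. The following one-line computation is a sketch assuming a single mesh size $h_K = h$ for every element, with the two functionals read off from (2.4) and the alternative weighting above:

```latex
J_h(v,\boldsymbol{\tau};f)
  = h^2\,\|f + A:\nabla\boldsymbol{\tau}\|_0^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2
  = h^2\Bigl(\|f + A:\nabla\boldsymbol{\tau}\|_0^2 + h^{-2}\,\|\boldsymbol{\tau}-\nabla v\|_0^2\Bigr)
  = h^2\,\tilde{J}_h(v,\boldsymbol{\tau};f).
```

On a uniform mesh the two functionals therefore differ only by the constant factor $h^2$ and have the same minimizer. On an adaptively refined mesh $h_K$ varies from element to element, no global factor can be pulled out, and the two minimization problems are genuinely different.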

###### Lemma

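The following are norms for $(v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q$:

$$
\begin{aligned}
|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_h^2 &:= \sum_{K\in\mathcal{T}} h_K^2\,\|A:\nabla\boldsymbol{\tau}\|_{0,K}^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2, \quad\text{and} \\
|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_0^2 &:= \|A:\nabla\boldsymbol{\tau}\|_0^2 + \|\boldsymbol{\tau}-\nabla v\|_0^2.
\end{aligned}
$$

We use $|\!|\!|\cdot|\!|\!|$ to denote both versions when no confusion is caused.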

###### Proof

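To prove that $|\!|\!|\cdot|\!|\!|$ defines a norm on $H_0^1(\Omega)\times Q$, we only need to check the conditions in the definition of a norm.

The linearity and the triangle inequality are obvious for $|\!|\!|\cdot|\!|\!|$.

If $|\!|\!|(v,\boldsymbol{\tau})|\!|\!| = 0$, then, since $h_K > 0$ for every element, both residual terms vanish, thus

$$
A:\nabla\boldsymbol{\tau} = 0 \quad\text{and}\quad \boldsymbol{\tau} = \nabla v,
$$

in the $L^2$ sense. This means $v \in V$, and

$$
A : D^2 v = 0 \ \text{in } \Omega, \qquad v = 0 \ \text{on } \partial\Omega,
$$

is true in the $L^2$ sense. By the existence and uniqueness assumption (Assumption 2.1), $v = 0$ and $\boldsymbol{\tau} = 0$. The norm is then well defined for both the weighted and $L^2$ versions of the definition.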

###### Remark

The condition $\boldsymbol{\tau} \in Q$ is essential to the definition. This condition has the same role as the requirement that the flux be in $H(\mathrm{div};\Omega)$ for the equation in divergence form, which implicitly implies some weak continuity condition on its member functions.

###### Remark

It is also clear that

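$$
\begin{aligned}
|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_{h,K}^2 &:= h_K^2\,\|A:\nabla\boldsymbol{\tau}\|_{0,K}^2 + \|\boldsymbol{\tau}-\nabla v\|_{0,K}^2, \quad\text{and} \\
|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_{0,K}^2 &:= \|A:\nabla\boldsymbol{\tau}\|_{0,K}^2 + \|\boldsymbol{\tau}-\nabla v\|_{0,K}^2,
\end{aligned}
$$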

are semi-norms on an element .

###### Lemma

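The bilinear form $a = a_h$ or $a = a_0$ is continuous and coercive:

$$
a((w,\boldsymbol{\rho}),(v,\boldsymbol{\tau})) \le |\!|\!|(w,\boldsymbol{\rho})|\!|\!|\;|\!|\!|(v,\boldsymbol{\tau})|\!|\!|, \quad \forall (w,\boldsymbol{\rho}) \text{ and } (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q,
\tag{2.10}
$$

$$
a((v,\boldsymbol{\tau}),(v,\boldsymbol{\tau})) = |\!|\!|(v,\boldsymbol{\tau})|\!|\!|^2, \quad \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q.
\tag{2.11}
$$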

The lemma can be easily proved by a simple computation.
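For completeness, the computation can be sketched as follows for the $L^2$ version; the weighted version is analogous, with the element-wise $h_K^2$-weighted sums in place of the global $L^2$ inner product. The only tools used are the Cauchy-Schwarz inequality in $L^2(\Omega)$ and in $\mathbb{R}^2$.

```latex
\begin{aligned}
a_0((w,\boldsymbol{\rho}),(v,\boldsymbol{\tau}))
 &= (\boldsymbol{\rho}-\nabla w,\ \boldsymbol{\tau}-\nabla v) + (A:\nabla\boldsymbol{\rho},\ A:\nabla\boldsymbol{\tau}) \\
 &\le \|\boldsymbol{\rho}-\nabla w\|_0\,\|\boldsymbol{\tau}-\nabla v\|_0
    + \|A:\nabla\boldsymbol{\rho}\|_0\,\|A:\nabla\boldsymbol{\tau}\|_0 \\
 &\le \bigl(\|\boldsymbol{\rho}-\nabla w\|_0^2 + \|A:\nabla\boldsymbol{\rho}\|_0^2\bigr)^{1/2}
      \bigl(\|\boldsymbol{\tau}-\nabla v\|_0^2 + \|A:\nabla\boldsymbol{\tau}\|_0^2\bigr)^{1/2}, \\
a_0((v,\boldsymbol{\tau}),(v,\boldsymbol{\tau}))
 &= \|\boldsymbol{\tau}-\nabla v\|_0^2 + \|A:\nabla\boldsymbol{\tau}\|_0^2 .
\end{aligned}
```

The right-hand sides are exactly the product of the two least-squares norms and the squared least-squares norm, so both the continuity and the coercivity constants equal one.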

###### Theorem

Assume that $f \in L^2(\Omega)$, and that the coefficient matrix $A$ and the domain $\Omega$ are nice enough such that Assumption 2.1 is true. Then the least-squares problem (2.6) has a unique solution $(u,\boldsymbol{\sigma}) \in H_0^1(\Omega)\times Q$.

###### Proof

To prove the existence, for $f \in L^2(\Omega)$, by Assumption 2.1, there exists a unique $u \in V$ solving the equation. Let $\boldsymbol{\sigma} = \nabla u$; it is easy to see that $(u,\boldsymbol{\sigma}) \in H_0^1(\Omega)\times Q$ and $J(u,\boldsymbol{\sigma};f) = 0$, thus the least-squares functional in (2.6) has a minimizer with minimum value zero. The minimizer is then the solution of the least-squares problem (2.6) and of its corresponding Euler-Lagrange equation. The uniqueness is a simple consequence of the fact that $|\!|\!|\cdot|\!|\!|$ is a norm.

###### Remark

The above proof can be easily generalized to the case in which the Dirichlet boundary condition is not homogeneous.

The above argument for the existence and uniqueness of the least-squares formulation is useful when the existence and uniqueness of the PDE solution is obtained from various non-variational techniques. A similar argument is used to prove the stability of least-squares formulations for the linear transport equation in .

###### Remark

Here, the assumption of the coefficient is quite weak. The matrix does not need to be in , it can even be degenerate as long as Assumption 2.1 still holds.

###### Remark

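For the elliptic equation in divergence form (2.3), traditionally there are two forms of least-squares functionals [9, 5]:

$$
\begin{aligned}
L_0(v,\boldsymbol{\tau};f) &:= \|\nabla\cdot\boldsymbol{\tau} - f\|_0^2 + \|A^{-1/2}\boldsymbol{\tau} + A^{1/2}\nabla v\|_0^2, && \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times H(\mathrm{div};\Omega), \\
L_{-1}(v,\boldsymbol{\tau};f) &:= \|\nabla\cdot\boldsymbol{\tau} - f\|_{-1}^2 + \|A^{-1/2}\boldsymbol{\tau} + A^{1/2}\nabla v\|_0^2, && \forall (v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times H(\mathrm{div};\Omega).
\end{aligned}
$$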

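A norm equivalence can be proved: there exist positive constants $C_1$ and $C_2$, such that for $(v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times H(\mathrm{div};\Omega)$:

$$
C_1\bigl(\|\boldsymbol{\tau}\|_{H(\mathrm{div})}^2 + \|v\|_1^2\bigr) \le L_0(v,\boldsymbol{\tau};0) \le C_2\bigl(\|\boldsymbol{\tau}\|_{H(\mathrm{div})}^2 + \|v\|_1^2\bigr)
$$

and

$$
C_1\bigl(\|\boldsymbol{\tau}\|_0^2 + \|v\|_1^2\bigr) \le L_{-1}(v,\boldsymbol{\tau};0) \le C_2\bigl(\|\boldsymbol{\tau}\|_0^2 + \|v\|_1^2\bigr).
$$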

Our least-squares functional $J_0$ is essentially a modification of $L_0$. But due to the lack of differentiability of $A$, we cannot prove the following norm equivalence:

$$
C_1\bigl(\|\boldsymbol{\tau}\|_0^2 + \|A:\nabla\boldsymbol{\tau}\|_0^2 + \|v\|_1^2\bigr) \le J_0(v,\boldsymbol{\tau};0) \le C_2\bigl(\|\boldsymbol{\tau}\|_0^2 + \|A:\nabla\boldsymbol{\tau}\|_0^2 + \|v\|_1^2\bigr).
\tag{2.12}
$$

On the other hand, if $A$ is smooth enough, then (1.1) can be written in the divergence form (1.2), and we can indeed prove (2.12) using the same technique as for the equation in divergence form, with arguments similar to those in [8, 24].

However, for our least-squares functionals, we do have a one-sided bound, which can be easily proved for $(v,\boldsymbol{\tau}) \in H_0^1(\Omega)\times Q$:

$$
C\,|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_0 \le \|A:\nabla\boldsymbol{\tau}\|_0 + \|\boldsymbol{\tau}\|_0 + \|\nabla v\|_0,
\tag{2.13}
$$

$$
C\,|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_h \le \sum_{K\in\mathcal{T}} h_K\,\|A:\nabla\boldsymbol{\tau}\|_{0,K} + \|\boldsymbol{\tau}\|_0 + \|\nabla v\|_0.
\tag{2.14}
$$

We do not use the minus-one norm version in this paper due to its complicated discrete implementation; instead, we choose a weighted mesh-dependent version to simplify the implementation and keep an optimal order of convergence.

## 3 Least-Squares Finite Element Methods

In this section, LSFEMs based on the least-squares minimization problems are developed. The a priori and a posteriori error estimates with respect to the least-squares norms are derived.

### 3.1 Least-squares finite element methods

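For an element $K \in \mathcal{T}$ and an integer $k$, let $P_k(K)$ be the space of polynomials with degrees less than or equal to $k$. Define the finite element spaces $S_k$ and $S_{k,0}$ as follows:

$$
S_k := \{ v \in C^0(\overline{\Omega}) : v|_K \in P_k(K) \ \ \forall K \in \mathcal{T} \}, \qquad S_{k,0} := S_k \cap H_0^1(\Omega).
$$

We define the LSFEMs as follows.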

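(Weighted-LSFEM Problem) Seek $(u_h,\boldsymbol{\sigma}_h) \in S_{k,0}\times S_{k-1}^d$, such that

$$
J_h(u_h,\boldsymbol{\sigma}_h;f) = \inf_{(v,\boldsymbol{\tau}) \in S_{k,0}\times S_{k-1}^d} J_h(v,\boldsymbol{\tau};f).
\tag{3.1}
$$

Or equivalently, find $(u_h,\boldsymbol{\sigma}_h) \in S_{k,0}\times S_{k-1}^d$, such that

$$
a_h((u_h,\boldsymbol{\sigma}_h),(v_h,\boldsymbol{\tau}_h)) = -\sum_{K\in\mathcal{T}} h_K^2\,(f, A:\nabla\boldsymbol{\tau}_h)_K, \quad \forall (v_h,\boldsymbol{\tau}_h) \in S_{k,0}\times S_{k-1}^d.
\tag{3.2}
$$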

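($L^2$-LSFEM Problem) Seek $(u_h,\boldsymbol{\sigma}_h) \in S_{1,0}\times S_1^d$, such that

$$
J_0(u_h,\boldsymbol{\sigma}_h;f) = \inf_{(v,\boldsymbol{\tau}) \in S_{1,0}\times S_1^d} J_0(v,\boldsymbol{\tau};f).
\tag{3.3}
$$

Or equivalently, find $(u_h,\boldsymbol{\sigma}_h) \in S_{1,0}\times S_1^d$, such that

$$
a_0((u_h,\boldsymbol{\sigma}_h),(v_h,\boldsymbol{\tau}_h)) = -(f, A:\nabla\boldsymbol{\tau}_h), \quad \forall (v_h,\boldsymbol{\tau}_h) \in S_{1,0}\times S_1^d.
\tag{3.4}
$$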

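The existence and uniqueness of the solutions of the LSFEM problems are obvious from the facts that

$$
S_k^d \subset H^1(\Omega)^d = \{ \boldsymbol{\tau} \in L^2(\Omega)^d : \nabla\boldsymbol{\tau} \in L^2(\Omega)^{d\times d} \} \subset Q
$$

and $S_{k,0} \subset H_0^1(\Omega)$.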
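To make the discrete problem (3.4) concrete, here is a minimal sketch of the $L^2$-LSFEM in one space dimension, where $A$ reduces to a scalar coefficient $a(x)$ and $A:\nabla\boldsymbol{\tau}$ becomes $a\,\tau'$. The one-dimensional setting, the uniform mesh, the two-point Gauss rule, the dense solver, and all names (`solve_l2_lsfem_1d`, the test coefficient `a`, the manufactured solution `u_exact`) are assumptions made for this illustration; they are not the implementation or the test problems of the paper.

```python
import numpy as np

def solve_l2_lsfem_1d(a, f, n):
    """Minimise ||f + a*sigma'||^2 + ||sigma - u'||^2 over P1 x P1 on n uniform
    cells of (0, 1), with u(0) = u(1) = 0 (1D analogue of problem (3.4))."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    ndof = 2 * (n + 1)                      # unknowns: [u_0..u_n, sigma_0..sigma_n]
    K = np.zeros((ndof, ndof))
    b = np.zeros(ndof)
    gp = 0.5 + np.array([-0.5, 0.5]) / np.sqrt(3.0)   # Gauss points on [0, 1]
    gw = np.array([0.5, 0.5])                          # Gauss weights on [0, 1]
    for e in range(n):
        idx = np.array([e, e + 1, n + 1 + e, n + 2 + e])
        for xi, w in zip(gp, gw):
            xq = x[e] + h * xi
            # residual sigma - u' as a linear form in the four local unknowns
            r1 = np.array([1.0 / h, -1.0 / h, 1.0 - xi, xi])
            # residual term a * sigma' as a linear form in the local unknowns
            r2 = a(xq) * np.array([0.0, 0.0, -1.0 / h, 1.0 / h])
            K[np.ix_(idx, idx)] += w * h * (np.outer(r1, r1) + np.outer(r2, r2))
            b[idx] += -w * h * f(xq) * r2   # right-hand side -(f, a * tau')
    # homogeneous Dirichlet condition: drop the boundary unknowns u_0 and u_n
    free = np.array([i for i in range(ndof) if i not in (0, n)])
    sol = np.zeros(ndof)
    sol[free] = np.linalg.solve(K[np.ix_(free, free)], b[free])
    return x, sol[: n + 1], sol[n + 1:]

# Manufactured test (illustrative): u = sin(pi x), a = 1 + x, f = -a u''.
a = lambda x: 1.0 + x
u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: a(x) * np.pi**2 * np.sin(np.pi * x)

def nodal_error(n):
    x, uh, _ = solve_l2_lsfem_1d(a, f, n)
    return float(np.max(np.abs(uh - u_exact(x))))

err_coarse, err_fine = nodal_error(8), nodal_error(64)
print(err_coarse, err_fine)
```

The assembled matrix is symmetric and, after removing the two boundary unknowns, positive definite, which mirrors the coercivity of the bilinear form on the conforming subspace. No jump terms or special elements are needed.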

###### Remark

For the approximation space of $u$, the $H^1$-conforming space is an obvious good choice. For the approximation space of $\boldsymbol{\sigma}$, we use $H^1(\Omega)^d$. The space $H^1(\Omega)^d$ is more restrictive than $Q$, but it has a simple conforming finite element space. The -conforming space is not the best choice if further information of is known. For example, in the case of the equation in divergence form, it is well known that and . Similarly for the non-divergence equation, for problems with low regularity and possibly discontinuous coefficients, may not be in , then in some extreme case we will have even though is identical to the exact solution , cannot equal at the same time, since may not be in . Such cases will pose problems for a posteriori error estimation and adaptive mesh refinement, see the discussions in  and  for the failure of the classical Zienkiewicz-Zhu error estimator, which recovers in to construct the a posteriori error estimator. Even though we do have this concern, due to the lack of information of and to keep the method suitable for a general coefficient matrix $A$, the -conforming space to approximate is still a reasonable choice.

It is also worth mentioning that we do not know whether or not for the non-divergence equation. In the formulations suggested in  and , the normal jump of is used. Thus, these formulations have a possible inconsistency similar to that of our methods when applied to problems with less smoothness, where the exact jump of across element interfaces may not be zero for an exact , and they may introduce an extra error.

### 3.2 A priori error estimates

###### Theorem

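(Céa's lemma type of result) Let $(u,\boldsymbol{\sigma})$ be the solution of the least-squares variational problem (2.6). Let $(u_h,\boldsymbol{\sigma}_h) \in S_{k,0}\times S_{k-1}^d$ be the solution of the weighted-LSFEM problem (3.1). Then the following best approximation result holds:

$$
|\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_h \le \inf_{(v_h,\boldsymbol{\tau}_h) \in S_{k,0}\times S_{k-1}^d} |\!|\!|(u-v_h,\boldsymbol{\sigma}-\boldsymbol{\tau}_h)|\!|\!|_h.
\tag{3.5}
$$

Let $(u_h,\boldsymbol{\sigma}_h) \in S_{1,0}\times S_1^d$ be the solution of the $L^2$-LSFEM problem (3.3). Then the following best approximation result holds:

$$
|\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_0 \le \inf_{(v_h,\boldsymbol{\tau}_h) \in S_{1,0}\times S_1^d} |\!|\!|(u-v_h,\boldsymbol{\sigma}-\boldsymbol{\tau}_h)|\!|\!|_0.
\tag{3.6}
$$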

###### Proof

The proof of the best approximation result is standard.
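For readers who want the standard argument spelled out, here is a sketch in unified notation, with $a$ and $|\!|\!|\cdot|\!|\!|$ standing for either version. It uses only the conformity of the discrete spaces, the Euler-Lagrange equations, and the continuity and coercivity of the bilinear form with unit constants; $(v_h,\boldsymbol{\tau}_h)$ denotes an arbitrary discrete pair.

```latex
\begin{aligned}
& a\bigl((u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h),(v_h,\boldsymbol{\tau}_h)\bigr) = 0
  \quad \text{for all discrete } (v_h,\boldsymbol{\tau}_h)
  \quad \text{(Galerkin orthogonality)}, \\
& |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|^2
  = a\bigl((u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h),(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)\bigr) \\
& \qquad = a\bigl((u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h),(u-v_h,\boldsymbol{\sigma}-\boldsymbol{\tau}_h)\bigr)
  \le |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|\;
      |\!|\!|(u-v_h,\boldsymbol{\sigma}-\boldsymbol{\tau}_h)|\!|\!| .
\end{aligned}
```

Dividing by the error norm and taking the infimum over the discrete space gives the best approximation bound with constant one.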

###### Theorem

(A priori error estimate for the weighted-LSFEM) Assume the solution , for some , and let $(u_h,\boldsymbol{\sigma}_h) \in S_{k,0}\times S_{k-1}^d$ be the solution of the weighted-LSFEM problem (3.1). Then there exists a constant $C$ independent of the mesh size $h$, such that

$$
|\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_h \le C\,h^{\min(k,r+1)}\,\|u\|_{\min(k,r+1)}.
\tag{3.7}
$$

###### Proof

It is easy to see that

$$
C\,|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_h \le \sum_{K\in\mathcal{T}} h_K\,\|A:\nabla\boldsymbol{\tau}\|_{0,K} + \|\nabla v\|_0 + \|\boldsymbol{\tau}\|_0.
\tag{3.8}
$$

Then the a priori result is a direct consequence of Theorem 3.2 and the approximation properties of functions in .

###### Theorem

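(A priori error estimate for the $L^2$-LSFEM) Assume the solution $u \in H^3(\Omega)$ and let $(u_h,\boldsymbol{\sigma}_h) \in S_{1,0}\times S_1^d$ be the solution of the $L^2$-LSFEM problem (3.3). Then there exists a constant $C$ independent of the mesh size $h$, such that

$$
|\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_0 \le C\,h\,\|u\|_3.
\tag{3.9}
$$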

###### Proof

We have

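$$
C\,|\!|\!|(v,\boldsymbol{\tau})|\!|\!|_0 \le \|A:\nabla\boldsymbol{\tau}\|_0 + \|\nabla v\|_0 + \|\boldsymbol{\tau}\|_0.
\tag{3.10}
$$

Then let $v_h$ be the interpolation of $u$ in $S_{1,0}$ and $\boldsymbol{\tau}_h$ be the interpolation of $\boldsymbol{\sigma}$ in $S_1^d$. By the best approximation result (3.6), the approximation properties of these interpolations, and the fact that $\boldsymbol{\sigma} = \nabla u$, we have

$$
\begin{aligned}
C\,|\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_0
 &\le \|A:\nabla(\boldsymbol{\sigma}-\boldsymbol{\tau}_h)\|_0 + \|\boldsymbol{\sigma}-\boldsymbol{\tau}_h\|_0 + \|\nabla(u-v_h)\|_0 \\
 &\le C h\,\|\nabla u\|_2 + C h\,\|u\|_2 + C h^2\,\|\nabla u\|_2 \le C h\,\|u\|_3.
\end{aligned}
$$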

The theorem is proved.

###### Remark

From the a priori estimates, we can clearly see that, with respect to the least-squares norm, the weighted version is optimal when the regularity is high and a suitable high-order finite element pair is used. For the $L^2$-LSFEM, the optimal interpolation order for the -norm of is , which is one order higher than that of the other two components; the estimate is therefore sub-optimal, and high-order approximations in the $L^2$-LSFEM are not suggested. However, the $L^2$-LSFEM can use the simplest linear conforming finite element space for , and it has reasonable approximation orders. For example, the -estimates of are of the same order as for the weighted-LSFEM if enough smoothness of the coefficient and the solution is assumed, see Theorems 4.3 and 5 and our numerical tests.
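When such orders are checked numerically under uniform refinement, the observed order between two consecutive meshes is usually computed as $\log(e_1/e_2)/\log(h_1/h_2)$. The helper below is a small sketch of that bookkeeping; the function name `observed_orders` is hypothetical, and the error values are synthetic data generated from exact power laws purely for illustration, not results from the paper.

```python
import numpy as np

def observed_orders(h, err):
    """Observed convergence orders log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    h = np.asarray(h, dtype=float)
    err = np.asarray(err, dtype=float)
    return np.log(err[:-1] / err[1:]) / np.log(h[:-1] / h[1:])

# Synthetic errors (NOT measured data): an O(h) and an O(h^2) sequence.
h = np.array([1 / 8, 1 / 16, 1 / 32, 1 / 64])
orders_first = observed_orders(h, 3.0 * h)        # behaves like an O(h) error
orders_second = observed_orders(h, 0.5 * h**2)    # behaves like an O(h^2) error
print(orders_first, orders_second)
```

Comparing the observed orders of the least-squares norm and of other norms of the error against the predicted ones is a simple way to see which components converge at the interpolation-optimal rate.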

###### Remark

In the a priori error estimates, in order to get a convergence order, we assume that the regularity of is at least or . To discuss the convergence without a high regularity, we first assume that the coefficient matrix is nice enough such that for any , there exists a , such that

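$$
\|\boldsymbol{\sigma}-\boldsymbol{\sigma}_\epsilon\|_0 \le \epsilon \quad\text{and}\quad \|f + A:\nabla\boldsymbol{\sigma}_\epsilon\|_0 \le \epsilon.
\tag{3.11}
$$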

Note that this condition is weaker than , since may not be in . If the assumption (3.11) is true, then a convergence result is easy to prove for only.

For the special case that , the restriction of the coefficient matrix on each , is a constant matrix, then contains a piecewise constant on each element , we have

$$
\|f + A:\nabla\boldsymbol{\tau}_h\|_{0,K} \le C\,h_K^{r}\,\|f\|_{r,K}, \quad \text{for } 0
$$

This can be used to establish the convergence for a solution with a low regularity.

For the standard LSFEM for the equation in divergence form, such problems do not exist since the Raviart-Thomas element is used to approximate , and the regularity requirement is on , which is weaker than the requirement on . Again, this is due to the special structure of the divergence equation, which is not available for the general non-divergence equation.

### 3.3 A posteriori error estimates

The least-squares functional can be used to define the following fully computable a posteriori local indicator and global error estimator.

#### 3.3.1 Weighted-LSFEM

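Let $(u,\boldsymbol{\sigma})$ be the solution of the least-squares variational problem (2.6), and let $(u_h,\boldsymbol{\sigma}_h) \in S_{k,0}\times S_{k-1}^d$ be the solution of the weighted-LSFEM problem (3.1). Define:

$$
\begin{aligned}
\eta_{h,K}^2 &:= h_K^2\,\|f + A:\nabla\boldsymbol{\sigma}_h\|_{0,K}^2 + \|\boldsymbol{\sigma}_h-\nabla u_h\|_{0,K}^2, \quad \forall K \in \mathcal{T}, \quad\text{and} \\
\eta_h^2 &:= \sum_{K\in\mathcal{T}} \eta_{h,K}^2 = \sum_{K\in\mathcal{T}} h_K^2\,\|f + A:\nabla\boldsymbol{\sigma}_h\|_{0,K}^2 + \|\boldsymbol{\sigma}_h-\nabla u_h\|_0^2.
\end{aligned}
$$
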
###### Theorem

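The a posteriori error estimator $\eta_h$ is exact with respect to the least-squares norm $|\!|\!|\cdot|\!|\!|_h$:

$$
\eta_h = |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_h \quad\text{and}\quad \eta_{h,K} = |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_{h,K}.
$$

The following local efficiency bound is also true with a constant $C$ independent of the mesh size $h$:

$$
C\,\eta_{h,K} \le h_K\,\|A:\nabla(\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)\|_{0,K} + \|\boldsymbol{\sigma}-\boldsymbol{\sigma}_h\|_{0,K} + \|\nabla(u-u_h)\|_{0,K}, \quad \forall K \in \mathcal{T}.
\tag{3.12}
$$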

###### Proof

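Using $f = -A:\nabla\boldsymbol{\sigma}$ and $\boldsymbol{\sigma} - \nabla u = 0$, we obtain

$$
\eta_h^2 = \sum_{K\in\mathcal{T}} h_K^2\,\|A:\nabla(\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)\|_{0,K}^2 + \|\boldsymbol{\sigma}_h-\boldsymbol{\sigma}-\nabla(u_h-u)\|_0^2 = |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_h^2.
$$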

The proof of the local exactness is identical.

The local efficiency (3.12) is a direct result of a local version of (3.8).
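The element indicators only require $f$ and the discrete pair, so they are cheap to evaluate. The sketch below computes the weighted indicators on a one-dimensional mesh for piecewise linear nodal data and then selects elements for refinement. Several things here are assumptions made for illustration: the one-dimensional setting with scalar coefficient $a(x)$, the two-point Gauss rule, the names `indicators` and `dorfler_mark`, the bulk (Dörfler) marking strategy with parameter `theta`, and the use of nodal interpolants of a manufactured solution in place of an actual LSFEM solution. The text above does not prescribe a particular marking strategy.

```python
import numpy as np

def indicators(x, uh, sh, a, f):
    """eta_{h,K}^2 = h_K^2 ||f + a*sh'||_K^2 + ||sh - uh'||_K^2 for P1 nodal
    values uh, sh on the 1D mesh x (two-point Gauss quadrature per element)."""
    gp = 0.5 + np.array([-0.5, 0.5]) / np.sqrt(3.0)
    eta2 = np.zeros(len(x) - 1)
    for e in range(len(x) - 1):
        h = x[e + 1] - x[e]
        du = (uh[e + 1] - uh[e]) / h           # uh' is constant on the element
        ds = (sh[e + 1] - sh[e]) / h           # sh' is constant on the element
        for xi in gp:
            xq = x[e] + h * xi
            s = (1.0 - xi) * sh[e] + xi * sh[e + 1]
            eta2[e] += 0.5 * h * (h**2 * (f(xq) + a(xq) * ds) ** 2 + (s - du) ** 2)
    return eta2

def dorfler_mark(eta2, theta=0.5):
    """Smallest set of elements whose indicators sum to at least theta * total
    (bulk marking; an assumed strategy, not taken from the paper)."""
    order = np.argsort(eta2)[::-1]
    cum = np.cumsum(eta2[order])
    m = int(np.searchsorted(cum, theta * cum[-1])) + 1
    return order[:m]

# Illustrative data: graded mesh and nodal interpolants of u = sin(pi x).
a = lambda x: 1.0 + x
f = lambda x: a(x) * np.pi**2 * np.sin(np.pi * x)
x = np.linspace(0.0, 1.0, 11) ** 2
uh = np.sin(np.pi * x)
sh = np.pi * np.cos(np.pi * x)

eta2 = indicators(x, uh, sh, a, f)
marked = dorfler_mark(eta2, theta=0.5)
print(np.sqrt(eta2.sum()), marked)
```

Because the identity $\eta_{h,K} = |\!|\!|(u-u_h,\boldsymbol{\sigma}-\boldsymbol{\sigma}_h)|\!|\!|_{h,K}$ relies only on $f = -A:\nabla\boldsymbol{\sigma}$ and $\boldsymbol{\sigma} = \nabla u$, the same routine measures the least-squares error of any conforming discrete pair, not just of the LSFEM solution.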

#### 3.3.2 L2-LSFEM

For the $L^2$-LSFEM, the a posteriori error estimator can be defined accordingly, and the corresponding results can be proved in a similar fashion.

Let $(u,\boldsymbol{\sigma})$ be the solution of the least-squares variational problem (2.6), and let $(u_h,\boldsymbol{\sigma}_h) \in S_{1,0}\times S_1^d$ be the solution of the $L^2$-LSFEM problem (3.3). Define:

 η20,K:=∥f+A