Molecular diffusion is a fundamental mode of mass transport. Within a stationary solution containing a dilute solute species of concentration , the classical model for diffusion was formulated by Fick, which postulates that the solute’s molar flux obeys
in which is the solute’s Fickian diffusivity in the solution. Maxwell1867 applied kinetic theory to derive Fick’s law for binary ideal-gas diffusion, showing for isothermal gases that further relates to a composition-independent constant material property. Stefan1871 extended Maxwell’s analysis to multicomponent gases, expressing the gradient of each species concentration in terms of a matrix of binary diffusivities. The resulting Stefan–Maxwell equations (also commonly called Maxwell–Stefan equations in the engineering and mathematical literature) have been verified experimentally for gas diffusion in studies such as Duncan1969 and 1975Carty.
Using his theory of irreversible thermodynamics, Onsager1931first; Onsager1931second; Onsager1945 provided a broader theoretical framework for mass transport that could also be applied to multicomponent diffusion in nonideal phases, including liquids and/or solids. hirschfelder1954molecular substantiated this more abstract analysis, manipulating thermodynamic laws and hydrodynamic equations to construct the diffusion driving forces for general nonisobaric, nonisothermal, multicomponent diffusion systems. Combined with Lightfoot, Cussler and Rettig’s observation that the Stefan–Maxwell diffusivities map invertibly into Onsager’s transport matrix, and can therefore be used for condensed phases as well as gases (Lightfoot1962), this extended the Stefan–Maxwell theory to all molecular diffusion processes. Newman1965 brought the generalization further, accounting for materials containing charged solutes, thereby completing the development of the contemporary Stefan–Maxwell equations. Modern expositions of the theory can be found in KRISHNA1997861; Krishna1979 and DATTA20105976.
Given a bounded Lipschitz domain , the Stefan–Maxwell equations describing the diffusion of the species that constitute a common phase at a given absolute temperature are given by
for , in which is the ideal gas constant. The terms and denote the concentration and velocity of species respectively, related to the molar flux of the species by . For species , represents the Stefan–Maxwell diffusivity of species through species ; these material parameters are symmetric in the species indices, , and coefficients are not defined. The term in equation (1.2) denotes the total concentration, defined as
and the terms represent the diffusion driving forces, which generally depend on the species concentrations, temperature and pressure. In the case of isothermal, isobaric ideal-gas diffusion considered here, . Furthermore, an ideal gas satisfies the equation of state
in which is the pressure. Hence in the isothermal, isobaric setting, is a constant. To pose the Stefan–Maxwell convection-diffusion problem, flux constitutive laws (1.2) and the equation of state (1.4) are coupled to the continuity equations
where is a specified volumetric reaction rate, which quantifies the generation or depletion of species by homogeneous chemical reactions. Under the ideal-gas assumption considered in this paper, we are interested in solving (1.2) (with ) and (1.5) for the species concentrations and their respective velocities .
The Stefan–Maxwell equations have found a diverse range of applications in areas such as biology, electrochemistry, and plasma physics. For specific examples, we refer the reader to the studies by Boudin2010; Bruce1988; ABDULLAH20075821; newman2012electrochemical; LIU2014447 and Maxwell--Stefan-plasma. Because they account for solute/solute interactions as well as solute/solvent interactions, Stefan–Maxwell models can exhibit fundamentally different behaviour from Fickian models. For example, in Duncan1969 ‘uphill diffusion’ is observed, wherein the directions of a species’ molar flux and its concentration gradient coincide, in contradiction to (1.1). In electrochemistry, the Stefan–Maxwell formalism justifies surprising observations like negative transference, where the flow of an electric current with one sign carries ions of opposing sign along with it (Monroe2013).
Under restrictive assumptions, a multicomponent extension of Fick’s law known as dilute solution theory can be recovered. For a dilute set of species in the presence of a solvent in far greater proportions, , one can formally neglect the terms in (1.2) whenever both and take , allowing rearrangement to express the molar fluxes as
For dilute solutes the driving forces often take the form , where is known as a Darken factor (Darken48), in which case one can identify as the Fickian diffusivity of species in the solution. Writing equations (1.6) for all solutes and replacing with the barycentric velocity produces the dilute solution theory. We direct the reader to newman2012electrochemical for further details. When the solute driving forces within dilute solution theory are written in terms of both concentration gradients and the electric field, (1.6) is referred to as a Nernst–Planck relationship, based on the work by Nernst1888 and Planck1890. Nernst–Planck equations have been extensively studied in the mathematical literature, sometimes coupled with Poisson’s equation to account for the distribution of the electric potential, and a Navier–Stokes equation or a Darcy flow to compute the velocity . We refer the reader to 2009Schmuck; EUDNPPequation; Jinchao2015 and Jinchao2018 and the references therein for the existing mathematical literature on the Nernst–Planck equation. In many cases dilute solution theory is not appropriate, and the full Stefan–Maxwell equations must be considered. A comparison of Fickian to Stefan–Maxwell diffusion profiles for gases can be found in KRISHNA1997861 and Boudin2010. Examples of the limitations of dilute-solution models are discussed in the context of lung modelling, earth science and electrolyte transport in the studies by CHANG1975109; Arthur1990 and MonroeNernstPlanck2016 respectively.
1.1 Physical structure and consequences
The mathematical structure of irreversible thermodynamics will prove useful below for devising discretizations and error estimates for the Stefan–Maxwell diffusion problem. We therefore summarize some key points of the theory. We begin with the transport equations postulated by Onsager1945 for isotropic materials,
in which the statistical reciprocal relations developed in Onsager1931first; Onsager1931second require the transport matrix to be real symmetric.
The Onsager transport equations (1.7) were developed independently of the Stefan–Maxwell theory (1.2). It was subsequently realized by Lightfoot1962 that the Stefan–Maxwell equations could be understood in terms of Onsager’s transport matrix by identifying
as the entries of M.
Time evolution of nonequilibrium states leads to local entropy production, denoted by . For an isothermal, isobaric system with a given collection of species velocities experiencing nonequilibrium diffusion driving forces, the balances of material, momentum, and heat are manipulated in hirschfelder1954molecular and Monroe2009 to write the entropy production of isothermal diffusion as
in which the are identified as functions of the gradients of temperature, pressure, and by grouping terms in the Gibbs–Duhem equation from equilibrium thermodynamics (deGrootMazur; hirschfelder1954molecular; Goyal01012017). In general there may be other terms in (1.9) such as viscous dissipation or reaction entropy, but these will be neglected here.
The second law of thermodynamics further demands that the energy dissipation is non-negative, , with equality only in an equilibrium state, which is defined by the condition that for all . Thermodynamic stability therefore requires further that M be positive semidefinite.
Some additional structure of the transport matrix is specific to multicomponent mass diffusion. Importantly, the theory must guarantee that diffusional motion, driven by thermodynamic property gradients, remains distinct from species convection, a non-dissipative process driven by bulk flow. This distinction is made by requiring that (1.7
) be invariant to a shift of every species velocity by a vector field, i.e. the equation remains unchanged when each in (1.7) is replaced by . The essential physical distinction between diffusion and convection consequently requires that
as noted by Onsager1945 and Helfand1960. Hence M. Invariance with respect to the convective velocity is naturally embedded in the Stefan–Maxwell form (1.2), because .
The symmetry of M suggested by Onsager1945 requires that , a fact that has also been demonstrated directly for Stefan–Maxwell diffusion by fluctuation theory in Monroe2015. The symmetry of the transport matrix combined with the nullspace (1.10) allows recovery of the full Gibbs–Duhem equation, namely
In the context of transport theory, equation (1.11) can be seen as a statement of Newton’s third law of motion, that action equals reaction. In thermodynamics this is necessary to be consistent with the first law of thermodynamics and the extensivity of the Gibbs free energy.
Reasoning physically that all diffusion processes are necessarily dissipative, Onsager1945 makes the stronger assumption that M has exactly one null eigenvalue. Taken together, the physical arguments require that M is symmetric positive semidefinite, and that its eigenvalues, , may be ordered as
a spectral structure that will be used throughout this paper.
At positive concentrations, energy dissipation occurs whenever there is relative species motion, implying that the equality in (1.13) occurs if and only if .
One must take care to note that M may afford additional nullspaces beyond (1.10) if any concentration vanishes. Consequently, in order to phrase the Stefan–Maxwell equations in terms of Onsager’s transport laws (1.7) with a transport matrix M that possesses the spectral structure (1.12), it will be necessary to assume that almost everywhere for each . We make this assumption henceforth.
Because the present discussion is limited to ideal-gas mixtures, it can be assumed that the Stefan–Maxwell diffusion coefficients are given constants, which places even stronger restrictions on their values. Whenever the concentrations satisfy for each and any positive constant , then for a positive constant which depends only on , a fact that will be used throughout the paper. From the calculation (1.13), it follows that a necessary and sufficient condition for (1.12) to be true for all positive concentrations is that each is strictly positive (Krishna1979). It must be stressed, however, that the Stefan–Maxwell diffusion coefficients in many physical systems depend strongly on the concentrations of the species, in which case negative Stefan–Maxwell diffusion coefficients are not only possible, but are observed and of practical interest (Gerrit1993; NegativeMScoefficients). Therefore in order to present a general framework for multispecies diffusion, the results in this paper only use the spectral structure (1.12), not the positivity of the Stefan–Maxwell diffusion coefficients.
In systems with more than one spatial dimension, the existence of the nullspace (1.10) means that the problem (1.2), (1.3), (1.5) will not be well-posed unless a choice of convective velocity is made (see Remark 3 below for the one-dimensional case). This can be done by specifying that the mass-flux must equal given data :
where is the molar mass of species . In general, the mass-flux must also be solved for via the Cauchy momentum equation, which in the absence of a pressure gradient or an external force field, can be written in conservation form as
where the density is defined as
denotes the deformation stress tensor appropriate for the medium. We refer to the problem of solving (1.2), (1.5), (1.14) and (1.15) as the Stefan–Maxwell convection-diffusion problem. In this work we assume that is given and focus on the solution of (1.2), (1.5) and (1.14) under an additional steady-state assumption, which we call the steady-state Stefan–Maxwell diffusion problem.
1.2 Premise and main results
The central idea of this manuscript is to incorporate the constraint (1.14) by augmenting (1.2), in a manner inspired by the augmented Lagrangian approach (Bochev2006; Fortin1983). Given , for each we multiply both sides of (1.14) by and add the resulting term to the equation of (1.2) to deduce that
for , where is the augmented transport matrix
Our particular choice of the entries of allows us to compute
to show that the augmented transport matrix is symmetric positive definite. The positive-definiteness achieved by this augmentation will cause the associated bilinear forms in the variational formulation to follow to be coercive, greatly facilitating the analysis.
The paper is organized as follows. Section provides an overview of the existing numerical literature on the Stefan–Maxwell equations and contrasts our approach with previous efforts. In section we derive a suitable weak formulation for the problem and prove well-posedness of a linearized system of (1.2)-(1.5) in section 4. In section we show stability of a discretization of this linearized system and prove error estimates for the linearization. Finally, in section we verify our error estimates with a manufactured solution and illustrate our method by simulating the interdiffusion of oxygen, carbon dioxide, water vapour and nitrogen in the lungs.
2 Existing numerical literature
Despite their wide applicability, the Stefan–Maxwell equations have received relatively little attention from numerical analysts. In nearly all existing work, the equations are formulated in terms of the molar flux . The interdependence among the collection of driving forces implied by Gibbs–Duhem relation (1.11) allows the equation for to be discarded. The mass-flux constraint (1.14) is then used to eliminate the species velocity from the system. Following this process, a non-singular matrix A is derived which satisfies
One can then proceed to solve for the molar fluxes in terms of the driving forces by inverting A. If, for example, we have , the inverted, truncated flux laws can be substituted into the continuity equations (1.5) for species to yield
Thus one obtains evolution equations for the concentrations, having eliminated the molar fluxes completely. Papers which take this approach and analyse the resulting equations to determine some existence and uniqueness properties include 2012Boudin; Bothe2010; Jungel2012 and Jungel2019. 2012Boudin and Jungel2019 also analyse numerical schemes along these lines. It is worth remarking that the matrix is not positive symmetric definite, although, at least in certain circumstances, one can define ‘entropy variables’ so that the resulting system is symmetric positive definite, as carried out by Jungel2019.
The approach of McLeod2014 does not eliminate molar fluxes, but rather solves for them in a mixed saddle point formulation. They then prove well-posedness of a linearized system consisting of three species, under some constraints on the Stefan–Maxwell diffusion coefficients. A discretization using mixed finite elements is then presented and error bounds on the linearized system are obtained. Our paper is similar in scope, but with several key differences and extensions.
First, our approach does not need any rearrangement of (1.14) to eliminate one species, but rather incorporates the constraint via the augmented formulation (1.17). The choice of species to eliminate is somewhat arbitrary, and with the augmentation is no longer necessary. Augmentation also exploits the symmetric positive semidefinite structure of the transport matrix and preserves permutational symmetry of the system. This will be particularly pertinent for anticipated future work where we intend to have more complex driving forces of the form
where is the electrochemical potential of species and is the pressure. These more complex driving forces render rearrangement increasingly intractable.
Second, the symmetric positive definite structure of the augmented transport matrix yields straightforward proofs of the coercivity of bilinear forms on appropriate function spaces. As a consequence, we will prove that the linearized system is well-posed in the continuous and discrete setting and derive error bounds for its discretization in the general case of species. The methodology presented in this paper also encompasses the case where individual Stefan–Maxwell diffusion coefficients may be negative.
Finally, we are able to design the discrete formulation in a structure-preserving way so that the Gibbs–Duhem equation (1.11) is satisfied up to machine precision, independent of mesh size. Previous works instead assume the Gibbs–Duhem equation and use it to infer the concentration of the species in a postprocessing step.
3 Problem formulation
We proceed to cast the problem into variational form. Note that both sides of equation (1.17) are proportional to and hence without loss of generality we assume that . Our idealized assumption on the driving forces then becomes
In this case the Gibbs–Duhem equation (1.11) reduces to
i.e. that total concentration is constant. This is also important as the constancy of is required to be consistent with the equation of state (1.4), which is distinct from the Gibbs–Duhem equation. We assume that and consider the boundary conditions
where n is the outward facing unit normal vector and partition . The equalities in (3.3)-(3.4) are to be understood in the sense of traces (evans2010partial). It is necessary to assume that is positive for each to avoid M acquiring another nullspace at the boundary. Either one of and may be empty. This boundary data is assumed to satisfy
where is a constant that we will show is equal to the total concentration (1.3). These assumptions are necessary to be consistent with the Gibbs–Duhem equation (1.11) and the mass-flux constraint (1.14). Under the steady-state assumption, the species continuity equations (1.5) become
Therefore, we demand that the reaction rates, , satisfy
We define the function space
and the affine function space
We can now derive the weak formulation. We test (1.17) with and integrate over to derive for all ,
for all .
For a given we multiply both sides of (3.7) by and integrate by parts to yield that for all ,
However we can use the nullspace (1.10) and symmetry of M to deduce
and by the definition of the density (1.16), we obtain that
for all . Considering the first and second terms with the choice for some , and using (3.12),
the final equality following from integration by parts. In light of this, (3.15) becomes
a variational statement of the mass-flux constraint (1.14).
With pure Neumann boundary data (), the system (3.11)-(3.12) is not well posed. Observe that if and solve equations (3.11) and (3.12) then so do the variables and for any . In order to make the problem well posed it is necessary to impose auxiliary conditions such as
for known constants . The physical interpretation of this constraint is clear. In the transient dynamics we have the continuity equations
Integrating over and using the divergence theorem we deduce that
For a steady-state solution to exist, it is necessary that the right hand side of this equation is . Therefore, for all time ,
Hence the integral in (3.20) is independent of time and therefore is completely specified by the initial conditions.
4 Linearization and well-posedness
We consider a linearization of Picard type. The general approach is that whenever a velocity is multiplied by a concentration, we replace the concentration with our current guess. The exception to this is explained in Remark 4. Let us define the function spaces , , as well as the affine function space . We set the norm on as . Throughout the rest of this paper we will frequently use the notation to denote an -tuple in one of these function/affine function spaces as well as their discrete subspaces.
Given a previous guess for the concentration , we define a bilinear form given by
for . Here denotes the augmented transport matrix, the entries being defined by using the current guess for the concentration in equations (1.8) and (1.18). Similarly, is the density evaluated using in (1.16).
For the current guess we also define the bilinear form ,
for , and the bilinear form ,
For the linear functional is defined as
The non-linear iteration scheme is as follows. We take an initial guess which satisfies the Dirichlet boundary data (3.3) and
almost everywhere for a given constant , determined by either (3.6) or (3.20). For the next iterate of the sequence is computed as the solution to the following generalized saddle point problem: find such that
subject to the Dirichlet conditions (3.4). This is repeated until
for a set tolerance .
Note that is a weak solution to the non-linear problem (3.11)-(3.12) if and only if it is a fixed point of this iteration scheme. Indeed if is a weak solution to the non-linear problem (3.11)-(3.12) then the solution to the equations (4.6)-(4.7) remains . Conversely if then, converting (4.6)-(4.7) to a non-linear system by replacing with , we recover the non-linear problem (3.11)-(3.12) and observe it is solved with .
We proceed to prove well-posedness of the linear system (4.6)-(4.7) by applying either Theorem in GeneralSaddlepoint or Theorem 3.1 in Generalisedsaddleerror1982. To invoke these theorems we shall prove the following conditions.
Condition 1: There exists a constant such that
for all .
Condition 2: There exist constants , such that for all ,
An alternative to our definition of the linear functional (4.4) would be to replace with and therefore include the term as part of the bilinear functional instead. However, the current formulation (4.6)-(4.7) ensures that we can derive the equivalent of (3.15) for the linearized system
Then, following an argument identical to that presented in section 3, we deduce that for each , the iterates satisfy
almost everywhere. When combined with the assumption that the concentrations are positive almost everywhere, this implies that are all bounded bilinear functionals on their respective function spaces.
The common alternative, to formulate the problem in terms of molar fluxes rather than velocities, has the advantage that the continuity equations do not need to be linearized. However, a disadvantage is that the resulting bilinear form is no longer symmetric or coercive, which would add significant difficulty to the analysis.
In order to prove (4.9) it will be useful to write the bilinear form, as the integral of a quadratic form. For this purpose it is useful to define the matrix
where I is the identity matrix and is the Kronecker product. We can then write the bilinear form as
To show the coercivity condition (4.9) we must show for some
Hence (4.9) is satisfied if and only if is uniformly positive definite over almost everywhere. Either by direct calculation, or by using a standard property of the Kronecker delta product, one can verify that will have the same eigenvalues as , each with geometric multiplicity of . Therefore coercivity of the bilinear form is equivalent to showing that is symmetric positive definite almost everywhere in .
Assuming that every component of our current guess is strictly positive almost everywhere, we prove positive definiteness of in the following lemma. If a.e. for each and a positive constant , then for any , the matrix is symmetric positive definite almost everywhere.
For almost every , is symmetric positive semidefinite. We proceed with the following argument pointwise. The normalized eigenvectors form an orthonormal basis. By hypothesis the associated eigenvalues can be ordered such that
The nullspace of then consists of the space spanned by the vector . Furthermore,
for a that depends only on .
Given any we can expand it in terms of the basis as
for basis coefficients . Furthermore, by orthonormality,
The matrix defined in (1.19) is also symmetric positive semidefinite, explicitly for
Hence we can also construct a basis of orthonormal eigenvectors. The vector is also an eigenvector of with the eigenvalue . We will identify this eigenvector as . is of rank as it is the outer product of a vector with itself, and hence all other eigenvalues are zero.
Hence for a given we can expand it as
for basis coefficients and calculate
and therefore is positive definite at . This argument can be repeated for every except perhaps on a set of measure zero. Therefore is symmetric positive definite almost everywhere. ∎
It is useful to understand how scales with . This can be achieved by the following scaling argument. Suppose that whenever for each we have the lower bound on the eigenvalues, as in (4.17), of . Now suppose that for any we have for each . We can then define the new variables . We then see that for each . Define the as the transport matrix with these new variables replacing . By direct calculation we can check that
By construction we have that . It follows from (4.24) that . Hence we see that .
From Lemma (4) we have that
For conditions (4.10), given a , we can choose which then yields
Similarly for we have
The final step is that we use either or the condition (3.20) to deduce a Poincaré inequality of the form
for some constant depending only on . Hence