Low-rank approximation for multiscale PDEs

Historically, analysis for multiscale PDEs is largely unified while numerical schemes tend to be equation-specific. In this paper, we propose a unified framework for computing multiscale problems through random sampling. This is achieved by incorporating randomized SVD solvers and manifold learning techniques to numerically reconstruct the low-rank features of multiscale PDEs. We use multiscale radiative transfer equation and elliptic equation with rough media to showcase the application of this framework.



There are no comments yet.


page 1

page 2

page 3

page 4


Manifold Learning and Nonlinear Homogenization

We describe an efficient domain decomposition-based framework for nonlin...

Generalized Rough Polyharmonic Splines for Multiscale PDEs with Rough Coefficients

In this paper, we demonstrate the construction of generalized Rough Poly...

Schwarz iteration method for elliptic equation with rough media based on random sampling

We propose a computationally efficient Schwarz method for elliptic equat...

A reduced order Schwarz method for nonlinear multiscale elliptic equations based on two-layer neural networks

Neural networks are powerful tools for approximating high dimensional da...

A data-driven approach for multiscale elliptic PDEs with random coefficients based on intrinsic dimension reduction

We propose a data-driven approach to solve multiscale elliptic PDEs with...

Learning Green's functions associated with parabolic partial differential equations

Given input-output pairs from a parabolic partial differential equation ...

Rough McKean-Vlasov dynamics for robust ensemble Kalman filtering

Motivated by the challenge of incorporating data into misspecified and m...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Multiscale phenomena are ubiquitous, with applications in many physical sciences and engineering fields: aerospace, material sciences, geological structure analysis, and many other area. The different scales often have different physics, which entangle to produce complicated nonlinearities. Partial differential equations (PDEs) are often used to model these problems, with the different scales captured in the coefficients and functions that define the PDE. These PDE models with are challenging to compute directly, thus analysis and algorithms specifically designed for such multiscale problems have been developed and investigated. Following convention, we focus in this review on problems with two distinct scales, with a parameter

capturing the ratio between the small and large scale.

Though modern multiscale analysis dates back to asymptotic PDE analysis that was already seen in Hilbert expansion or Poincaré expansion early last century (see review in [MR2382139]), the thrive for multiscale computing is largely due to the early efforts in the US Department of Energy (DOE) National Labs within the ASCI (Advanced Strategic Computing Initiative) [Ho:2009multiscale]. Multiscale analysis and methods have been popular since then. During this period, analysis and computation in multiscale PDEs have taken different paths. Analysis of multiscale PDEs follows a single “universal” strategy, as was passed down from the tradition. The equation is decomposed into several levels according to asymptotic expansions involving the scale parameters, with the subequation at each level representing physics at a single scale. Subequations at the finer level feed data to those at the coarser level. This analytical machinery has been used to treat multiscale PDEs arising from such varied backgrounds as kinetic theory, semi-classical quantum systems, and homogenization of composite materials, among others [MR2382139, MR2830582].

On the computational side, strategies for handling multiscale PDEs are more varied. Problems are usually handled by specifically designed solvers. One class of solvers, called asymptotic-preserving schemes, are designed to preserve asymptotic limits of kinetic equations. These schemes usually contain some component of macro-solvers and micro-solvers, integrated in a clever way to reveal different structures in different regimes. Another class of solvers, numerical homogenization methods, usually target elliptic and parabolic equations in which the coefficients that represent media have oscillatory elements. These methods usually consist of offline and online stages, with either the homogenized media or the representative basis functions prepared in the offline stage.

Why are most numerical schemes for multiscale PDEs equation-specific despite the analytical tools being largely unified? This intriguing question has motivated our investigations into devising a universal numerical strategy for solving multiscale PDEs. While the approach is yet to be developed fully, we believe that our progress on this issue is of wide interest, and this article surveys our progress to date. Crucially, our approach exploits the low-rank structure present in discretizations of multiscale PDEs.

To demonstrate the fundamental idea, we consider the following problem:


where is a linear partial differential operator that depends explicitly on the small parameter , while represents the boundary condition or the source term, which is assumed to have no dependence on . Multiscale problems that can be formulated in this way include elliptic equations with highly oscillating media and the neutron transport equation with small Knudsen number. Due to the -dependence of the operator, the solution inherits structures at both fine and coarse scales.

An asymptotic limit is revealed by multiscale analysis using asymptotic expansions as . In this limit, the oscillation at the fine scale is fast and the detailed oscillation pattern no longer matters — only macroscopic quantities are relevant. Formally, writing the homogenization limit as


we have


The norm of the approximation error depends heavily on the particular equation at hand.

The numerical challenge in solving (1

) is that many degrees of freedom may be needed. Naïve finite element or finite difference methods would require mesh size

to resolve fine-scale structure of the solution at the level. For a problem on , the discretized system will have degrees of freedom, leading to prohibitive computational and memory cost for small . From an application perspective, it often suffices to characterize the solutions on the macroscopic level. At this level, oscillations at the scale are largely absent. This property raises the question of whether we can obtain an approximate solution of this type using only degrees of freedom. If we know how to derive (2), we can simply solve for , which has the required macroscopic properties, and typically requires a distribution with degrees of freedom. Often, though, the limiting equation (2) and its solution are difficult to find explicitly, even when it is possible to establish their existence. These difficulties have led researchers to see problem-specific solutions.

We believe that a universal approach for finding the large-scale solution can be devised, and that exploitation of the low-rank structure of the solution space is the key to doing so. The low-rank property is shared by all multiscale problems that have asymptotic limits, as we depict in Figure 1. The Green’s matrix (the discrete version of the list of Green’s functions on fine grids) requires dimensions to represent the underlying Green’s function accurately. However, as the argument above already suggests, this system can be well-represented by the dimensions of

, the limiting Green’s matrix. These points suggest that the numerical solver can be “compressed” as well. In the language of numerical linear algebra, this transition amounts to performing truncated singular value decomposition (SVD) of

to obtain .

Figure 1: Low rank of multiscale PDE systems and their discretizations with small parameters.

If we obtain the truncated SVD of the matrix by starting with a full SVD, the resulting algorithms would be impractical because of the large dimension of the matrix and the expense of preparing and storing the full matrix

and of computing its SVD. Several new linear algebra solvers take a quite different approach. Instead of accessing the full matrix, these new solvers merely require matrix-vector products to be computed, involving this matrix and several randomly selected vectors (typically vectors with Gaussian i.i.d. entries). Translated to PDE solver setting, these matrix-vector multiplications amount to computing numerical solutions to the PDEs with some random source terms, a task that may be practical if the number of such operations required is modest. The randomized SVD (rSVD) approach is one method of this type. It is equipped with a thorough analysis and achieves optimality in terms of computational efficiency. We make use of this method in the techniques described in the remainder of this article.

The main theme of our article, then, is the use of randomized SVD solvers to exploit the low-rank features of multiscale PDEs. We will describe two strategies both of which are divided into “offline” and “online” stages. The offline stage sees the preparation of either the solution space or the boundary-to-boundary map used in the domain decomposition, while the online stage singles out the specific solution for the given source . The two strategies are described in Section 4.1 and 4.2, respectively. In Section 5, we present the nonlinear extension utilizing manifold learning algorithms for reconstructing the low-rank features of the solution manifold. Prior to these discussions, we describe in Section 2

two algorithm classes — asymptotic preserving and numerical homogenization — for identifying the asymptotic limits of multiscale problems. As examples, we use the multiscale radiative transfer equation (RTE) and the elliptic equation with rough media. Section 

3 explores the two main elements of our approaches: the numerical low-rank feature of multiscale PDEs and the randomized SVD solver for efficient reconstruction of low-rank operator/spaces. We conclude with a discussion of future work in Section 6.

2 Examples

Kinetic equations and elliptic equations with oscillating media are two examples of multiscale PDEs, for which computational scemes were developed separately. The specific features of these problems were incorporated into the design of asymptotic preserving schemes and numerical homogenization methods, respectively. We review these techniques and highlight the shared low-rank property of these two problems.

2.1 Kinetic equations and asymptotic preserving methods

Kinetic equations, which originate from statistical mechanics, describe the evolution of probability density for identical particles in phase space. A model equation, the radiative transfer equation (RTE), characterizes the evolution of photon density. In the steady state, this equation reads


where is the light source, and the linear collision operator describes the interaction of photons with the optical media. The small parameter is encoded in this operator.

The operator defines several distinct regimes. In the optically thick regime, it is defined by


In this case, is the scattering coefficients that describes the possibility of a photon located at changing its velocity from to , and the parameter is called the Knudsen number, standing for the ratio of the mean free path to the typical domain length. When the medium is optically thick, the mean free path is small, with . This means the photon particles are scattered fairly often, and the system statistically achieves the equilibrium state, which can itself be characterized mathematically. By performing asymptotic expansion in terms of , the inhomogeneity in the velocity domain vanishes, and one can show asymptotically approximates , a function without dependence on that solves a diffusion equation. We have the following result from [MR2839402].

Theorem 1.

Suppose that solves (4) with collision term being isotropic, that is for some function . Let be bounded with smooth boundary, and . Assume that the boundary condition is




where solves


with the boundary condition

where solves a proper boundary layer equation and can be obtained from .

This result indicates that the homogenized operator as is . Similar results, when fails to have the form of in the anisotropic optical media, are still available, but the explicit form of is no longer available.

A second regime of interest for is one in which the media is highly heterogeneous [MR1760042]:


In this case, the photons go through the media that oscillates at a small scale: For example, sunlight passing through heavy cloud with a large number of small droplets or laser beam passing through crystals. If only macroscopic quantities are observed, the far pattern of frequent scattering is unobservable, and the media can be approximated by an effective media that is homogeneous. By adapting a theorem from [MR1760042], we have the following.

Theorem 2.

Let the conditions from Theorem 1 hold, and suppose that the collision term is defined in (9). Then


where solves


where for some .

In special cases, such as under periodic or random ergodic conditions, the function can be computed explicitly. (There are also works that investigate the asymptotic limit of the RTE when the system is both highly oscillatory and in diffusion regime; see [MR1878799].)

In both limiting regimes, the limiting equations (8) and (11) can be solved much more efficiently than the original equation (4). The discretization of (4) is constrained strongly by , due either to stability (as in (5)) or accuracy (as in (9)). By contrast the solution varies smoothly, containing no -scale effects, so can be obtained accurately by applying a discretization with mesh width to the asymptotic limiting equation. If the latter equation is available, computation of by this means is the recommended methodology.

Methods for kinetic equations are termed “asymptotic preserving” (AP) if they can relax the requirement that the discretization width satisfies

yet still capture the asymptotic limits. Many different AP approaches have been proposed. For linear equations, existing AP methods rely on even-odd or micro-macro decomposition. For nonlinear equations, knowledge of the specific forms of the limits is usually required, and this knowledge is built into the solvers 

[MR3645390]. As mentioned above, these specific forms are often not available, so many AP methods cannot be applied to a large set of multiscale kinetic equations. This observation begs the question: Knowing the existence of the limit, but not its particular form, can we still devise efficient methods for solving kinetic equations?

2.2 Elliptic equations and numerical homogenization

Another class of multiscale equations that has been investigated deeply is elliptic equations with highly oscillatory coefficient. Once such formulation is


where is the scale on which the media oscillates. (The source term has no small scale contribution.)

The equation is a model problem from petroleum engineering where it is crucial to precompute the underground flow before expensive construction of infrastructure takes place [MR2801210]. The problem is typically solved on kilometer-scale domains, but the heterogeneities in the media can scale at centimeters. Certain forms of this equation can be approximated effectively by an equation that can be solved efficiently. Suppose the media coefficient has the form , that is, it varies on two scales ( and ), and moreover is periodic with respect to the fast variable (the second argument in ). Then in the limiting regime as , the solution converges to that of a homogenized equation, with the media “smoothed-out,” as described in the following result [MR1185639].

Theorem 3.

Let solve (12) in the domain with zero boundary condition. Suppose is periodic with respect to the second argument. Then


where solves the following effective equation with zero boundary condition:


where , the effective media, can be computed from a cell problem (See Definition 2.1 in [MR1185639]).

As in the previous section, when a limiting equation can be derived explicitly, the best course for obtaining a useful solution it to solve this equation directly, as the mesh width in the discretization scheme can be much larger than . See [MR2477579] for a discussion of a reduced number of basis functions and [MR2830582] for computation of the effective media.

However, the validity and the specific form of the effective limit are known only in special cases like the one described in Theorem 3. In other cases, we seek a solver that relies on as little analytical knowledge as possible. An approach known as numerical homogenization has been investigated extensively. This approach is founded on two principles: a discretization scheme independent of , and a numerical solution scheme that captures the true limiting behavior of the solution on the discrete level. Variants of numerical homogenization include application of the -matrix, a purely algebraic technique [MR3445676]; a Bayesian approach that views the source , and hence the solution

, as Gaussian fields, which further translates to game theory

[MR3971243]. All these methods are successful, but they all implicitly rely on properties of the underlying elliptic equation. Can we devise an approach that applies to general problems with oscillatory media that exploits the low-rank property in the solution space, without using analytical structure explicitly?

3 A unified framework for multiscale PDEs based on random sampling

We have given several examples of multiscale models that arise in applications, and mentioned several algorithmic approaches that make use of the limiting equations, when available. We describe next the foundations of a unified scheme that captures asymptotic limiting behavior automatically, even when the asymptotic limits are unavailable. Our method exploits low-rank structure and uses random sampling to discover this structure. We describe the low-rank property in Section 3.1 and the randomized SVD method for revealing this structure in Section 3.2.

3.1 Numerical rank

We consider a bounded linear operator , which maps to a space , that is

In the PDE setting, is the solution operator that maps the boundary conditions and/or source term to the solution . The numerical rank of such an operator is defined as follows.

Definition 1 (Numerical rank).

The numerical -rank of is the rank of the lowest-rank operator within the -neighborhood of , that is,

In other words, is this smallest dimension of the range among all the operators within distance of .

When is the PDE solution map, then with low rank is also a linear map with a finite dimensionality. It can be viewed as the discrete version (or a matrix of size ) that approximates within accuracy. The definition suggests that if can be found, it is optimal in the sense of numerical efficiency. The concept is rather similar to the Kolmogorov -width, defined as follows.

Definition 2 (Kolmogorov -width).

Given the linear operator , the Kolmogorov -width is the shortest distance from its range to all -dimensional space, that is,


Indeed, the Kolmogorov -width and numerical rank are related by the following result [MR4155236].

Proposition 1.

For any linear operator , we have the following.

  1. If the numerical -rank is , then .

  2. If , then the numerical -rank is .

For the three examples presented in Section 2, the numerical ranks can be calculated from their limiting equations. For one-dimensional RTE in the diffusion regime, if we denote and the solution operator of (4) and (8), respectively, then noting that can be approximated using grid points to achieve accuracy, when , the numerical rank is naturally . Without employing the knowledge of the existence of the limit, however, a brute-force discretization naturally requires degrees of freedom: for the upwind discretization in and for the discretization in , where depends on the particular numerical integral accuracy. Translating into Green’s-matrix language, this observation means that is represented by degrees of freedom but its range can be captured by a compressed Green’s matrix with just column vectors.

The same argument applies to the elliptic equation on a two-dimension domain with high oscillations. When second-order linear finite elements are used, with no knowledge of the limiting system, degrees of freedom are required, dropping to when the the limiting system is known. In other words, the full Green’s matrix requiring degrees of freedom can be well-represented using just column vectors.

In all these cases, the degrees of freedom for a given numerical method are substantially larger than the numerical rank of the problem. Thus, much of the information in these full-blown representations is redundant and compressible. A low rank representation exists and yields a much more economical representation.

3.2 Random sampling in numerical linear algebra

Knowing the existence of the low rank structure and finding such a structure are very different goals. The Kolmogorov -width is a concept developed in but has made little impact in numerical PDEs for a simple reason: Traditional PDE solvers require a predetermined set of basis functions, while the Kolmogorov -width looks for “optimal” basis functions. How can an optimal basis be found without first forming the full basis? Translated to linear algebra, this question is about finding the dominant singular vectors in a matrix without forming the whole matrix. Specifically, if is known to be approximately low rank, meaning that there exists , a matrix with orthonormal columns with and

can we find without forming the full matrix ?

In linear algebra, it is well-known that is simply the collection of the first singular vectors of . Writing


where and contain the left/right singular vectors and

contains the singular values, then

is the first columns in .

The standard method for computing the SVD requires to be stored and computed with. But the celebrated randomized SVD (rSVD) method [MR2806637] captures the range of a given matrix by means of random sampling of its column space, which requires only computation of matrix-vector products involving and random vectors — operations that can be performed without full storage or knowledge of . Implementation of rSVD is easy and its performance is robust.

The idea behind the algorithm is simple. If matrix has approximate low rank , the matrix maps an -dimensional sphere to an -dimensional ellipsoid that is “thin:” of its axes are significantly larger than the rest. With high probability, vectors that are randomly sampled on the -dimensional sphere are mapped to vectors that lie mostly in a -dimensional subspace of — the range of . An approximation to can be obtained by projecting onto this subspace.

A precise statement of the performance of randomized SVD is as follows [MR2806637].

Theorem 4.

Let be an matrix. Define


where is a matrix of size

with its entries randomly drawn from an i.i.d. normal distribution, where

is an oversampling parameter. If , then the projection of onto the space spanned by , defined by

yields that with high probability, and

The result reconstructs the range of in a nearly optimal way. It is optimal in efficiency because to capture a rank- matrix, only matrix-vector products involving are required for the calculation of , and the oversampling parameter is typically quite modest. ( is a typical value.) The result is nearly optimal in accuracy as well. The error bound relies only on , which is expected to be smaller than . The decay profile of singular values do not affect the approximation accuracy.

If a low-rank approximation to the matrix is required, and not just an approximation of its range, another step involving multiplications with its transpose is needed. The full method is shown in Algorithm 1.

1:Given an matrix , target rank and oversampling parameter ;
2:Set ;
3:Stage A:
4:     Generate an Gaussian test matrix ;
5:     Form ;

     Perform the QR-decomposition of

7:Stage B:
8:     Form ;
9:     Compute the SVD of the matrix ;
10:     Set ;
11:Return: .
Algorithm 1 Randomized SVD

4 Random sampling for multiscale computation

Here we describe how rSVD can be incorporated into multiscale PDE solvers to exploit the low-rank structure of these equations. Our procedure is composed of both offline and online stages. Low rank structure is learned in the offline stage, while in the online stage, the solution for the given source / boundary term in (1) is extracted.

We consider in particular the following boundary value problem:


where is the boundary condition operator, the boundary associated with domain , and we now denote the source term (the boundary data) by . Our fundamental goal is to construct the low-rank approximation to the Green’s operator for (18). With this operator in hand, the solution can be computed for any value of the boundary conditions at a relatively small incremental cost.

If we apply rSVD to approximate directly, we need to compute products of this operator with random vectors. For the problem (18), this operation corresponds to solving the problem with replaced by random boundary conditions. Even to solve one such problem efficiently is a computationally challenging task. We use the domain decomposition framework.

We start by partitioning the domain into subdomains as follows:


where the patches overlap, in general. We denote by the boundary associated with . Furthermore, we identify the subregions that intersect with as follows:

and define the interior of the patch to be

For this particular partition of the domain, we define the partition-of-unity functions , to have the following properties:


We choose a discretization that resolves the small scales in the solution, defining a mesh width . (The number of subdomains is independent of .) A typical decomposition is illustrated in Figure 2.

Figure 2: Illustration of domain decomposition of a rectangular domain in 2D with overlap.

How do we design the offline stage to “learn” the low-rank approximation? We propose two approaches that lead to two different kinds of algorithms. In the first approach we learn the optimal basis functions within each subdomain, while the second algorithm employs Schwarz iteration, preparing the boundary-to-boundary map in the offline stage. Other PDE solvers that utilize randomness can also be found in [MR3477310, MR3824169, MR4050504].

4.1 Learning basis functions

In standard domain decomposition, the local discretized Green’s matrix is assembled from a “full” collection of basis functions in the patch . The global solution to (18), confined to each , is a linear combination of the columns . The coefficients of these combinations are chosen so that that the continuity conditions across patches and the exterior boundary condition are all satisfied. The complete process can be outlined as follows.

  1. Offline stage: For , find

    where each local function is a solution to (18) restricted to the subdomain , with fine grid and delta-function boundary conditions. That is,


    where is the Kronecker delta that singles out the -th grid point on the boundary .

  2. Online stage: The global solution is

    with the support of each confined to , where is a vector of coefficients determined by the boundary conditions and continuity conditions across the patches.

The complete basis represented by has a low-rank structure that can be revealed using randomized SVD. Instead of using delta functions as the boundary conditions, we propose to obtain basis functions by setting random values on , as follows:


where is defined to have a random value drawn i.i.d. from a normal distribution at each grid point in . Denoting , we have from linearity of the equation that

where is a random i.i.d. matrix with entries . This is used in the online stage, as an accurate surrogate of , see Algorithm 2.

1:Domain Decomposition
2:     Partition domain according to (19).
3:Offline Stage:
4:     Prepare i.i.d. Gaussian vectors on each .
5:     Solve the basis function in (22) on each , and collect the local basis in .
6:Online Stage:
7:     Use continuity condition and global boundary data to determine coefficient vectors , and set
8:Return: approximate global solution .
Algorithm 2 A general framework for multiscale PDE over with on

Although we do not apply full-blown rSVD here, the homogenizable and low-rank property of the local solution space implies that and share similar range with the number of basis functions in being much smaller than , the number of grid points on , and independent of . In Figure 3 we plot the angles between and for two of the equations discussed in Section 2. In both cases, and for small , the approximated Green’s matrix quickly recovers the true Green’s matrix as the number of samples increases, and thus captures the local solution space.

In Figure 4 we showcase the basis functions on a patch for elliptic equation with media coefficient , with . For this non-conventional media without any periodic structure, the traditional multiscale methods are no longer valid, but our method still quickly captures the optimal basis.

Figure 3: Angle between the true Green’s matrix and the approximate version , confined on the interior of a patch for some , as the number of random samples increases. The plot on the left shows results for 1D RTE (4) with diffusive kernel (5) and various values of . The plot on the right shows the angle computed for elliptic equation (12) on a rectangular local patch with .
Figure 4:

Optimal basis functions and their projections onto the approximate spaces. First row plots the first two singular vectors of the Green’s matrix on a local patch. Second row plots their projection onto the space spanned by 6 random sampled basis functions. Note that the small random sample captures well the leading eigenvectors of the true Green’s operator.

For particular boundary conditions , the global solution is assembled from the local basis functions in the online stage. Two numerical examples are shown in Figure 5. In both examples, there is little visible difference between the reference solution and the approximated one computed from the reduced random basis. Only 8.3% and 62.5%, respectively, of the degrees of freedom required by the full basis are needed to represent these solutions using a random basis; see details in [MR4155236].

Figure 5: The top row shows solutions for 1D RTE (4) with Henyey-Greenstein kernel in (5) with . The bottom row shows solutions for elliptic equation (12) with Dirichlet boundary condition and highly oscillatory medium with . The left column shows reference solutions while the right column is obtained from randomized reduced bases.

Since we do not have access to the full set of basis functions, the condition that defined in (23) is continuous across subdomains can be satisfied only in a least-squares sense; see [MR4155236] for details.

4.2 A low-rank Schwarz method

Our second approach for exploiting the low-rank property in multiscale computations is based on Schwarz iteration. The Schwarz method is a standard iteration algorithm within the domain decomposition framework, in which boundary-value problems are solved on the patches, with neighboring patches subsequently exchanging information and re-solving until consistency is attained. The exchange of boundary information between neighboring patches is known as the boundary-to-boundary (BtB) map. The map has an exploitable low-rank property.

To develop the approach, we write the solution of (18) as


The solution on patch is uniquely determined by , its local boundary condition, according to the equation


The Schwarz method starts by initial guesses to the local boundary conditions , then on iteration , it solves the subproblems (25) with to obtain all local solutions . By confining on the boundaries of adjacent patches, one updates the boundary conditions for surrounding patches:


Here denote the solution to (25) confined in the interior of , and takes the trace of the solution on the neighboring boundaries for , for the updated boundary condition.

Define the BtB map by , and define and to be the aggregation of and , respectively, over . We can then write the updating procedure as

The overall method is summarized in Algorithm 3.

1:Given total iterations ;
2:Domain Decomposition
3:     Partition domain according to (19).
4:Schwarz Iteration:
5:     Initialize for each and set  .
6:     While      
7:         Solve (25) for using for each  .
8:         Update on ,  .
9:         .      
10:     End
11:     Solve (25) for using for each  .
12:     Assemble global solution  .
13:Return: approximated global solution .
Algorithm 3 Schwarz method for multiscale PDE over with on

Most of the computation in the Schwarz method during the iteration comes from solving the boundary-value PDEs on the patches, to implement the map . Since the PDE is homogenizable, the solution space on each patch is approximately low rank, and the map can be expected to inherit this property. If we can “learn” this operator in an offline stage, and simply apply a low-rank approximation repeatedly in the online stage, the online part of Algorithm 3 can be made much more efficient. In our approach, Algorithm 1 is used to compress the map .

This approach is quite different from the one described in Section 4.1, in the sense that it is not only the range of the solution space, but the whole operator that is being approximated. To apply Algorithm 1, we need to define the “adjoint operator” for on the PDE level. This operator is composed of the adjoints for the local solution operators of (25) on each domain . The form of is specific to the PDE; we use the elliptic equation as an example. Defining , is defined in the following result.

Theorem 5.

Let be the confined solution operator for the elliptic equation with Dirichlet boundary condition on patch . Given any function supported on , the adjoint operator acting on is given by:


where is the outer normal derivative on and solves the following sourced elliptic equation:


We also describe calculation of the adjoint operator for the RTE (4).

Theorem 6.

Let be the confined solution operator for RTE (4) and the conditions in Theorem 1 hold. Given any function supported on , the adjoint operator is defined as follows


where solves the adjoint RTE over , which is


with outgoing boundary condition on and .

The specific form of the adjoint operator allows us to adapt the randomized SVD algorithm to compress the confined solution map ; see Algorithm 4. This method requires only solves of local PDE (25) and sourced adjoint PDE (28) (or (30)), together with a QR factorization and SVD of relatively small matrices. The overall low-rank Schwarz iteration is then summarized in Algorithm 5.

1:Given target rank and oversampling parameter ;
2:Set ;
3:Stage A:
4:     Generate random boundary conditions on .
5:     Solve (25) using as boundary conditions and restrict the solution over to obtain .
6:     Find orthonormal basis of .
7:Stage B:
8:     Construct zero extension of over , denoted by .
9:     Solve (28) or (30) for using as source.
10:     Compute using by flux (27) or restriction on incoming boundary (29).
11:     Assemble all fluxes .
12:     Compute SVD of .
13:     Compute .
14:Return .
Algorithm 4 Randomized SVD for

In Figure 6 we present the confined solution operator and its low-rank approximation . For the radiative transfer equation, the map is a 2880-by-40 matrix, with the size of each patch being 0.2[-1,1], with and . The random sampling procedure reconstructs it with just 6 samples. For the elliptic equation, the map is a 1600-by-160 matrix, and the size of each patch is 1[0,1] with . The random sampling approximates it well with 60 samples. The compression rates for these examples are thus 6.7 and 2.7, respectively. See details in [MR4252068].

In Figure 7, we show numerical examples for the global solutions of two problems obtained from the approach of this section. The reference solution (obtained with a fine mesh) is well captured by the approximation that uses the low-rank BtB map as the surrogate in the Schwarz iteration. These two cases use just and , respectively, of the number of local solves needed to capture the BtB map at fine scale. While the relative error of the reduced Schwarz method decays as fast as the standard Schwarz iteration, as shown in Figure 8, the cost is much reduced. See Table 1 for a comparison of computation times.

1:Given rank , total iterations .
2:Domain Decomposition
3:     Partition domain according to (19).
4:Offline Stage:
5:     For all , use Algorithm 4 to find the rank- approximation to , denoted by .
7:     Initiate for each , and set .
8:     While      
9:         Evaluate for each .
10:         Update on , .
11:         .      
12:     End
13:     Solve (25) for using for each  .
14:     Assemble global solution .
15:Return .
Algorithm 5 Reduced Schwarz method for multiscale PDE over with on
Figure 6: The singular decay of the restricted local solution operator and its low rank approximation for the RTE (left) and the elliptic equation (right). In RTE (4) we use heterogeneous collision kernel (9) , and in elliptic equation we use with .
Figure 7: The comparison between the reference solutions (left column) and the approximation using reduced Schwarz method (right column). The top and bottom rows are for RTE and elliptic equation respectively. In RTE (4) we use heterogeneous collision kernel (9) , with , and in elliptic equation we use with .
Figure 8: Relative error for reduced Schwarz methods for various ranks for elliptic equation with oscillatory medium that is the same as Figure 7.
RTE offline (s) online (s)
Rank = 5 227.99 0.0029
Rank = 6 268.46 0.0162
Full rank 706.99 0.0148
Schwarz 1027.40
elliptic offline (s) online (s)
Rank = 40 49.7 0.049
Rank = 70 87.3 0.061
Schwarz 31.4
Table 1: Run time comparison between vanilla Schwarz method and the reduced Schwarz method. for the RTE and for the elliptic equation with Dirichlet boundary condition. The configuration of the media is the same as those in Figure 7.

5 Manifold learning and nonlinear multiscale problems

It is not straightforward to extend the techniques of the previous section to nonlinear PDEs. Despite low-rank properties still holding due to the existence of the limiting equation, the argument based on compressing the Green’s matrix no longer holds. The collection of solutions for different source / boundary terms is not longer a linear subspace, but a solution manifold.

We consider the general nonlinear multiscale problem in the following form


where is a nonlinear differential operator that depends explicitly on the small parameter . The term can be the source term, boundary conditions or initial conditions. Assume further that the equation has an asymptotic limit


as , that is, as . The argument for the linear problem is still applicable: The degrees of freedom required by the classical numerical method for solving (31) grows rapidly as , while the existence of the homogenized equation (32) indicates that only degrees of freedom should be needed to resolve macro-scale features. From a manifold perspective, the solutions to (31) vary in a high dimensional space as changes, but this manifold is approximated to within distance by another manifold whose dimension is .

Suppose a manifold in a high dimensional space is approximately low-dimensional, can we quickly learn it without paying the high dimensional cost? We turn to manifold learning for answers to this question. We are particularly interested in adopting the ideas from the local linear embedding [RoSa:2000nonlinear]

and multi-scale SVD approaches that learn the manifold from observed point clouds, and interpolate the local solution manifold using multiple tangent-space patches.

We denote the nonlinear solution map of (31) by , which maps the source term or initial / boundary conditions to the solution of the equation. In the offline stage, we randomly sample a large number of configurations in , and compute the associated solutions on fine grids. These solutions form a point cloud in a high dimensional space . The ’s are then subdivided into a number of small neighborhoods, and we construct tangential approximations to the mapping on each neighborhood. In the online stage, given a new configuration , we identify the small neighborhood to which it belongs and find the corresponding solution by performing linear interpolation. The overall offline-online strategy in summarized in Algorithm 6. We stress that some modifications are needed to reduce the cost of implementation. For example, the algorithm should be combined with domain decomposition (for example, Schwarz iteration) to further confine the computation to local domains, to save computational cost.

In Figure 9 we plot the local low-dimensional solution manifold for a nonlinear RTE (specifically, a linear RTE nonlinearly coupled with a temperature term). The solution manifold appears to have a local two-dimensional structure; the point clouds lie near on a two-dimensional plane. We refer to [chen2020manifold] for more details of the implementation and numerical results.

2:     Randomly sample , , and find solutions .
3:Online: Given :
4:     Step 1: Identify the -nearest neighbors of , call them , with being the nearest neighbor;
5:     Step 2: Compute with
where is a set of coefficient that fits with a linear combination of , for .
6:Return .
Algorithm 6 Manifold learning algorithm for solving
Figure 9: The plot shows the point cloud and its fitting plane for a 1D nonlinear RTE with Knudsen number when it is confined in a small patch containing an interval . The solution profile is approximately determined by the temperature at two grid points and . The -axis shows the value of . The dependence is clearly linear and two-dimensional.

6 Looking Forward

We have seen a vast literature addressing all aspects of the computation of multiscale problems. Over the years, the research has been drifting gradually away from its origin, where solvers were influenced by analytical understanding, specifically of the limiting behavior of the specific PDE. Machine learning algorithms have shown more and more power in sketching the solution profile with a much reduced numerical cost. In particular, the existence of the homogenized limit suggests there are low-rank features in the discrete system, and that random linear algebra techniques and manifold learning methods, when utilized properly, can identify these features for a compressed representation of the PDE solutions.

We have reviewed two methods, both of which make use of the domain decomposition framework. They compress either basis functions or the boundary-to-boundary map in an offline learning stage. This review article serves as a showcase of the power of random solvers in numerical PDEs. For time-dependent problems, and homogenization problems that have weak-convergence (instead of strong) such as quantum systems in the semi-classical regime, further development of the approaches is needed. Incorporation of time and weak-limit in the algorithm-design lies at the core of future challenges.

7 Acknowledgements

The authors acknowledge generous support from NSF, ONR (program manager Dr. Reza Malek-Madani), DOE, and AFOSR. Due to restrictions on the allowed number of references, many important contributions are omitted. The reader is invited to consult the bibliographies in the cited references for a more extensive view of the literature.