The optimal transportation problem (OT), first proposed by Monge  and later generalized by Kantorovich , has been extensively studied from the theoretical point of view [10, 11, 13, 14, 15, 23, 42, 45]. It has a wide variety of applications in economics [17, 4], fluid mechanics , meteorology [18, 19], image processing [41, 40], transportation [46, 16], and optics design [24, 39]. In this section, we briefly introduce the basic theory and numerical methods of OT, and point out our contribution to the numerical analysis of OT in this paper.
1.1. Introduction to Optimal Transport
If are subdomains of and
are given probability measures, the Monge formulation of OT is to find an optimal mapwhich minimizes the transport cost, i.e.
where the cost function is given. Hereafter, denotes the push-forward of measure through , namely means that for any measurable set , we have , or equivalently
for all continuous functions . Since this problem is difficult to study, and sometimes the optimal map does not exist, Kantorovich  generalized the notion of transport map, and considered the following problem:
where is the set of transport plans between and , namely
Here and are the projections defined as . This definition implies that for and any measurable sets , we have . For and , let be the set of probability measures with bounded moment, i.e.
which clearly contains those probabilities measures with bounded support. If , then for all because
Moreover, if , the Kantorovich problem (1.3) always has a finite minimum value. In fact, this defines the well-known Wasserstein metric on :
In addition, for quadratic cost, i.e. ,  shows that there exists a unique optimal map for a convex function ( is uniquely determined -a.e.) provided that gives no mass to -surfaces of class . If are absolutely continuous measures with respect to Lebesgue measure with densities , then
whence satisfies the generalized Monge-Ampère equation
with second type boundary condition [21, section 4.6] provided that the domains are chosen so that:
The optimal transport map induces an optimal transport plan given by , where denotes the identity map, namely
for all measurable sets . Similar results also hold for any , and for general costs satisfying a twist condition [45, Chapter 10].
1.2. Numerical Methods for OT and Our Contribution
In view of the numerous and diverse applications of OT, developing fast and sound numerical methods for OT is of paramount importance. Several algorithms do exist, ranging from those inspired by PDEs and variational techniques for absolutely continuous measures and [1, 5, 7, 8, 22, 26, 30, 38] to those approximating one or both measures by Dirac masses and then solving the approximate OT [6, 12, 20, 28, 29, 31, 36, 44, 43]. However, intrinsic difficulties make their numerical analysis far from complete.
In this paper, we develop stability and error analyses for quadratic costs that accounts for the effect of approximating measures and by Dirac masses. To this end, we extend the stability estimates of Gigli , originally derived for optimal maps, to optimal plans . We do not develop new techniques to approximate .
If at least one of the two measures is discrete, then it is possible to solve for the exact transport map numerically without further approximations, which includes semi-discrete algorithms [2, 28, 29, 31] and fully-discrete methods [12, 44, 43]. For these methods, since their errors solely come from approximating absolutely continuous measures with discrete measures, our results directly lead to error estimates. We also point out that an error estimate for a semi-discrete method was recently obtained by Berman , but there are no such results for fully-discrete schemes.
We now describe our results. Let be compact sets of and be absolutely continuous measures with respect to the Lebesgue measure with densities , respectively. Let
be approximations of governed by a parameter , which means the Wasserstein distances satisfy
To make this more concrete, we briefly introduce one way to obtain an approximation of . Choose such that , where denotes the open ball with radius centered at . Then consider Voronoi tessellations: let
be the Voronoi cell for and
Notice that we may assume since otherwise for we could just drop the Dirac measure at . Define the map such that for all . It can be easily checked that this map is well defined and satisfies a.e. in because the intersection between and for is of zero Lebesgue measure and ; in particular for all . Therefore is a transport map from to and thus
Similarly, we could approximate the absolutely continuous measure with density and bounded support with satisfying .
The semi-discrete algorithms solve the OT between and upon finding a nodal function such that
where the discrete subdifferential is given by
This type of discretization was introduced by Oliker and Prussner  for Dirichlet boundary conditions whereas error estimates have been derived in [35, 33]. The set coincides with the subdifferential of the convex envelope associated with [35, Lemma 2.1]. The function is piecewise linear and induces a mesh with nodes whose elements may be quite elongated; see [35, Section 2.2]. We also refer to Berman , who has derived error estimates for the second type boundary condition involving .
Denote the barycenter of with respect to the measure by , namely , and define the map such that for all . Under proper assumptions on measures , to be stated in Section 2, we prove a weighted error estimate in creftypecap 4.2 for the exact optimal transport map
This rate of convergence coincides with that in [9, Theorem 5.4] for , but the error notion is different; we refer to Section 4 for details. Our approach is taylored to discrete transport plans and thus extends to fully-discrete methods.
Fully-discrete methods aim to find the discrete optimal transport plan
between and through the constrained minimization problem
If we construct the map from for , then we also obtain a weighted error estimate in creftypecap 5.1 for the optimal map
under suitable assumptions on measures described in Section 2. This is a new error estimate for fully-discrete schemes, and the convergence rate in (1.13) is the same as (1.11) for semi-discrete methods.
In Section 2 we introduce the notion of -regularity and show that it leads to -regularity of the transport map . We prove in Section 3 that regularity implies a stability bound characterizing how the optimal transport plan changes under perturbations of both and ; this hinges on an idea from . We measure the change of transport plans using either a weighted norm in creftypecap 3.5 or the Wasserstein metric in creftypecap 3.9. We apply the stability bound to derive error estimates in Sections 4 and 5. For semi-discrete schemes, we obtain weighted error estimates in creftypecap 4.1 and creftypecap 4.2 of Section 4 with a convergence rate . We also compare our geometric estimate of creftypecap 4.2 with a similar one due to Berman ; however, the two error notions are different. Moreover, we obtain in creftypecap 5.1 of Section 5 an entirely new error estimate of order for fully-discrete schemes with errors measured in both the weighted norm and the Wasserstein distance.
2. Regularity and Nondegeneracy
We now discuss the assumptions we make on measures in order to prove quantitative stability bounds for optimal transport plans and error estimates for numerical methods.
Assumption 2.1 (-regularity).
We say that is regular for if is a convex function such that is the optimal transport map from to and is smooth in the sense that:
for any and , i.e. is convex. We also say is regular if there exists a such that is regular.
We will see below that (2.1) implies ; note that if , then for all . Although we require to be defined in the whole in creftypecap 2.1, this is not restrictive because any convex function defined in a bounded convex set can be extended as in [21, Theorem 4.23]
Now we introduce the Legendre transform of , which is defined by
Since is convex, given we readily get for all
whence in view of (2.2) the following two key properties of are valid:
and if and only if . To see that implies , let be arbitrary and notice that (2.2) for yields
Consequently, if and are of class , then is the inverse of the transport map . Moreover,
provided is –regular. In fact, since , in view of (1.2) we have
We exploit the convexity of and to obtain
The following lemma states that –regularity of a convex function is equivalent to uniform convexity of its Legendre transform [3, Proposition 2.6].
Lemma 2.2 (–regularity vs uniform convexity).
If is convex, then the following statements are valid:
If is smooth, then its Legendre transform must satisfy
for all and , i.e. is convex.
If is convex, then its Legendre transform is smooth.
We exploit that are arbitrary. Taking supremum with respect to , we get
Maximizing the last two terms with respect to , we obtain and
which is the asserted inequality (2.4). That is convex is a straightforward calculation that completes the proof. ∎
Lemma 2.3 (–regularity).
A –smooth convex map is of class and the Lipschitz constant of is , namely
It is now apparent from creftypecap 2.3 that the constant dictates the stability of the transport map ; is also the stability constant that appears in our error estimates of Sections 4 and 5. If , then (2.5) is equivalent to for all . As we have already mentioned at the end of Section 1.1, the optimal map from to for quadratic cost is given by under suitable assumptions on . By Caffarelli’s regularity results [13, 14, 15], if both and are uniformly convex domains of with boundary and the densities of are bounded away from and , then the solution of the corresponding second boundary value Monge-Ampère problem (1.6) is of class . This implies that is regular for some if we extend to . Moreover, the same assumptions imply that is regular for the Legendre transform of and some .
3. Quantitative Stability of Optimal Transport Plans
In this section, we generalize the quantitative stability bounds of Gigli [23, Corollary 3.4] for optimal transport maps, and show some consequences of our theorem under suitable assumptions on measures and . They will be useful in Sections 4 and 5 to derive error estimates. The stability estimate in  is based on bounding the difference between an optimal transport map and another feasible transport map by the difference between their transport costs. Proposition 3.1 below is just a generalization of this important property in that it replaces transport maps by transport plans.
Proposition 3.1 (stability of transport plans).
Let be regular for , and be the optimal transport map from to . Then, for any transport plan , there holds
In view of creftypecap 2.3 (-regularity), the map is of class . It thus follows that for all because . Using
together with creftypecap 2.2 (-regularity vs uniform convexity), namely is convex, we further obtain that
In view of (1.2) the last two terms are equal and cancel out. Consequently,
Rearranging the equation above gives (3.1). ∎
Remark 3.2 (interpretation of (3.1)).
Notice that the right hand side in (3.1) without factor is the difference of transport costs between the given transport plan and the optimal transport map . The factor acts as a stability constant.
Remark 3.3 (stability of transport maps).
We point out that the left hand side of (3.1) can be seen as the square of a weighted -error between the transport plan and the optimal map . To understand this, notice that if the plan in creftypecap 3.1 is induced by a map , i.e. , then (3.1) can be written as
which is the same as [23, Proposition 3.3]. This is an estimate of .
Let us recall the gluing lemma for measures (see [45, p.23]), which plays an important role in proving the triangle inequality of Wasserstein distance in the theory of optimal transport. We refer to [42, Lemma 5.5] for a proof.
Lemma 3.4 (gluing of measures).
Let be probability spaces for , with , and , . Then there exists (at least) one measure such that and , where is the projection defined as .
We use this lemma to glue the measures with their approximations as follows. Given optimal transport plans between and , between and , and between and , we let be a probability measure so that
where and ; creftypecap 3.4 guarantees the existence of . This in particular implies that projections on the coordinate satisfy
along with . Moreover, the following formula holds
for and similar expressions are valid for , and . If , and , then according to (1.5).
Now we state and prove our main perturbation estimates for transport plans.
Theorem 3.5 (perturbation of optimal transport plans).
Let be the optimal transport map from to and be regular for . Let be approximations of , let be two transport plans between and with -errors
and let . If is an optimal transport plan between and , then
Remark 3.6 (interpretation of (3.6)).
We emphasize that, in view of Remark 3.3 (stability of transport maps), the left hand side of (3.6) can be viewed as the square of the -error between the optimal map and the transport plan . The quantity is a measure of the approximation errors between and . A typical choice of is to take the optimal transport plans between and respectively, which implies .
Remark 3.7 (non-uniqueness).
In general, an optimal transport plan between and may be neither unique nor induced by a map, no matter how small