In this work we consider the inverse problem of estimating a background fluid flow from partial, noisy observations of a dye, pollutant, or other solute advecting and diffusing within the fluid. The physical model is the two-dimensional advection-diffusion equation on the periodic domain:
is a passive scalar, typically the concentration of some solute of interest, which is spread by diffusion and by the motion of a (time-stationary) fluid flow . This solute is “passive” in that it does not affect the motion of the underlying fluid.
is an incompressible background flow, i.e., is constant in time and satisfies .
is the diffusion coefficient, which models the rate at which local concentrations of the solute spread out within the solvent in the absence of advection.
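For intuition, the forward dynamics can be simulated directly. The following is a minimal pseudo-spectral sketch of (1.1) on a periodic square, taking the period to be 2π for convenience and using explicit Euler time stepping; all names and discretization choices are illustrative rather than the authors':

```python
import numpy as np

def advect_diffuse(theta0, v, kappa, dt, n_steps):
    """Evolve d/dt theta = -v . grad(theta) + kappa * Lap(theta) on the
    periodic square [0, 2*pi)^2, pseudo-spectrally, with explicit Euler
    time stepping (an illustrative sketch, not a production solver)."""
    n = theta0.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    lap = -(kx**2 + ky**2)                    # Fourier symbol of the Laplacian
    vx, vy = v
    theta = theta0.copy()
    for _ in range(n_steps):
        th_hat = np.fft.fft2(theta)
        dthx = np.real(np.fft.ifft2(1j * kx * th_hat))   # d theta / dx
        dthy = np.real(np.fft.ifft2(1j * ky * th_hat))   # d theta / dy
        diff = np.real(np.fft.ifft2(lap * th_hat))       # Lap(theta)
        theta = theta + dt * (-(vx * dthx + vy * dthy) + kappa * diff)
    return theta
```

Since the flow is divergence-free and diffusion conserves mass, the spatial mean of the scalar is (up to rounding) preserved by this scheme, while its energy decays.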
We obtain finite observations subject to additive noise , i.e.
for some measure related to the precision of the observations. Here, the forward map associates the background flow , sitting in a suitable function space , with a finite collection of measurements (observables) of the resulting solution of (1.1). We consider spatial-temporal point observations:
The goal of the inverse problem is then to estimate the flow from data . The initial condition is assumed to be known, so the problem can be interpreted as a controlled experiment, where the solute is added at known locations and then observed as the system evolves to investigate the structure of the underlying flow. This is a common experimental approach to investigating complex fluid flows; see, for example, [12, 13, 35, 29].
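The observation model (1.2) can be mimicked numerically. The sketch below draws random space-time observation points and returns noisy point evaluations; the callable `theta_fn` is a hypothetical stand-in for the PDE solution map, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(theta_fn, n_obs, t_max, noise_std):
    """Draw random space-time observation points and return noisy point
    values y_j = theta(x_j, t_j) + eta_j with eta_j ~ N(0, noise_std^2).
    `theta_fn(x, y, t)` stands in for the solution of the PDE."""
    xs = rng.uniform(0.0, 2 * np.pi, size=(n_obs, 2))   # spatial points
    ts = rng.uniform(0.0, t_max, size=n_obs)            # observation times
    clean = np.array([theta_fn(x, y, t) for (x, y), t in zip(xs, ts)])
    noise = rng.normal(0.0, noise_std, size=n_obs)
    return xs, ts, clean + noise
```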
As we will illustrate, the inverse problem is ill-posed, i.e., the flow is not uniquely determined by the scalar field ; the fact that the observations of are both finite-dimensional and polluted by noise exacerbates this problem. We therefore adopt a Bayesian approach to regularize the inverse problem, as described for this problem in our companion work  (see also ) and in a more general setting in [11, 30, 4]. A key component of this approach is the selection of a prior probability measure on the space of divergence-free flows. It is then natural to ask to what extent the result of the inference depends on the choice of prior, and in particular whether the Bayesian approach to the inverse problem is consistent: that is, under what conditions does the posterior measure concentrate on the true fluid flow as the number of observations of grows large?
In this work, we establish conditions under which the Bayesian inference of the flow given data (1.2) is consistent for i.i.d. observational noise . Specifically, we prove that the posterior measure converges weakly to a Dirac measure centered on the true background flow as the number of scalar observations grows large; see Section 3 for a full statement of the assumptions and the key result. It is a nontrivial task to determine suitable conditions on the structure of the observed data and on the prior measure under which consistency can be expected to hold. As such, a crucial starting point for the analysis of consistency is to address difficult experimental design questions.
In our problem, even under the noiseless and complete measurement of , essential symmetries can prevent the recovery of . For example, a poor choice of in (1.1) makes it impossible to distinguish between (an infinite class of) laminar flows, so multiple experiments (initial conditions) are required to guarantee resolution of the true background flow. A second useful structural condition is that, by picking spatial-temporal observation points at random, we can ensure a sufficiently complete recovery of the solution as the number of observation points grows. Thirdly, it is worth emphasizing that we require special conditions on the prior measure. Crucially, we identify a tail condition that ensures that flows are sufficiently smooth – that is, the prior turns out to be critical to the result by restricting consideration to flows of limited roughness (up to a region of low probability).
An important outcome of this experimental design is that it allows us to use compactness to effectively constrain the space of possible divergence-free velocity fields. Indeed, compactness plays an important role in two components of the consistency proof. First, we use it to show the continuity of the inverse map from to (see Section 4). Second, we use it to develop a suitable uniform version of the law of large numbers in order to show that noisy observations can differentiate between the true and other scalar fields (Section 5).
Consistency of Bayesian estimators has been of interest since at least Laplace , with rigorous proofs of convergence for some problems appearing in the mid-twentieth century [6, 18]. The works [8, 28, 5] identified infinite-dimensional examples where Bayesian estimators are not consistent – that is, there are cases where the data can never guarantee recovery of the true parameter value. See, e.g., , , or  for a more detailed description of the history of consistency and the main ideas.
In recent years, there has been interest in extending these consistency results to infinite-dimensional inverse problems, and in particular those constrained by PDEs. Our result is one of the first on consistency in this context. Recent work in this area includes , which used an elliptic PDE as the guiding example, and , which establishes a Bernstein–von Mises theorem – consistency, together with contraction rates in the form of a Gaussian approximation – for Bayesian estimation of parameters of the time-independent Schrödinger equation.
It is worth noting that the related inverse problem of estimating the drift function from partial observations of the Itô diffusion
has been studied extensively; see, e.g.,  or . Consistency has been established in various forms for this problem; see [32, 14, 24, 1]. However, while the equations (1.1) and (1.4) are related by the Kolmogorov equations (see, e.g., [25, Chapter 8]), the observed data are different: observations of an individual diffusion provide an approximate measurement of the drift, whereas observations of the concentration are less direct, since the movement of individual particles must be inferred. Therefore, while our consistency proof retains some similarities with other such arguments, it requires an original approach with different assumptions.
The remainder of the paper is organized as follows. Section 2 describes the mathematical framework of the inverse problem and why it is ill-posed in the traditional sense. The main result and key assumptions are stated in Section 3. Continuity of the inverse map is shown in Section 4. Uniform convergence of the log-likelihood is shown in Section 5. Convergence of the posterior to the inverse image of the true scalar field is shown in Section 6. Finally, the proof of the main result is in Section 7. Energy estimates for the advection-diffusion problem used to show continuity of the forward and inverse maps are reserved for Appendix A.
In this section, we describe the mathematical framework of the inverse problem (1.2). We begin by defining the functional analytic setting for the problem, including how we represent divergence-free background flows. We then define the inverse problem, key notation, and Bayes’ theorem for this application.
2.1 Representation of Divergence-Free Background Flows
The target of the inference is a divergence-free background flow , so we start by describing the space of such flows that we will consider. For this purpose we begin by recalling the Sobolev spaces of (scalar valued) periodic functions on the domain
where the zero Fourier mode is excluded to ensure the mean-zero condition. Throughout what follows we fix our parameter space as
Notation 2.1 (Parameter space, ).
Here the exponent is chosen so that vector fields in , as well as their corresponding solutions , exhibit continuity properties convenient for our analysis below (see Remark 2.4). We write with for the usual Lebesgue spaces, and denote the spaces of continuous and -th-integrable -valued functions by and , respectively, for a given Banach space . All of these spaces are endowed with their standard topologies unless otherwise specified.
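For concreteness, we recall the standard Fourier characterization of mean-zero periodic Sobolev spaces; the notation here is generic and may differ in detail from the conventions used elsewhere in the paper:

```latex
H^{s} \;=\; \Big\{ u = \textstyle\sum_{k \in \mathbb{Z}^2 \setminus \{0\}} \hat{u}_k \, e^{i k \cdot x}
  \;:\; \hat{u}_{-k} = \overline{\hat{u}_k}, \;\;
  \|u\|_{H^{s}}^2 := \textstyle\sum_{k \neq 0} |k|^{2s} |\hat{u}_k|^2 < \infty \Big\},
```

where the exclusion of the $k = 0$ mode encodes the mean-zero condition.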
2.2 Mathematical Setting of the Advection-Diffusion Problem
In this section, we provide a precise definition of solutions for the advection-diffusion problem (1.1). Crucially, the setting we choose yields a map from to , and then to observations of , that is continuous.
Proposition 2.2 (Well-Posedness and Continuity of the solution map for (1.1)).
Fix any and with and suppose that and . Then there exists a unique such that
so that in particular
solves (1.1) at least weakly, namely
for all and almost all time .
For any , the map which associates and to the corresponding is continuous relative to the standard topologies on and .
This result can be proven using energy methods; similar results can be found, for example, in [7, 19]. In the case of smooth solutions where , one may also establish Proposition 2.2 using particle methods as in, e.g., , by observing that (1.1) is the Kolmogorov equation corresponding to a stochastic differential equation with drift given by ; see  for details in our setting. For completeness, we provide the a priori estimates leading to Proposition 2.2 in Appendix A.
Definition 2.3 (Solution Operator , Observation Operator ).
The observation operator measures point observations defined by for and .
We now note assumptions on and under which these observations are well-defined and vary continuously with .
Remark 2.4 (Continuity of ).
Let with associated exponent (see (2.3)) and let , for . Recalling that embeds continuously in in dimension (see, e.g., , Theorem A.1), we have that , again with the embedding continuous. Thus, with Proposition 2.2, we have that
continuously. In particular, this justifies that is well defined and continuous in the case of point observations as in Definition 2.3.
2.3 Bayesian Setting of the Inverse Problem
In this subsection, we define the setting of the statistical inverse problem and note cases where the inverse map is ill-posed, which will inform the assumptions required for the consistency argument. We close with a statement of Bayes’ theorem for this problem. We begin by fixing some notation used in the remainder of the paper.
Definition 2.5 (, , , ).
We fix a “true” background flow . For this , the observed data are given by
The forward map with for observation point .
The observational noise is distributed as .
We emphasize, however, that is not necessarily the only that could produce such data, as we describe in the next remark.
Since the background flow enters (1.1) through the term, the inverse problem of recovering from can be ill-posed. One important class of examples illustrating this difficulty arises when is zero everywhere, in which case the fluid flow does not have any effect on . Two such examples are as follows:
Ill-posedness: Laminar Flow: Let be independent of and . Then for any .
Ill-posedness: Radial Symmetry: Set and . Then for any .
In these cases, even noiseless and complete spatial/temporal observations of cannot discriminate between a range of background flows, making it impossible to uniquely identify the true background flow in general.
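The radial-symmetry obstruction can be checked directly: for a radially symmetric initial condition and a purely rotational flow, the advection term vanishes identically, so the scalar field evolves exactly as it would with no flow at all. A small numerical sketch, with an illustrative Gaussian initial condition:

```python
import numpy as np

# For theta0 = exp(-r^2), radially symmetric about (pi, pi), and a rigid
# rotation v about the same center, v . grad(theta0) = 0 identically:
# the flow leaves theta0 unchanged, so observations cannot detect it.
# (The specific choices of theta0 and v are illustrative.)
n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
cx, cy = X - np.pi, Y - np.pi                 # coordinates about the center
vx, vy = -cy, cx                              # rigid rotation about (pi, pi)
# analytic gradient of theta0 = exp(-r^2), r^2 = cx^2 + cy^2
gx = -2.0 * np.exp(-(cx**2 + cy**2)) * cx
gy = -2.0 * np.exp(-(cx**2 + cy**2)) * cy
advection = vx * gx + vy * gy                 # should vanish identically
```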
Theorem 2.7 (Bayes’ Theorem).
3 Statement of the Main Result
With the mathematical preliminaries in Section 2 in hand, we are now ready to give a precise formulation of the main result of the paper. Referring back to Remark 2.6, we do not expect consistency to hold without delicate assumptions on the initial conditions in (1.1) and on the observation points in our forward function in (1.2). Moreover, our result relies on the selection of an appropriate prior . In particular, the prior should reflect the regularity of the ‘true’ background flow , which we assume has a greater degree of spatial smoothness than generic elements of the ambient parameter space . We therefore define an additional smaller space used throughout.
Definition 3.1 (Higher Regularity Space).
Define the space
where is the exponent associated with the parameter space defined according to (2.3). We denote for the associated norm and take
i.e. the ball about of radius in the -norm.
Our main result is as follows.
Theorem 3.2 (Convergence of Posterior to a Dirac).
Let be a sequence of observation points that we assume are i.i.d. uniform random variables in . Fix any , with determined from (2.3), such that
Define the parameter-to-observable (forward) maps for and the initial conditions by
for . As in Definition 2.5, we fix any and draw data points , where
for i.i.d. observational noises that are independent of the observation points .
Fix a prior distribution and, for observations, let be the Bayesian posterior measure on given by (cf. Theorem 2.7)
where is the normalization
for any , .    (3.7)
Additionally assume that there exists an such that is monotone increasing with and
Then almost surely. In other words, on a set of full measure,
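In finite-dimensional caricature, the posterior assigns to each candidate flow a weight proportional to the Gaussian likelihood of the data. The sketch below computes self-normalized weights over a finite family of candidates; it is a toy illustration of the posterior and its normalization constant, not the authors' implementation, and all names are illustrative:

```python
import numpy as np

def log_likelihood(forward_vals, y, noise_std):
    """Gaussian log-likelihood -(1/(2*sigma^2)) * sum_j (y_j - G_j(v))^2,
    up to an additive constant; forward_vals[j] stands in for G_j(v)."""
    r = y - forward_vals
    return -0.5 * np.sum(r**2) / noise_std**2

def posterior_weights(all_forward_vals, y, noise_std):
    """Self-normalized posterior weights over a finite set of candidate
    flows: a toy stand-in for the posterior measure and its normalizer."""
    logp = np.array([log_likelihood(f, y, noise_std) for f in all_forward_vals])
    logp -= logp.max()               # stabilize before exponentiating
    w = np.exp(logp)
    return w / w.sum()               # normalization constant
```

In this caricature, consistency means the weight of the true candidate tends to one as the number of observations grows.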
Remark 3.3 (Sufficient conditions on the prior).
for some . Under this assumption we have
so that (3.10) implies (3.8). Thus we can guarantee the existence of a class of non-trivial priors such that Theorem 3.2 holds. On the other hand, the reverse implication is not expected to hold, and thus the general significance of (3.8) for the admissible classes of is not immediately clear. In particular, having bounded support is a strong restriction; indeed, we conjecture that there is a class of Gaussian measures on for which (3.8) still holds. We will investigate this question in future work.
Remark 3.4 (Poincaré inequality, support of ).
Remark 3.5 (Restrictions on the initial conditions).
It is unavoidable that we impose a condition as in (3.3) on the initial data in Theorem 3.2. In Remark 2.6 we provide two examples where the observations cannot discriminate between a range of background flows. For these two examples, as well as many other classes of initial conditions, the posterior fails to concentrate on as the number of observations grows (except for very particular priors). It is an interesting question for future work to characterize the support of the limiting measure, in the analogue of as , as a function of a single initial condition .
Before turning to the technical details, let us provide an overview of the method of proof of Theorem 3.2. Our starting point consists of two basic observations. First, according to the Portmanteau theorem, in order to establish (3.9) it is equivalent to show that
for any . See e.g.  for further details on such generalities concerning the weak convergence of probability measures.
Invoking the law of large numbers and using the assumed statistical properties of and , we have
for all sufficiently large. (Footnote 1: referring back to Section 2.1, we are assuming that is unit length.) For , take
Invoking (3.13), we observe that
Here note that (cf. Remark 3.4), so that we are not dividing by zero in the final upper bound.
We address both of these concerns by assuming a little extra regularity for our ‘true’ vector field, taking , and by making effective use of the prior to enforce this regularity for (see assumptions (3.7), (3.8)). With the Rellich–Kondrachov theorem we are thus able to use compactness to address both concerns. Indeed, although an injective, continuous map does not have a continuous inverse in general, this property does hold when the domain of is compact; see Footnote 2 below. Regarding the second concern (ii), we establish a uniform version of the LLN, Proposition 5.1 below (see also [21, 22]), whose proof makes essential use of the fact that the ‘parameter’ (which for us is ) lies in a compact set.
The precise proof of Theorem 3.2 is presented in a series of sections as follows. First, in Section 4, we address the injectivity of the forward map under (3.3) as well as the continuity of the inverse map (i). In Section 5 we introduce a uniform version of the law of large numbers, Proposition 5.1, and use it to obtain a quantitative version of (3.13). Section 6 establishes that concentrates on the ‘true value’ of as . Finally, Section 7 uses the machinery now in place to complete the proof of Theorem 3.2.
4 Continuity of Inverse Map
In this section, we lay out conditions under which the inverse solution map is continuous. This requires some care. Indeed, it is not true in general that the forward map is injective, as illustrated in Remark 2.6. As such, counterexamples to Theorem 3.2 exist (cf. Remark 3.5) if we fail to impose a suitable assumption on the initial condition(s) for (1.1), à la (3.3).
With this in mind, we now define the solution map associated with the solution of (1.1) for multiple initial conditions.
Notation 4.1 (Paired solution map ).
Corollary 4.2 ( continuous).
The paired solution map (see Notation 4.1) is continuous.
Lemma 4.3 ( injective).
Let satisfy (4.1), i.e.,
Then for almost all and . However, since both solutions are continuous (see Remark 2.4), this implies that for all and . Denote . Then solves both
for , all and . Subtraction leads to
for and all . In particular,
for and all . However, under (3.3) span at almost all . Therefore for almost all and hence , completing the proof. ∎
Even under the conditions of Lemma 4.3 it remains unclear whether has a continuous inverse. To remedy this, we recall the following elementary fact from real analysis, which suggests that we further restrict the domain of .
Let be metric spaces and suppose that is compact. Let be injective and continuous. Then is also continuous. (Footnote 2: here, we denote .)
Let such that . Define according to and . We would like to show that as .
To this end, let be any subsequence. Since is compact, there exists a further subsequence that converges in ; denote this limit by . Then, since is continuous, . But, by definition and the assumed convergence of , we also have , so that . Since is injective, , i.e., . Since the original subsequence was arbitrary, we in fact have , yielding the desired result. ∎
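The compactness hypothesis in the lemma above cannot be dropped. A standard counterexample: on the non-compact domain [0, 2π), the map t ↦ (cos t, sin t) is continuous and injective onto the unit circle, yet its inverse is discontinuous at (1, 0), as the following sketch verifies numerically:

```python
import numpy as np

# On the NON-compact domain [0, 2*pi), f(t) = (cos t, sin t) is continuous
# and injective, but its inverse is discontinuous at (1, 0): image points
# can be arbitrarily close while their preimages stay nearly 2*pi apart.
def f(t):
    return np.array([np.cos(t), np.sin(t)])

t1, t2 = 0.01, 2.0 * np.pi - 0.01     # preimages far apart in [0, 2*pi)
image_gap = np.linalg.norm(f(t1) - f(t2))
preimage_gap = abs(t1 - t2)
```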
From Footnote 2 we draw the following two conclusions, which we use below.
Corollary 4.5 ( continuous).
Let and . For all , there exists a such that
5 Concentration of Normalized Potentials, Uniform Law of Large Numbers
The next step in our analysis is to prove a rigorous and more quantitative version of (3.13), Proposition 5.2, which yields the asymptotics of the potential functions (log-likelihoods) appearing in the posterior measures defined as in (3.6). As a preliminary step we introduce a uniform version of the law of large numbers; see also [21, 22] for previous related results.
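Before stating the proposition, the phenomenon it captures can be illustrated numerically: for a family of integrands indexed by a compact parameter set, the empirical means converge to the corresponding expectations uniformly over the parameter. A Monte Carlo sketch with the illustrative choice g(x, u) = cos(ux), x ~ Uniform[0, 1]:

```python
import numpy as np

rng = np.random.default_rng(3)

def sup_deviation(n_samples):
    """Sup over a grid on the compact set [0.1, 5] of the deviation
    between the empirical mean of g(x, u) = cos(u * x), x ~ U[0, 1],
    and its exact mean E[g(., u)] = sin(u) / u."""
    x = rng.uniform(0.0, 1.0, size=n_samples)
    u = np.linspace(0.1, 5.0, 200)                 # grid on the compact set
    emp = np.cos(np.outer(u, x)).mean(axis=1)      # empirical means
    true = np.sin(u) / u                           # exact means
    return np.max(np.abs(emp - true))
```

As the sample size grows, the supremum of the deviation over the whole parameter interval shrinks, which is exactly the uniform mode of convergence exploited in the proof.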
Proposition 5.1 (Uniform Law of Large Numbers).
Let be a metric space with compact and (Borel) measurable. Take to be an i.i.d. sequence of random variables and let be any random variable with this distribution. Assume that
and that there exists a deterministic function with such that for all and , there exists a such that
Note that, since is non-negative, implies that
is a set of full measure, in which case the random functions , are all constant on and the result (5.3) follows in this special case.
We turn to the nontrivial case where . Define , . Then by our assumptions on , for every . Note also that for any , ,