Generalized sampling with functional principal components for high-resolution random field estimation

02/20/2020 · Milana Gataric (University of Cambridge)

In this paper, we take a statistical approach to the problem of recovering a function from low-resolution measurements taken with respect to an arbitrary basis, by regarding the function of interest as a realization of a random field. We introduce an infinite-dimensional framework for high-resolution estimation of a random field from its low-resolution indirect measurements as well as the high-resolution measurements of training observations, by merging the existing frameworks of generalized sampling and functional principal component analysis. We study the statistical performance of the resulting estimation procedure and show that high-resolution recovery is indeed possible provided appropriate low-rank and angle conditions hold and provided the training set is sufficiently large relative to the desired resolution. We also consider sparse representations of the principal components, which can reduce the required size of the training set. Furthermore, the effectiveness of the proposed procedure is investigated in various numerical examples.


1 Introduction

Let $\mathrm{L}^2(D)$ be the space of square-integrable complex-valued functions supported on a compact domain $D \subset \mathbb{R}^d$, with the standard inner product and norm denoted by $\langle \cdot, \cdot \rangle$ and $\| \cdot \|$, respectively. Let $X$ be an $\mathrm{L}^2(D)$-valued random field with a probability measure $\mu$ and let $x$ denote one particular realization of $X$. In this paper, we consider the problem of recovering a high-resolution approximation of $x$ with respect to the first $P$ elements of an orthonormal basis $\{\varphi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$ (e.g. a wavelet basis), from:

  • noisy low-resolution measurements of $x$ with respect to the first $M$ elements of another, potentially different Riesz basis $\{\psi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$ (e.g. a Fourier basis), namely

    $y_j = \langle x, \psi_j \rangle + \epsilon_j, \qquad j = 1, \dots, M,$    (1)

    where the highest sampled frequency (i.e. the sampling bandwidth) $M$ is relatively small compared to the desired resolution $P$, and $(\epsilon_j)_{j=1}^{M}$ is a realization of random noise, as well as

  • noisy high-resolution measurements of a realization $x_1, \dots, x_N$ of a random sample $X_1, \dots, X_N$ from the probability measure $\mu$, namely training observations that consist of

    $z_{i,j} = \langle x_i, \varphi_j \rangle + \eta_{i,j}, \qquad j = 1, \dots, P, \quad i = 1, \dots, N,$    (2)

    where $(\eta_{i,j})$ is a realization of random noise.

Specifically, we want to recover $x$ in a high-resolution subspace $\mathcal{H}_P := \operatorname{span}\{\varphi_1, \dots, \varphi_P\}$, so that its reconstruction achieves the high-resolution approximation rate $\| x - Q_P x \|$, where $Q_P x$ is the orthogonal projection of $x$ onto $\mathcal{H}_P$ and thus the best possible approximation of $x$ in $\mathcal{H}_P$. It is important to note that, normally, such a high-resolution rate of approximation cannot be achieved solely from the low-resolution measurements (1) of $x$, and typically requires increasing the highest sampled frequency $M$ relative to the desired resolution $P$. In this paper, we keep $M$ independent of $P$ and instead increase the size $N$ of the training set relative to $P$, thereby leveraging the implicit statistical information given through the high-resolution training observations (2).

To recover $x$ from its low-resolution indirect samples (1), so that the corresponding estimate may achieve the high-resolution rate associated with $\mathcal{H}_P$, in this paper we estimate the coefficients of $x$ with respect to the functional principal components constructed from the high-resolution measurements (2). Specifically, we recover $x$ in a reconstruction subspace $\widehat{\mathcal{U}}_K^P \subseteq \mathcal{H}_P$ that is constructed from the first $K$ $P$-dimensional (sparse) eigenvectors of the sample covariance matrix associated with the observations (2), and thus estimates the subspace spanned by the first $K$ eigenfunctions $\phi_1, \dots, \phi_K$, ordered by the magnitude of the corresponding eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots$, of the covariance operator associated with the probability measure $\mu$.

Furthermore, we investigate the conditions under which such estimation can be guaranteed for any realization of $X$ and of the noise. In particular, we show that, in the case of a Gaussian measure $\mu$ and Gaussian noise, if $M$ and $K$ are such that the distance between the subspaces $\mathcal{S}_M := \operatorname{span}\{\psi_1, \dots, \psi_M\}$ and $\mathcal{U}_K := \operatorname{span}\{\phi_1, \dots, \phi_K\}$ is not too large, then the corresponding estimator of $x$ is consistent as $N \to \infty$. Moreover, if $N$ and $P$ are sufficiently large, the rate of estimation corresponds to the maximum of two terms: one governed by the decay of the eigenvalues beyond the first $K$ and one by the best approximation error in $\mathcal{H}_P$. This implies that, if $\mu$ is a low-rank measure so that a small $K$ suffices, we can achieve the same rate of estimation as the best possible approximation rate in $\mathcal{H}_P$. Thus, our reconstruction from the low-resolution measurements in $\mathcal{S}_M$ can achieve the high resolution associated with $\mathcal{H}_P$ as $P$ increases, only at the price of increasing the size $N$ of the training set. We also consider a regularization technique based on sparse principal component analysis in order to allow for reduced sample sizes $N$, where sparsity is imposed on the functional principal components with respect to $\{\varphi_j\}_{j \in \mathbb{N}}$.

1.1 Motivation and relation to previous work

Reconstructing an unknown function from its samples (1) is a classical problem in signal and image processing, where $x$ represents an unknown signal or image that needs to be recovered from the measurements given by a sensing device; see e.g. [Unser, 2000]. If the samples are taken with respect to Fourier exponentials, then this problem arises in medical imaging, such as MRI, as well as in radar and geophysical imaging; whereas, if the sampling system is a pixel basis (the basis induced by the scaling function of Haar wavelets), then the samples correspond to camera-based measurements, a scenario that arises in lens-less optical imaging, for example. In such applications, a pertinent goal is to recover $x$ from a small number of fixed measurements, which has led to the rapid growth of areas such as compressed sensing [Candès et al., 2006, Donoho, 2006] and super-resolution [Blu et al., 2008, Candès and Fernandez-Granda, 2014] over the past decades.

Generalized Sampling (GS) is a reconstruction framework within computational harmonic analysis that, by means of least squares, recovers the coefficients of an element $x$ of a separable Hilbert space with respect to any desired reconstruction basis (or, more generally, a frame) $\{\varphi_j\}_{j \in \mathbb{N}}$, from finitely many measurements taken with respect to any other basis $\{\psi_j\}_{j \in \mathbb{N}}$, such as those given in (1) [Adcock and Hansen, 2012, Adcock et al., 2013]. It guarantees a noise-robust reconstruction in $\mathcal{H}_P$, which attains the best possible approximation rate in $\mathcal{H}_P$, provided the distance between the spaces $\mathcal{H}_P$ and $\mathcal{S}_M$ is not too large. This framework has also been combined with $\ell^1$-regularization, yielding insights into so-called infinite-dimensional compressed sensing [Adcock and Hansen, 2016, Adcock et al., 2017]. The results therein established that, if $x$ is sparse with respect to $\{\varphi_j\}_{j \in \mathbb{N}}$, then by means of $\ell^1$-regularization one may still stably reconstruct $x$ even when only randomly sub-sampling its measurements. However, even though random sub-sampling can substantially reduce the total number of samples, the condition that the distance between the spaces $\mathcal{H}_P$ and $\mathcal{S}_M$ is not too large remains, meaning that the highest sampled frequency $M$ has to be large relative to the desired resolution $P$. In applications such as MRI, for example, this may present a time-consuming constraint since, (especially) when under-sampling, high frequencies in the Fourier domain need to be acquired. Also, in applications where fast calibration of an imaging device is crucial for real-time operation, such as optical endoscopy, time-consuming calibration is needed for high-resolution image recovery [Gataric et al., 2019].

Unlike these previous works, in the present paper, instead of reconstructing a generic deterministic $\mathrm{L}^2$-function, we focus on reconstructing an $\mathrm{L}^2(D)$-valued random field with a probability measure $\mu$ whose structure can be learned through a training set. Therefore, we can adapt our sampling scheme more closely to the object being sampled, namely to the specific probability measure at hand, and thereby possibly reduce the highest frequency $M$ required for high-resolution recovery in $\mathcal{H}_P$. In particular, the reconstruction procedure proposed in this paper, which we call GS-FPCA, combines the aforementioned GS framework with the data-driven approach of Functional Principal Component Analysis (FPCA) from functional data analysis, see e.g. [Ramsay and Silverman, 2005, Hall et al., 2006]. By means of FPCA, we construct a suitable reconstruction subspace in $\mathcal{H}_P$ from the observations (2), thereby circumventing the requirement on the distance between the subspaces $\mathcal{H}_P$ and $\mathcal{S}_M$, which is replaced by a condition on the distance between $\mathcal{S}_M$ and the space $\mathcal{U}_K$ spanned by the first $K$ eigenfunctions of the underlying probability measure $\mu$.

Proposals to use PCA-based priors for increasing image resolution most notably appear within the problem of face hallucination, a term first coined in the work of [Baker and Kanade, 2000]. In particular, [Capel and Zisserman, 2001] suggest super-resolving a face image by transferring it from a pixel domain to an eigenface domain constructed via a training set of high-resolution images, which was then combined with a face recognition task in [Gunturk et al., 2003]. This technique is also used as the initial step in two-stage super-resolution algorithms that combine a global PCA model with a local patch model; see for example [Liu et al., 2007, Yang et al., 2010]. Those works consider a finite-dimensional setting, which can be deduced from the infinite-dimensional model of this paper by constraining $\{\varphi_j\}_{j=1}^{P}$ to the $P$-dimensional pixel basis and identifying each image with its vector of pixel coefficients. Another notable example of exploiting the low-rank structure of the underlying signal appears in acceleration schemes for dynamic MRI [Lingala et al., 2011, Zhao et al., 2012], and more recently for functional MRI [Chiew et al., 2016] and MR fingerprinting [Zhao et al., 2018], where typically a sequence of images over time is reconstructed with respect to the PCs estimated from low spatial resolution measurements with a high temporal sampling rate. However, a crucial distinction from the approach presented in this paper is that we consider functional PCs constructed from high spatial resolution measurements, so that we can subsequently allow for high-resolution image recovery from low-resolution Fourier samples.

In contrast to these earlier works, we consider a more general infinite-dimensional framework for high-resolution recovery of an unknown object of infinite resolution, sampled via a flexible measurement model that allows sampling with respect to any Riesz basis, such as a Fourier basis. Our framework also allows for sparse representations of the unknown object with respect to different bases, such as wavelets, thereby potentially decreasing the required size of the training set. Furthermore, as a result of using an infinite-dimensional framework, we provide insights into the conditions on the problem parameters under which such a procedure can be guaranteed to succeed in high-resolution recovery.

The remainder of the paper is organized as follows. Since in this work we leverage GS and FPCA, we dedicate Sections 2 and 3 to reviewing the main concepts of these frameworks, and in Sections 2.1 and 3.1 we derive additional results used later on. In Section 4, the proposed GS-FPCA method is formulated and its statistical performance is theoretically analyzed with respect to the different problem parameters. Additionally, in Sections 4.1 and 4.2 we describe variants of GS-FPCA that arise from the regularization techniques of sparse PCA and ridge regression. In Section 5, the empirical performance of GS-FPCA is investigated in different simulation scenarios: in Section 5.1 we use a 1D generative model, while in Section 5.2 we use 2D brain-phantom images. In Section 6, we conclude with a discussion and future work.

2 Generalized Sampling (GS)

Given measurements of a function $f \in \mathrm{L}^2(D)$ with respect to the first $M$ elements of a basis or frame $\{\psi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$, GS recovers $f$ with respect to the first $P$ elements of any desired, potentially different basis or frame $\{\varphi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$ by means of least squares. In particular, if $\mathcal{H}_P = \operatorname{span}\{\varphi_1, \dots, \varphi_P\}$ is a desired reconstruction space and $\mathcal{S}_M = \operatorname{span}\{\psi_1, \dots, \psi_M\}$ is a given sampling space, and if $Q_P$ and $R_M$ denote the orthogonal projections of $\mathrm{L}^2(D)$ onto the respective subspaces, then by the results of [Adcock et al., 2013] we have the following reconstruction guarantees: if the subspace angle between $\mathcal{H}_P$ and $\mathcal{S}_M$ satisfies

$\cos \theta(\mathcal{H}_P, \mathcal{S}_M) \;:=\; \inf_{h \in \mathcal{H}_P,\, \|h\| = 1} \| R_M h \| \;>\; 0,$

then for any $f \in \mathrm{L}^2(D)$ there exists a unique reconstruction

$\tilde f = \sum_{j=1}^{P} \hat a_j \varphi_j \in \mathcal{H}_P,$    (3)

with coefficients $\hat a = (\hat a_1, \dots, \hat a_P)^{\top}$ defined as the least-squares solution to the linear system

$U \hat a = y, \qquad U \in \mathbb{C}^{M \times P}, \quad U_{ij} = \langle \varphi_j, \psi_i \rangle, \quad y_i = \langle f, \psi_i \rangle,$    (4)

i.e. $\hat a \in \operatorname{argmin}_{a \in \mathbb{C}^P} \| U a - y \|_2$, which satisfies the sharp bound

$\| f - \tilde f \| \;\le\; \sec \theta(\mathcal{H}_P, \mathcal{S}_M) \, \| f - Q_P f \|.$    (5)

Moreover, for any fixed $P$ and arbitrarily small $\theta_0 \in (0, \pi/2)$, the angle condition $\theta(\mathcal{H}_P, \mathcal{S}_M) \le \theta_0$ is satisfied for any sufficiently large $M$, and thus $\tilde f$ achieves the best possible approximation rate in $\mathcal{H}_P$ up to a constant. Also, the condition number of such a reconstruction, which is defined to indicate the stability of the reconstruction to measurement perturbations $y \mapsto y + e$, is proportional to $\sec \theta(\mathcal{H}_P, \mathcal{S}_M)$. In particular, if $\{\psi_j\}_{j \in \mathbb{N}}$ is a Riesz basis with Riesz constants $0 < A \le B < \infty$ such that

$A \sum_{j} |a_j|^2 \;\le\; \Big\| \sum_{j} a_j \psi_j \Big\|^2 \;\le\; B \sum_{j} |a_j|^2 \qquad \text{for all } (a_j) \in \ell^2,$    (6)

and $\{\varphi_j\}_{j \in \mathbb{N}}$ is an orthonormal basis, then $\cos \theta(\mathcal{H}_P, \mathcal{S}_M) \ge \sigma_{\min}(U) / \sqrt{B}$, where $\sigma_{\min}(U)$ denotes the minimal singular value of the system matrix $U$ in (4), namely $\sigma_{\min}(U) = \sqrt{\lambda_{\min}(U^* U)}$, where $\lambda_{\min}$ is the minimal eigenvalue and $U^*$ is the adjoint of $U$. Note that $A = B = 1$ when $\{\psi_j\}_{j \in \mathbb{N}}$ is an orthonormal basis.

We remark that, alternatively, the angle condition can be interpreted so that, for any fixed $M$ and $\theta_0$, the resolution $P$ needs to be sufficiently small. As we decrease the number of measurements $M$, we also need to decrease the resolution $P$ so that the angle condition remains satisfied; the rate at which this happens depends on the specific choices of the spaces $\mathcal{H}_P$ and $\mathcal{S}_M$, and has been analyzed in a variety of settings, see e.g. [Adcock et al., 2014b, Adcock et al., 2014a, Adcock et al., 2019]. In particular, if $\mathcal{S}_M$ is spanned by a Fourier basis or frame, it is known that this rate is linear when $\mathcal{H}_P$ is spanned by wavelets, and quadratic when $\mathcal{H}_P$ is spanned by polynomials.
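
To make the GS recipe concrete, the following minimal sketch, an illustration rather than the paper's implementation, computes the reconstruction (3)–(4) for an assumed Fourier sampling basis and a pixel (Haar scaling) reconstruction basis on $[0, 1)$; the test function, grid, and normalizations are all assumptions.

```python
# Generalized sampling sketch: recover a function from its first M Fourier
# coefficients in a P-dimensional pixel basis via least squares.
import numpy as np

M, P = 64, 32                       # sampling bandwidth and resolution
freqs = np.arange(-M // 2, M // 2)  # sampled Fourier frequencies

def pixel_fourier_matrix(freqs, P):
    """U[i, j] = <phi_j, psi_i>: Fourier coefficient of the j-th pixel basis
    function phi_j = sqrt(P) * 1_{[j/P, (j+1)/P)} on [0, 1)."""
    U = np.empty((len(freqs), P), dtype=complex)
    edges = np.arange(P + 1) / P
    for i, m in enumerate(freqs):
        if m == 0:
            U[i] = np.sqrt(P) * (edges[1:] - edges[:-1])
        else:
            c = np.exp(-2j * np.pi * m * edges)
            U[i] = np.sqrt(P) * (c[1:] - c[:-1]) / (-2j * np.pi * m)
    return U

f = lambda t: np.exp(-3 * t) * np.sin(6 * np.pi * t)   # assumed test function

# Measurements y_i = <f, psi_i>, approximated by quadrature on a fine grid.
t = np.linspace(0, 1, 4096, endpoint=False)
y = np.array([np.mean(f(t) * np.exp(-2j * np.pi * m * t)) for m in freqs])

U = pixel_fourier_matrix(freqs, P)
a_hat, *_ = np.linalg.lstsq(U, y, rcond=None)   # GS coefficients, eq. (4)

# Reconstruction f_tilde = sum_j a_hat[j] * phi_j, eq. (3).
f_tilde = np.sqrt(P) * a_hat[np.floor(t * P).astype(int)]
print("relative L2 error:", np.linalg.norm(f(t) - f_tilde) / np.linalg.norm(f(t)))
```

In line with the linear scaling between $M$ and $P$ for Fourier–wavelet pairs noted above, keeping $M$ a constant multiple of $P$ keeps $\sigma_{\min}(U)$ bounded away from zero in this example.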

2.1 Generalized sampling with random noise

In what follows, we need to consider the error bound (5) when the measurements of $f$ are perturbed by random noise. To this end, let us assume that the measurements are $y_i = \langle f, \psi_i \rangle + \epsilon_i$, $i = 1, \dots, M$, where $\epsilon_1, \dots, \epsilon_M$ are i.i.d. Gaussian random variables in $\mathbb{C}$ with mean zero and variance $\sigma^2$, i.e. $\operatorname{Re}(\epsilon_i)$ and $\operatorname{Im}(\epsilon_i)$ are i.i.d. Gaussian random variables in $\mathbb{R}$ with mean zero and variance $\sigma^2 / 2$. Let us now define the noisy GS reconstruction

$\tilde f^{\epsilon} = \sum_{j=1}^{P} \hat a^{\epsilon}_j \varphi_j, \qquad \hat a^{\epsilon} \in \operatorname{argmin}_{a \in \mathbb{C}^P} \| U a - y \|_2,$    (7)

where $y = (y_1, \dots, y_M)^{\top}$. For simplicity, let $\{\psi_j\}_{j \in \mathbb{N}}$ be a Riesz basis such that (6) holds and $\{\varphi_j\}_{j \in \mathbb{N}}$ an orthonormal basis, so that we can use $\cos \theta(\mathcal{H}_P, \mathcal{S}_M) \ge \sigma_{\min}(U) / \sqrt{B}$ and $\| \tilde f^{\epsilon} - \tilde f \| = \| \hat a^{\epsilon} - \hat a \|_2$, where $U$ is the system matrix in (4) and $\tilde f$ is defined in (3). Since the least-squares solution depends on the noise only through $(U^* U)^{-1} U^* \epsilon$, by the finite-sample bound for the least-squares estimator, see e.g. [Hsu et al., 2012b], for any $\delta \in (0, 1)$ we have $\| \hat a^{\epsilon} - \hat a \|_2 \lesssim \sigma \sqrt{P + \log(1/\delta)} \, / \, \sigma_{\min}(U)$ with probability at least $1 - \delta$. Therefore, by the triangle inequality, with probability at least $1 - \delta$, $\tilde f^{\epsilon}$ satisfies

$\| f - \tilde f^{\epsilon} \| \;\lesssim\; \frac{\sqrt{B}}{\sigma_{\min}(U)} \, \| f - Q_P f \| \;+\; \frac{\sigma \sqrt{P + \log(1/\delta)}}{\sigma_{\min}(U)}.$    (8)

Moreover, similarly to the approach by [Cohen et al., 2013], if we assume a uniform bound on $f$, that is, for an $L > 0$ we consider functions $f$ such that $\| f \|_{\infty} \le L$, and define a truncation operator

$T_L g(t) \;:=\; \min\{ |g(t)|, L \} \, \frac{g(t)}{|g(t)|},$    (9)

so that we may use the truncated reconstruction $\tilde f^{\epsilon, L} := T_L \tilde f^{\epsilon}$, then from the high-probability bound in (8) we obtain a corresponding bound on the expected squared error $\mathbb{E} \, \| f - \tilde f^{\epsilon, L} \|^2$, since the truncated error is uniformly bounded on the low-probability event.

Furthermore, due to [Mallat, 2008], we know that if $\mathcal{H}_P$ is the subspace spanned by the boundary-corrected Daubechies wavelets with $q$ vanishing moments and $f$ is $\alpha$-Hölder continuous, $0 < \alpha < q$, then $\| f - Q_P f \| \lesssim P^{-\alpha}$. Thus, in this case, for suitable choices of $\delta$ and of the noise level $\sigma$ relative to $P$, if $M$ and $P$ are such that the angle condition holds and $\sigma_{\min}(U)$ is bounded away from zero, then

$\mathbb{E} \, \| f - \tilde f^{\epsilon, L} \|^2 \;\lesssim\; P^{-2\alpha}.$    (10)
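
As a small illustration of the role of $T_L$, the following sketch (a toy with assumed data, not the paper's code) implements a truncation operator of the radial form (9) and checks the pointwise contraction property that underlies the expectation bound: clipping never increases the error to a function bounded by $L$.

```python
# Truncation operator sketch: values with modulus above L are radially
# clipped, preserving the phase (or sign) of the input.
import numpy as np

def truncate(g, L):
    """T_L(g): clip |g| at L while preserving the phase of g."""
    mod = np.abs(g)
    scale = np.where(mod > L, L / np.maximum(mod, 1e-300), 1.0)
    return g * scale

# Since truncation is the pointwise projection onto the ball of radius L,
# and f lies in that ball, |f - T_L(g)| <= |f - g| pointwise.
rng = np.random.default_rng(0)
f = np.clip(rng.standard_normal(1000), -1, 1)   # |f| <= L = 1
g = f + 0.3 * rng.standard_normal(1000)         # noisy estimate of f
print(np.linalg.norm(f - truncate(g, 1.0)) <= np.linalg.norm(f - g))  # True
```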

3 Functional Principal Component Analysis (FPCA)

If $X$ is a random field with probability measure $\mu$ on $\mathrm{L}^2(D)$, with mean $m(t) = \mathbb{E}[X(t)]$ and covariance $c(s, t) = \operatorname{Cov}(X(s), X(t))$, $s, t \in D$, then by Mercer's lemma there exist a non-increasing sequence of non-negative eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ and an orthonormal sequence of eigenfunctions $\{\phi_k\}_{k \in \mathbb{N}}$ of the covariance operator $C g = \int_D c(\cdot, t)\, g(t) \, \mathrm{d}t$ such that $C \phi_k = \lambda_k \phi_k$, $k \in \mathbb{N}$, and such that

$X = m + \sum_{k=1}^{\infty} \sqrt{\lambda_k} \, \xi_k \, \phi_k,$    (11)

where $\xi_1, \xi_2, \dots$ are uncorrelated random factors with zero mean and unit variance. Moreover, if $X$ is a Gaussian field, then the $\xi_k$ are standard Gaussian random variables. The eigenfunctions $\phi_k$ are also known as the functional principal components (FPCs) of $X$, and the expression (11) is known as the Karhunen-Loève (KL) expansion of $X$; see for example [Ramsay and Silverman, 2005]. Such a representation of $X$ is known to be optimal in the following sense:

$\mathbb{E} \, \Big\| X - m - \sum_{k=1}^{K} \langle X - m, g_k \rangle \, g_k \Big\|^2 \;\ge\; \mathbb{E} \, \Big\| X - m - \sum_{k=1}^{K} \langle X - m, \phi_k \rangle \, \phi_k \Big\|^2 \;=\; \sum_{k > K} \lambda_k,$    (12)

for any $K \in \mathbb{N}$ and any functions $g_1, \dots, g_K$ with $\langle g_k, g_l \rangle = 1$ if $k = l$ and zero otherwise.
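
To ground (11)–(12), here is a minimal simulation sketch; the sine eigenfunctions, the quadratic eigenvalue decay, and the zero mean are assumptions chosen purely for illustration.

```python
# Drawing realizations of a Gaussian random field from a truncated
# Karhunen-Loeve expansion (11).
import numpy as np

t = np.linspace(0, 1, 512, endpoint=False)
K = 20
lam = 1.0 / np.arange(1, K + 1) ** 2                 # assumed eigenvalue decay
phi = np.array([np.sqrt(2) * np.sin(np.pi * k * t)   # orthonormal on [0, 1]
                for k in range(1, K + 1)])
mean = np.zeros_like(t)

def draw_field(rng):
    """One realization x = mean + sum_k sqrt(lam_k) * xi_k * phi_k."""
    xi = rng.standard_normal(K)                      # standard Gaussian factors
    return mean + (np.sqrt(lam) * xi) @ phi

x = draw_field(np.random.default_rng(1))
# Optimality (12): truncating the KL expansion keeps the largest eigenvalues,
# so the expected squared truncation error is the eigenvalue tail.
print("expected truncation error at K'=5:", lam[5:].sum())
```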

3.1 Empirical high-resolution functional principal components

Since in practice we observe only finitely many noisy coefficients of $X$ with respect to the first $P$ elements of an orthonormal basis $\{\varphi_j\}_{j \in \mathbb{N}}$, let us now consider the finite-dimensional high-resolution subspace $\mathcal{H}_P = \operatorname{span}\{\varphi_1, \dots, \varphi_P\}$ and let $Q_P$ denote the orthogonal projection onto $\mathcal{H}_P$. First, consider the $\mathcal{H}_P$-valued random variable $Q_P X$, whose mean is denoted by $m^P$ and covariance by $C^P$, with the corresponding eigenfunctions and eigenvalues denoted by $\phi_k^P$ and $\lambda_k^P$, respectively. If we now define the $\mathbb{C}^P$-valued random variable

$W = \big( \langle X, \varphi_1 \rangle, \dots, \langle X, \varphi_P \rangle \big)^{\top},$

we see that its mean vector $\mathbb{E}[W] = (\langle m, \varphi_1 \rangle, \dots, \langle m, \varphi_P \rangle)^{\top}$ corresponds to $m^P$ and its covariance matrix $\Sigma = \operatorname{Cov}(W)$ corresponds to $C^P$. Writing $v_k \in \mathbb{C}^P$ for the $k$th eigenvector of $\Sigma$, it then follows that $\phi_k^P = \sum_{j=1}^{P} (v_k)_j \varphi_j$ and that the eigenvalues of $\Sigma$ coincide with $\lambda_k^P$, $k = 1, \dots, P$. Moreover, if $X$ is a Gaussian random field, then $W$ is a multivariate Gaussian with mean $\mathbb{E}[W]$ and covariance $\Sigma$, since any finite-dimensional section of a Gaussian process is multivariate Gaussian.

We can now model the training observations (2) as realizations of the i.i.d. multivariate random variables

$Z_i = W_i + \eta_i, \qquad i = 1, \dots, N,$    (13)

where $W_1, \dots, W_N$ are i.i.d. copies of $W$, and $\eta_1, \dots, \eta_N$ are i.i.d. Gaussian on $\mathbb{C}^P$ with mean zero and covariance $\tau^2 I_P$, which are also independent of the $W_i$'s. If $\mu$ is Gaussian, then $Z_1, \dots, Z_N$ are Gaussian with mean $\mathbb{E}[W]$ and covariance $\Sigma + \tau^2 I_P$, in which case it is known that the eigenvectors of the sample covariance matrix $\widehat\Sigma = \frac{1}{N} \sum_{i=1}^{N} (Z_i - \bar Z)(Z_i - \bar Z)^*$, where $\bar Z = \frac{1}{N} \sum_{i=1}^{N} Z_i$, are consistent estimators of the eigenvectors of $\Sigma + \tau^2 I_P$ as $N \to \infty$, e.g. [Koltchinskii and Lounici, 2017b]. Moreover, by utilizing the classical results of the Galerkin method, e.g. [Babuška and Osborn, 1987], we can obtain the following high-probability bound on the distance between the space spanned by the eigenfunctions at the population level, $\mathcal{U}_K = \operatorname{span}\{\phi_1, \dots, \phi_K\}$, and the space spanned by the high-resolution empirical eigenfunctions, $\widehat{\mathcal{U}}_K^P = \operatorname{span}\{\hat\phi_1^P, \dots, \hat\phi_K^P\}$, where $\hat\phi_k^P = \sum_{j=1}^{P} (\hat v_k)_j \varphi_j$ and $\hat v_k$ is the $k$th eigenvector of $\widehat\Sigma$.
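
Before stating the lemma, the following sketch shows the estimation step it analyzes: forming the sample covariance of the noisy training vectors (13), extracting its top-$K$ eigenvectors, and measuring the subspace distance to the population eigenspace. The coefficient-domain ground truth here is an assumption for illustration.

```python
# Empirical high-resolution FPCs from noisy coefficient vectors Z_i in (13).
import numpy as np

P, N, K = 64, 200, 5
rng = np.random.default_rng(2)

# Assumed ground truth in the coefficient domain: zero mean, eigenvalue
# decay lam_k, population eigenvectors = first K coordinate directions.
lam = 1.0 / np.arange(1, K + 1) ** 2
V = np.eye(P)[:, :K]

# Training observations: coefficients of X_i plus isotropic Gaussian noise.
scores = rng.standard_normal((N, K)) * np.sqrt(lam)
Z = scores @ V.T + 0.05 * rng.standard_normal((N, P))

Z_bar = Z.mean(axis=0)                        # sample mean
S = (Z - Z_bar).T @ (Z - Z_bar) / N           # sample covariance (P x P)
evals, evecs = np.linalg.eigh(S)              # ascending eigenvalues
evecs = evecs[:, ::-1][:, :K]                 # top-K empirical eigenvectors

# Subspace distance between the population and empirical spans, as the
# operator norm of the difference of the orthogonal projectors.
proj_diff = V @ V.T - evecs @ evecs.T
print("subspace distance:", np.linalg.norm(proj_diff, 2))
```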

Lemma 1.

Let $\mu$ be a Gaussian measure on $\mathrm{L}^2(D)$ with mean $m$, eigenfunctions $\{\phi_k\}_{k \in \mathbb{N}}$ and eigenvalues $\{\lambda_k\}_{k \in \mathbb{N}}$, and let $\{\varphi_j\}_{j \in \mathbb{N}}$ be an orthonormal basis in $\mathrm{L}^2(D)$. For any $K$ and $P$ with $K \le P$, let $\mathcal{U}_K = \operatorname{span}\{\phi_1, \dots, \phi_K\}$, $\mathcal{H}_P = \operatorname{span}\{\varphi_1, \dots, \varphi_P\}$, and let $Q_P$ denote the orthogonal projection onto $\mathcal{H}_P$. For any $N$, let $Z_1, \dots, Z_N$ be as in (13), and also define the empirical subspace $\widehat{\mathcal{U}}_K^P = \operatorname{span}\{\hat\phi_1^P, \dots, \hat\phi_K^P\}$, the Galerkin error $\varepsilon_{K,P} := \max_{k \le K} \| \phi_k - Q_P \phi_k \|$, and the distances $b_1 := d(\mathcal{U}_K, \widehat{\mathcal{U}}_K^P)$ and $b_2 := d(Q_P \, \mathcal{U}_K, \widehat{\mathcal{U}}_K^P)$. Then

  • there exist constants $c_1, c_2 > 0$ such that, for any $\delta \in (0, 1)$, with probability at least $1 - \delta$, the distance $b_1$ is bounded by the sum of the Galerkin error $\varepsilon_{K,P}$ and a sampling-error summand that decays in $N$,

    provided that $\varepsilon_{K,P}$ is sufficiently small;

  • for any $\delta \in (0, 1)$, with probability at least $1 - \delta$, the corresponding bound on $b_2$ holds without the latter proviso.

The proof of Lemma 1 is given in Appendix A. We now discuss the order of the bounds $b_1$ and $b_2$ derived in this lemma, since these play an important role later on. First of all, observe that the order of the sampling-error summand in these bounds is, up to a factor logarithmic in $1/\delta$, $\sqrt{P/N}$, provided that $N \gtrsim P$. Moreover, if $\mu$ is a probability measure on the space of $\alpha$-Hölder continuous functions and $\mathcal{H}_P$ is the $P$-dimensional space of boundary-corrected wavelets with $q > \alpha$ vanishing moments, then $\varepsilon_{K, P} \lesssim P^{-\alpha}$, and therefore, if also $N \gtrsim P$, we have

$b_1 \;\lesssim\; P^{-\alpha} + \sqrt{P / N}.$    (14)

Note that such a bound improves with increasing $P$, provided $N$ is also increasing. In particular, if $N \gtrsim P^{1 + 2\alpha}$, then we can obtain the $P^{-\alpha}$-rate of approximation up to a constant, namely for such $N$ we have a bound of order $P^{-\alpha}$.

4 GS-FPCA reconstruction methodology

In this section, we derive and analyze a method for estimating a random field $X$ from its measurements taken with respect to the first $M$ elements of a Riesz basis $\{\psi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$ such that the Riesz inequality (6) holds, by accounting for the statistical information contained in the observed coefficients of i.i.d. copies of $X$ with respect to the first $P$ elements of another orthonormal basis $\{\varphi_j\}_{j \in \mathbb{N}}$ in $\mathrm{L}^2(D)$. To this end, we consider the following random variables:

  • $Y_j = \langle X, \psi_j \rangle + \epsilon_j$, $j = 1, \dots, M$, where $\epsilon_1, \dots, \epsilon_M$ are i.i.d. Gaussian in $\mathbb{C}$ with mean zero and variance $\sigma^2$, which are also independent of $X$. In particular, these random variables yield a realization of the low-resolution measurements in (1).

  • $Z_i = W_i + \eta_i$, $i = 1, \dots, N$, where $W_i = (\langle X_i, \varphi_1 \rangle, \dots, \langle X_i, \varphi_P \rangle)^{\top}$ for i.i.d. copies $X_1, \dots, X_N$ of $X$, and $\eta_1, \dots, \eta_N$ are i.i.d. Gaussian in $\mathbb{C}^P$ with mean zero and covariance $\tau^2 I_P$, which are also independent of the $\epsilon_j$'s and the $X_i$'s. As discussed previously, we model our high-resolution training set (2) as a realization of such random variables.

We define the reconstruction space as

$\widehat{\mathcal{U}}_K^P = \operatorname{span}\{ \hat\phi_1^P, \dots, \hat\phi_K^P \},$    (15)

where $\hat\phi_k^P = \sum_{j=1}^{P} (\hat v_k)_j \varphi_j$ and $\hat v_k \in \mathbb{C}^P$ is defined as the $k$th eigenvector of the sample covariance $\widehat\Sigma$, where $\bar Z$ denotes the sample mean. Namely,

$\widehat\Sigma = \frac{1}{N} \sum_{i=1}^{N} (Z_i - \bar Z)(Z_i - \bar Z)^{*}, \qquad \bar Z = \frac{1}{N} \sum_{i=1}^{N} Z_i,$    (16)

where $\widehat\Sigma \hat v_k = \hat\lambda_k \hat v_k$ and $\hat\lambda_1 \ge \cdots \ge \hat\lambda_P \ge 0$. Writing $\hat m^P = \sum_{j=1}^{P} (\bar Z)_j \varphi_j$ for the empirical mean function, we now propose to estimate $X$ in the reconstruction space defined in (15). Specifically, we define the estimator of $X$ as

$\widehat X = \hat m^P + \sum_{k=1}^{K} \hat b_k \, \hat\phi_k^P,$    (17)

where the coefficients $\hat b = (\hat b_1, \dots, \hat b_K)^{\top}$ are the least-squares solution to the linear system

$\widehat U \hat b = Y - \widehat U^{m}, \qquad \widehat U_{jk} = \langle \hat\phi_k^P, \psi_j \rangle, \quad \widehat U^{m}_j = \langle \hat m^P, \psi_j \rangle,$    (18)

namely

$\hat b \in \operatorname{argmin}_{b \in \mathbb{C}^K} \big\| \widehat U b - (Y - \widehat U^{m}) \big\|_2.$    (19)

It is useful to note that, if we denote the random system matrix in (18) by $\widehat U$, which takes values in $\mathbb{C}^{M \times K}$, and the system matrix in (4) by $U \in \mathbb{C}^{M \times P}$, then, since $\hat\phi_k^P = \sum_{j=1}^{P} (\hat v_k)_j \varphi_j$ and $U_{ij} = \langle \varphi_j, \psi_i \rangle$, we have $\widehat U = U \widehat V$, where $\widehat V = (\hat v_1, \dots, \hat v_K) \in \mathbb{C}^{P \times K}$.

When compared to the GS reconstruction defined in (7), our reconstruction defined in (17) also takes values in $\mathcal{H}_P$, but now the well-posedness of our solution depends on the value of the random variable $\sigma_{\min}(\widehat U)$ instead of $\sigma_{\min}(U)$. For a sufficiently large $N$, we can show that our proposed estimator can stably achieve the $\mathcal{H}_P$-rate of approximation provided $M$ and $K$ are such that $\sigma_{\min}(\widehat U)$ is bounded away from zero and the distance between $\mathcal{S}_M$ and $\mathcal{U}_K$ is sufficiently small. Specifically, we can show the following.
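
As a summary of the estimation pipeline, the following hedged sketch implements (15)–(19) end to end; the matrix $U$ of inner products is assumed to be available (e.g. from the Fourier–pixel construction in the Section 2 sketch), and the mean-centering convention and the restriction to a real-valued field are illustrative simplifications.

```python
# GS-FPCA estimator sketch, eqs. (15)-(19).
import numpy as np

def gs_fpca(y, U, Z, K):
    """y: (M,) noisy low-resolution measurements (1);
    U: (M, P) system matrix from (4), U[i, j] = <phi_j, psi_i>;
    Z: (N, P) noisy high-resolution training coefficients (2);
    returns the P coefficients of the estimate in the basis {phi_j}."""
    Z_bar = Z.mean(axis=0)                      # sample mean
    C = (Z - Z_bar).T @ (Z - Z_bar) / len(Z)    # sample covariance, eq. (16)
    evals, evecs = np.linalg.eigh(C)
    V = evecs[:, ::-1][:, :K]                   # top-K empirical FPCs, eq. (15)
    U_hat = U @ V                               # random system matrix in (18)
    # Least-squares FPC scores of the mean-centered field, eq. (19).
    b_hat, *_ = np.linalg.lstsq(U_hat, y - U @ Z_bar, rcond=None)
    return Z_bar + (V @ b_hat).real             # estimate (17) in H_P
```

The well-posedness of the least-squares step is governed by $\sigma_{\min}(\widehat U) = \sigma_{\min}(U \widehat V)$, matching the condition appearing in Theorem 2 below.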

Theorem 2.

Consider the setting of Lemma 1, and let $M$ and $K$ be such that $\sigma_{\min}(U V) \ge \beta$ for some $\beta > 0$, where $V = (v_1, \dots, v_K) \in \mathbb{C}^{P \times K}$ collects the first $K$ population eigenvectors, $U$ is the system matrix in (4), and $\{\psi_j\}_{j \in \mathbb{N}}$ is a Riesz basis such that (6) holds. Then there exists a constant $c > 0$ such that, for any $\delta \in (0, 1)$, any sufficiently large $N$, and any $\mathrm{L}^2(D)$-valued random field $X$ with distribution $\mu$, the estimator $\widehat X$ defined in (17) satisfies an error bound consisting of the best approximation error $\| X - Q_P X \|$, the eigenvalue tail beyond the first $K$ components, and noise and FPC-estimation terms that vanish as $N \to \infty$,

with probability at least $1 - \delta$.

The proof of this theorem is given in the appendix.