Approximate Message Passing with a Colored Aliasing Model for Variable Density Fourier Sampled Images

The Approximate Message Passing (AMP) algorithm efficiently reconstructs signals which have been sampled with large i.i.d. sub-Gaussian sensing matrices. However, when Fourier coefficients of a signal with non-uniform spectral density are sampled, such as in Magnetic Resonance Imaging (MRI), the aliasing is intrinsically colored. Consequently, AMP's i.i.d. state evolution is no longer accurate and the algorithm encounters convergence problems. In response, we propose an algorithm based on Orthogonal Approximate Message Passing (OAMP) that uses the wavelet domain to model the colored aliasing. We present empirical evidence that a structured state evolution occurs, where the effective noise covariance matrix is diagonal with one unique entry per subband. A benefit of state evolution is that Stein's Unbiased Risk Estimate (SURE) can be effectively implemented, yielding an algorithm with no free parameters. We empirically evaluate the effectiveness of the parameter-free algorithm on a synthetic image with three variable density sampling schemes and find that it converges in over 20x fewer iterations than optimally tuned Fast Iterative Shrinkage-Thresholding (FISTA).






1 Introduction

We consider a complex data vector $y \in \mathbb{C}^N$ formed of noisy Fourier coefficients of a signal of interest $x_0 \in \mathbb{C}^N$:

$$y = M_\Omega F x_0 + \varepsilon, \qquad (1)$$

where $F$ is a multi-dimensional discrete Fourier transform and $M_\Omega$ is a diagonal undersampling mask with $1$ on the $j$th diagonal entry if $j \in \Omega$ and $0$ otherwise, where $\Omega$ is a sampling set with $|\Omega| = n$ for $n \leq N$. Here, $\varepsilon \sim \mathcal{CN}(0, \sigma^2 I_N)$, where $I_N$ is the $N \times N$ identity matrix and $\mathcal{CN}(\mu, \Sigma)$ denotes the circularly symmetric complex normal distribution with independent real and imaginary parts that are normally distributed with mean $\mu$ and covariance matrix $\Sigma/2$. A well-studied approach is to seek a solution of

$$\hat{x} = \arg\min_x \; \frac{1}{2}\|y - M_\Omega F x\|_2^2 + f(x), \qquad (2)$$
where $f(x)$ is a model-based penalty function. Compressed sensing [Donoho2006CompressedSensing, Candes2006RobustInformation] concerns the reconstruction of signals of interest from underdetermined measurements, where sparsity in $\Psi x$ is promoted by solving Eq. 2 with $f(x) = \lambda\|\Psi x\|_1$ for sparse weighting $\lambda$ and sparsifying transform $\Psi$.
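For concreteness, the measurement model of Eq. 1 can be simulated in a few lines. The following sketch is our own illustration, not the paper's setup: the 1D signal, sizes and noise level are arbitrary choices, and the orthonormal DFT is used so that $F$ is unitary.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64                                   # signal length (1D for simplicity)
x0 = rng.standard_normal(N)              # stand-in for the signal of interest
sigma = 0.1                              # noise standard deviation

p = np.full(N, 0.5)                      # inclusion probability for each frequency
mask = rng.random(N) < p                 # diagonal of M_Omega as a boolean vector

# y = M_Omega F x0 + eps, with F the orthonormal DFT and circularly symmetric
# complex Gaussian noise (real and imaginary parts of variance sigma^2 / 2).
Fx0 = np.fft.fft(x0, norm="ortho")
eps = sigma * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
y = mask * (Fx0 + eps)

# Zero-filled adjoint reconstruction: unsampled coefficients stay at zero.
x_zf = np.fft.ifft(y, norm="ortho")
```

With `norm="ortho"` the transform preserves energy, which keeps noise variances directly comparable across domains.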

A prominent success of compressed sensing with Fourier measurements is accelerated Magnetic Resonance Imaging (MRI) [Lustig2007, Otazo2010CombinationMRI, Jaspan2015CompressedLiterature, Ye2019CompressedPerspective, Donoho2017HowIndustry], where the signal of interest is a natural image. Images of interest typically have a highly non-uniform spectral density that is concentrated at low frequencies. Accordingly, it is well-known that better image restoration is possible if the sampling set $\Omega$ is generated with variable density, so that there is a higher probability of sampling low frequencies [Puy2011OnSamplingb, wangVDS, Krahmer2014StableImaging, Chauffert2013VariableStrategies]. This work considers an $\Omega$ with elements drawn independently from a Bernoulli distribution with generic non-uniform probability, so that

$$\mathbb{P}(j \in \Omega) = p_j, \qquad j = 1, \ldots, N.$$
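The Bernoulli sampling model is straightforward to simulate. The polynomial-decay probability map below is a toy stand-in for a generic non-uniform density; its exact shape is our own assumption, since the model only requires independent non-uniform inclusion probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

def variable_density_probs(n, decay=2.0, p_min=0.02):
    """Toy probability map, peaked at the centre of k-space.
    The precise density is an illustrative assumption."""
    ky, kx = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n),
                         indexing="ij")
    r = np.sqrt(kx**2 + ky**2)
    p = (1.0 - np.minimum(r, 1.0)) ** decay
    return np.clip(p, p_min, 1.0)

def bernoulli_mask(p, rng):
    """Draw each frequency independently: P(j in Omega) = p_j."""
    return rng.random(p.shape) < p

P = variable_density_probs(128)
mask = bernoulli_mask(P, rng)
print("expected sampling fraction:", float(P.mean()))
print("realised sampling fraction:", float(mask.mean()))
```

By linearity, the expected sampling fraction is simply the mean of the probability map, which the realised mask matches up to Bernoulli fluctuations.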
1.1 Approximate Message Passing

The Approximate Message Passing (AMP) algorithm [Donoho2009Message-passingSensing.] is an iterative method that, for certain sensing matrices $A$, efficiently estimates $x_0$ in problems of the form of Eq. 2. At iteration $t$, AMP implements a denoiser $g(r_t; \tau_t)$ on an intermediate quantity $r_t$ with mean-squared error estimate $\tau_t$. For instance, for penalty function $f$, $g$ is the proximal operator associated with $f$:

$$g(r_t; \tau_t) = \arg\min_x \; \frac{1}{2}\|r_t - x\|_2^2 + \tau_t f(x), \qquad (3)$$

which is equal to soft thresholding in the case of $f(x) = \lambda\|\Psi x\|_1$ and orthogonal $\Psi$. For certain sensing matrices and given mild conditions on $g$, AMP's state evolution guarantees that in the large system limit $N \to \infty$ with $n/N$ fixed, vector $r_t$ is the original signal corrupted by zero-mean complex Gaussian noise with a covariance matrix that is proportional to the identity:

$$r_t = x_0 + \mathcal{CN}(0, \tau_t^2 I_N), \qquad (4)$$

where $\tau_t$ is a scalar iteration-dependent standard deviation. When the covariance matrix is proportional to the identity, as in Eq. 4, the noise and state evolution are referred to in this work as white. The white state evolution of AMP has been proven for real i.i.d. Gaussian measurements in [Bayati2011TheSensing] and i.i.d. sub-Gaussian measurements in [Bayati2015UniversalityAlgorithms]. It has also been shown empirically to hold for uniformly undersampled Fourier measurements of an artificial i.i.d. signal [Donoho2009Message-passingSensing.]. When state evolution holds, AMP is known to exhibit very fast convergence. However, for generic $A$, the behavior of AMP is not well understood, and it has been noted by a number of authors [Rangan2014OnMatrices, Caltagirone2014OnPassingb, Guo2015ApproximateTransformation, Rangan2016FixedMatrices] that it can encounter convergence problems. The recent Orthogonal AMP (OAMP) [Ma2017OrthogonalAMP] and the related Vector Approximate Message Passing (VAMP) [Rangan2019VectorPassing] algorithms obey a white state evolution for a broader class of measurement matrices, and were found to perform very well on certain reconstruction tasks. For VAMP, white state evolution was proven for sensing matrices that are 'right-orthogonally invariant': see [Rangan2019VectorPassing] for details.
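For $f(x) = \lambda\|\Psi x\|_1$ with orthogonal $\Psi$, the proximal denoiser reduces to entry-wise soft thresholding, which for complex inputs shrinks magnitudes and preserves phase. A minimal sketch of this operator (our own helper, not the paper's code):

```python
import numpy as np

def soft_threshold(r, lam):
    """Proximal operator of lam * ||w||_1 for complex w: shrink the
    magnitude of each entry by lam, preserving its phase."""
    mag = np.abs(r)
    scale = np.maximum(1.0 - lam / np.maximum(mag, 1e-300), 0.0)
    return scale * r

# Entries with magnitude below the threshold are set exactly to zero.
r = np.array([3 + 4j, 0.1 + 0.1j, -2.0])
print(soft_threshold(r, 1.0))   # -> [2.4+3.2j  0.+0.j  -1.+0.j]
```

The `np.maximum(mag, 1e-300)` guard avoids division by zero for exactly zero entries without changing the result.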

1.2 Colored aliasing

AMP, OAMP and VAMP assume that the entries of $x_0$ are i.i.d. and that the sensing matrix is sufficiently random to ensure that the aliasing is white. However, when an image is sampled in the Fourier domain the aliasing is innately colored. To see this, consider the natural initialization for an approximate message passing algorithm: the unbiased estimator $\tilde{x} = F^H P^{-1} y$, where $P$ is the diagonal matrix formed from the sampling probabilities $p_1, \ldots, p_N$. Denoting the expectation over $\Omega$ and $\varepsilon$ by $\mathbb{E}$, the power spectrum of the aliasing of $\tilde{x}$ is

$$\mathbb{E}\bigl[|F(\tilde{x} - x_0)|^2\bigr] = (P^{-1} - I_N)\,|F x_0|^2 + \sigma^2 P^{-1} 1_N, \qquad (5)$$

where $|\cdot|^2$ is the entry-wise absolute squared and $1_N$ is the $N$-dimensional vector of ones. Equation 5 depends on $|F x_0|^2$ and $P$, which are non-uniform and anisotropic in general. Note that although a specific choice of $P$ that depends on $|F x_0|^2$ does lead to white aliasing, it requires knowledge of the ground truth spectral density so is not feasible in practice.
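Eq. 5 is straightforward to check by Monte Carlo. The sketch below is our own construction, not the paper's experiment: a 1D low-pass test signal, a toy variable density, and the assumption that noise is observed only on the sampled coefficients.

```python
import numpy as np

rng = np.random.default_rng(2)

N, sigma, trials = 128, 0.05, 4000
spectrum = np.exp(-np.abs(np.fft.fftfreq(N)) * 40)   # non-uniform spectral density
x0 = np.fft.ifft(spectrum, norm="ortho")             # low-pass test signal
Fx0 = np.fft.fft(x0, norm="ortho")
p = np.clip(0.9 * np.exp(-np.abs(np.fft.fftfreq(N)) * 8), 0.05, 1.0)

acc = np.zeros(N)
for _ in range(trials):
    mask = rng.random(N) < p
    eps = sigma * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    y = mask * (Fx0 + eps)
    acc += np.abs(y / p - Fx0) ** 2                  # |F(x_tilde - x0)|^2
empirical = acc / trials

# Prediction of Eq. 5: (P^{-1} - I) |F x0|^2 + sigma^2 P^{-1} 1_N
predicted = (1 / p - 1) * np.abs(Fx0) ** 2 + sigma**2 / p
print("mean relative mismatch:", float(np.mean(np.abs(empirical / predicted - 1))))
```

The empirical aliasing power spectrum tracks the prediction entry-by-entry, and is visibly non-flat: colored aliasing.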

Figure 1: An unbiased estimate and its error for a uniformly sampled Shepp-Logan. The colored aliasing evident in Fig. 1(c) illustrates the infeasibility of white state evolution for Fourier sampling of signals with non-uniform spectral density.

A visual example of colored aliasing is shown in Fig. 1. Here, the unbiased estimate $\tilde{x}$ of a 512x512 synthetic Shepp-Logan, uniformly sampled with identical $p_j$ for all $j$, is shown. The power spectrum of the aliasing is proportional to the power spectrum of the signal, so is highly colored. Sampling with variable density can partially but not entirely compensate for the power spectrum of the signal.

Figure 2: The ground truth, unbiased estimate and entry-wise absolute error in the wavelet domain for the same sampling set as Fig. 1. For visualization, Fig. 2(c) shows the entry-wise logarithm. The anisotropy of the spectral density of $x_0$ causes the horizontal, vertical and diagonal variances to differ, even at the same scale.

The intrinsically colored aliasing of variable density sampling from a non-uniform spectral density implies that the white state evolution of AMP, Eq. 4, is not feasible. The primary development of this work is based on the use of the Discrete Wavelet Transform (DWT) to compute a multiresolution decomposition of the power spectrum of the aliasing [Goossens2011Wavelet-BasedNoise, Johnstone1997WaveletNoise]. In the wavelet domain, colored aliasing has a structure that resembles a state evolution. To illustrate this, consider again the unbiased initialization $\tilde{x} = F^H P^{-1} y$. The corresponding estimator in the wavelet domain is

$$\tilde{w} = \Psi \tilde{x} = \Psi F^H P^{-1} y. \qquad (6)$$

Fig. 2 shows $w_0 = \Psi x_0$, $\tilde{w}$ and the entry-wise absolute difference $|\tilde{w} - w_0|$ for the same image and sampling set as Fig. 1, where $\Psi$ is a Haar wavelet with 4 decomposition scales. Fig. 2(c) indicates that the power spectrum given by Eq. 5 has a simple structure in the wavelet domain; in particular, each wavelet subband behaves as the ground truth subband corrupted by zero-mean white Gaussian noise. The variance of the Gaussian noise depends on the power spectrum of the aliasing and is different for each subband, even for subbands at the same scale.
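The subband behaviour can be reproduced with a single level of an orthonormal 2D Haar DWT applied to anisotropic colored noise, a stand-in for the aliasing. The filters and sizes below are our own toy choices, not the paper's 4-scale decomposition.

```python
import numpy as np

rng = np.random.default_rng(3)

def haar2_level1(img):
    """One level of an orthonormal 2D Haar DWT, returning the four
    subbands: approximation, and the two mixed and one diagonal detail."""
    a = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)   # low-pass along axis 0
    d = (img[0::2, :] - img[1::2, :]) / np.sqrt(2)   # high-pass along axis 0
    ll = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    lh = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)      # high-pass along axis 1 only
    hl = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)      # high-pass along axis 0 only
    hh = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return ll, lh, hl, hh

# Anisotropic colored noise: filter white noise more strongly along one
# axis than the other, mimicking an anisotropic aliasing power spectrum.
n = 256
f = np.fft.fftfreq(n)
Hy, Hx = np.meshgrid(np.exp(-np.abs(f) * 10), np.exp(-np.abs(f) * 2),
                     indexing="ij")
noise = np.fft.ifft2(np.fft.fft2(rng.standard_normal((n, n))) * Hy * Hx).real

bands = haar2_level1(noise)
for name, band in zip(["LL", "LH", "HL", "HH"], bands):
    print(name, "variance:", float(band.var()))
```

Each subband is approximately white internally, but the two detail orientations at the same scale carry clearly different variances, mirroring Fig. 2(c).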

1.3 Colored state evolution

Herein we present a new method for undersampled signal reconstruction that we term the Variable Density Approximate Message Passing (VDAMP) algorithm; see Algorithm 1. We present empirical evidence that VDAMP preserves the subband-dependent noise structure illustrated in Fig. 2(c) for all iterations. Explicitly, the $r_t$ of VDAMP behaves as

$$r_t = \Psi x_0 + \mathcal{CN}(0, \Sigma_t), \qquad (7)$$

where $\Psi$ is an orthogonal DWT and the covariance matrix $\Sigma_t$ is diagonal, so that for a $\Psi$ with $s$ decomposition scales

$$\Sigma_t = \mathrm{diag}\bigl(\tau_{t,1} 1_{N_1},\; \tau_{t,2} 1_{N_2},\; \ldots,\; \tau_{t,3s+1} 1_{N_{3s+1}}\bigr), \qquad (8)$$

where $\tau_{t,b}$ and $N_b$ refer to the variance and dimension of the $b$th subband respectively. We refer to Eqs. 7 and 8 as the colored state evolution of VDAMP and $r_t - \Psi x_0$ as the effective noise of VDAMP. To our knowledge, VDAMP is the first algorithm for variable density Fourier sampling of natural images where a state evolution has been observed.

The joint time-frequency localization provided by the wavelet transform decomposes the color of the effective noise while retaining incoherence. In the numerical examples presented in Section 3 a sparse model on wavelet coefficients is considered; however, we emphasize that VDAMP uses the wavelet transform only as a tool for decomposing the aliasing, and is not constrained to models directly on wavelet coefficients [Metzler2016FromSensing, Xue2016D-OAMP:Sensing].

2 Description of algorithm

The VDAMP algorithm is stated in Algorithm 1, where $|\cdot|$ is the entry-wise absolute value of a vector or matrix and $\odot$ is entry-wise multiplication. The effective noise covariance matrix is represented by a vector $\hat{\tau}_t$ that models the diagonal of $\Sigma_t$ from Eq. 7. The function $g(r_t; \hat{\tau}_t)$ is a denoiser with a colored effective noise model $\hat{\tau}_t$. The notation $\partial g(r; \hat{\tau})/\partial r$ in line 7 of Algorithm 1 is defined as the function with $i$th entry

$$\left[\frac{\partial g(r;\hat{\tau})}{\partial r}\right]_i = \frac{1}{2}\left(\frac{\partial\,\mathrm{Re}([g(r;\hat{\tau})]_i)}{\partial\,\mathrm{Re}(r_i)} + \frac{\partial\,\mathrm{Im}([g(r;\hat{\tau})]_i)}{\partial\,\mathrm{Im}(r_i)}\right), \qquad (9)$$

where Re and Im denote the real and imaginary parts respectively. In line 8, the notation $\langle\cdot\rangle$ is an operator that empirically averages over subbands, so that $\langle v \rangle$ has the structure

$$\langle v \rangle_i = \frac{1}{N_b}\sum_{j \in \mathcal{B}_b} v_j \quad \text{for all } i \in \mathcal{B}_b, \qquad b = 1, \ldots, 3s+1, \qquad (10)$$

where $\mathcal{B}_b$ is the set of indices associated with subband $b$ and $N_b = |\mathcal{B}_b|$.
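The $\langle\cdot\rangle$ operator of Eq. 10 amounts to replacing every entry by the mean over its subband. A sketch with hypothetical index sets (the values and partition below are toy assumptions):

```python
import numpy as np

def subband_average(v, bands):
    """<v>: replace each entry of v by the empirical average over its
    subband. `bands` is a list of index arrays, one per subband (the
    sets B_b of Eq. 10)."""
    out = np.empty_like(v, dtype=float)
    for idx in bands:
        out[idx] = v[idx].mean()
    return out

# Toy example with two subbands.
v = np.array([1.0, 3.0, 10.0, 30.0, 50.0])
bands = [np.array([0, 1]), np.array([2, 3, 4])]
print(subband_average(v, bands))   # -> [ 2.  2. 30. 30. 30.]
```

The output is piecewise constant on the subband partition, which is exactly the structure assumed for $\alpha_t$ and $\hat{\tau}_t$.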

Require: Sampling set $\Omega$, wavelet transform $\Psi$, probability matrix $P$, measurements $y$, denoiser $g$, number of iterations $T$.

1:  Set $\hat{w}_0 = 0$
2:  for $t = 0, 1, \ldots, T-1$ do
3:     $z_t = y - M_\Omega F \Psi^H \hat{w}_t$
4:     $r_t = \hat{w}_t + \Psi F^H P^{-1} z_t$
5:     $\hat{\tau}_t = |\Psi F^H|^2 \bigl[(P^{-2} - P^{-1})\,|z_t|^2 + \sigma^2 1_N\bigr]$
6:     $\hat{w}_{t+1} = g(r_t; \hat{\tau}_t)$
7:     Compute $\partial g(r_t; \hat{\tau}_t)/\partial r_t$
8:     Update $\alpha_t = \bigl\langle \partial g(r_t; \hat{\tau}_t)/\partial r_t \bigr\rangle$
9:     $\hat{w}_{t+1} \leftarrow (\hat{w}_{t+1} - \alpha_t \odot r_t) \oslash (1 - \alpha_t)$
10:  end for
11:  return $\hat{x} = \Psi^H \hat{w}_T + F^H\bigl(y - M_\Omega F \Psi^H \hat{w}_T\bigr)$
Algorithm 1 Variable Density Approximate Message Passing (VDAMP)

2.1 Weighted gradient descent

To ensure that $r_t$ is an unbiased estimate of $w_0$, the sensing matrix must be correctly normalized. In VDAMP this is manifest in the gradient step of lines 3-4, which features a crucial weighting by $P^{-1}$ that is absent in previous applications of AMP to variable density sampling [Sung2013LocationMRI, Eksioglu2018DenoisingBM3D-AMP-MRI], where a state evolution was not observed. This weighting provides the correct normalization in expectation over $\Omega$: $\mathbb{E}[P^{-1} M_\Omega] = I_N$, which implies that $\mathbb{E}[r_t] = w_0$ for all $t$. VDAMP's $r_0$ is the unbiased estimator from Eq. 6. Such a rescaling is referred to as 'density compensation' in the MRI literature [Pipe1999SamplingSolution, Pruessmann2001AdvancesTrajectories], and was used in the original compressed sensing MRI paper with zero-filling to generate an unregularized, non-iterative baseline [Lustig2007].

Scaling by $P^{-1}$ increases the variance of the measurement noise at frequencies sampled with low probability, and its inclusion in the gradient step will lead to an $r_t$ with higher mean-squared error than an unweighted gradient step. However, as shown in Section 3, a careful choice of denoiser that leverages VDAMP's state evolution can cause lines 6-9 to be very effective, leading to faster overall convergence than competing methods.

The final step of VDAMP, line 11, is a gradient step without a weighting, which generates a biased image estimate with high data fidelity.
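The effect of the density-compensation weighting can be checked numerically: averaged over many sampling sets, the $P^{-1}$-weighted zero-filled estimate recovers the signal, while the unweighted estimate is biased. A Monte Carlo sketch under our own toy choices, in the noiseless case:

```python
import numpy as np

rng = np.random.default_rng(4)

N, trials = 64, 20000
x0 = rng.standard_normal(N) + 1j * rng.standard_normal(N)
Fx0 = np.fft.fft(x0, norm="ortho")
p = np.clip(rng.random(N), 0.2, 1.0)     # arbitrary non-uniform probabilities

masks = rng.random((trials, N)) < p      # one Bernoulli sampling set per row

# Density-compensated estimate: F^H P^{-1} M_Omega F x0, averaged over Omega.
weighted = np.fft.ifft(masks * Fx0 / p, axis=1, norm="ortho").mean(axis=0)
# Without the P^{-1} weighting the expectation is F^H P F x0, not x0.
unweighted = np.fft.ifft(masks * Fx0, axis=1, norm="ortho").mean(axis=0)

print("weighted   relative bias:",
      float(np.linalg.norm(weighted - x0) / np.linalg.norm(x0)))
print("unweighted relative bias:",
      float(np.linalg.norm(unweighted - x0) / np.linalg.norm(x0)))
```

The weighted estimate's residual bias shrinks with the number of trials, while the unweighted bias persists regardless of how many sampling sets are averaged.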

2.2 Colored effective noise model

Line 5 of Algorithm 1 computes an estimate $\hat{\tau}_t$ of the diagonal of the colored effective noise covariance matrix $\Sigma_t$ from Eq. 7. Through a similar calculation to Eq. 5, the power spectrum of the aliasing of $r_t$ is

$$\mathbb{E}\bigl[|F \Psi^H (r_t - w_0)|^2\bigr] = (P^{-1} - I_N)\,\bigl|F(x_0 - \Psi^H \hat{w}_t)\bigr|^2 + \sigma^2 P^{-1} 1_N. \qquad (12)$$

Using importance sampling with importance distribution $p$, an unbiased estimate of Eq. 12 that does not require knowledge of the ground truth is

$$(P^{-2} - P^{-1})\,|z_t|^2 + \sigma^2 1_N, \qquad (13)$$

where $z_t = y - M_\Omega F \Psi^H \hat{w}_t$ is the Fourier-domain residual, which is supported on $\Omega$. The computation of $\hat{\tau}_t$ in line 5 of Algorithm 1 is a linear transform of Eq. 13 to the wavelet domain. The matrix $|F\Psi^H|^2$, whose columns are the power spectra of the wavelet basis functions, has $3s+1$ unique columns; line 5 therefore requires $3s+1$ inner products. We assume the estimator concentrates around its expectation, and leave the study of the constraints this imposes on $p$ for future work. Note that for fixed $s$ the complexity of VDAMP is governed by $F$ and $\Psi$, whose fast implementations have complexity $O(N \log N)$ and $O(N)$ respectively.

2.3 Orthogonality between errors

This section motivates the form of lines 6-9 of Algorithm 1. Consider the function formed by merging these lines:

$$\hat{g}(r, \hat{\tau}) = \frac{g(r; \hat{\tau}) - \alpha \odot r}{1 - \alpha}, \qquad \alpha = \left\langle \frac{\partial g(r; \hat{\tau})}{\partial r} \right\rangle, \qquad (14)$$

where the division is entry-wise and $\alpha$ is constant on each subband. In the following we show that when $r_t$ behaves as Eq. 7, this construction establishes approximate orthogonality between error vectors $r_t - w_0$ and $\hat{w}_{t+1} - w_0$. Orthogonality between errors is used in OAMP as a non-rigorous alternative to independence between errors, a requirement for state evolution; see [Ma2017OrthogonalAMP] for details.

Consider a random variable $r = w_0 + \mathcal{CN}(0, \Sigma)$, where $w_0 = \Psi x_0$ and $\Sigma$ is defined in Eq. 8. Since $\mathbb{E}[r - w_0] = 0$,

$$\mathbb{E}\bigl[(r - w_0)^H \bigl(\hat{g}(r, \Sigma) - w_0\bigr)\bigr] = \mathbb{E}\left[(r - w_0)^H\, \frac{g(r;\Sigma) - w_0 - \alpha \odot (r - w_0)}{1 - \alpha}\right], \qquad (15)$$

which, by Stein's Lemma [Stein1972AVariables, Eldar_2009], is

$$\sum_{b=1}^{3s+1} \frac{\tau_b}{1 - \alpha_b}\left(\sum_{i \in \mathcal{B}_b} \mathbb{E}\left[\left[\frac{\partial g}{\partial r}\right]_i\right] - \alpha_b N_b\right), \qquad (16)$$

where the factor of $1/2$ in the divergence of Eq. 9 arises due to the covariance matrix of the real and imaginary parts of $r$, each of which is $\Sigma/2$. If VDAMP's $r_t$ are instances of the random variable $r$, $\alpha_b$ is an estimate of $N_b^{-1}\sum_{i \in \mathcal{B}_b} \mathbb{E}[[\partial g/\partial r]_i]$, and the right-hand side of Eq. 16 is approximately zero. Therefore, for large $N_b$,

$$\mathbb{E}\bigl[(r_t - w_0)^H (\hat{w}_{t+1} - w_0)\bigr] \approx 0. \qquad (17)$$

Note that this property holds for arbitrary $g$ satisfying mild regularity conditions. Two choices for $\alpha_t$ are discussed in Section 3.2.
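The orthogonality established by the merged correction of lines 6-9 can be illustrated numerically for a single subband with complex soft thresholding. This is our own Monte Carlo sketch, not the paper's code: the correction $(g - \alpha r)/(1 - \alpha)$, with $\alpha$ an empirical average of the divergence (half-sum of real and imaginary partial derivatives), follows the construction described above.

```python
import numpy as np

rng = np.random.default_rng(6)

N, tau, lam = 200000, 0.5, 0.8
w0 = np.where(rng.random(N) < 0.1, 3.0 * rng.standard_normal(N), 0.0)
h = np.sqrt(tau / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
r = w0 + h                                  # r = w0 + CN(0, tau I), one subband

def soft(r, lam):
    mag = np.abs(r)
    return np.maximum(1.0 - lam / np.maximum(mag, 1e-300), 0.0) * r

g = soft(r, lam)
# Divergence of complex soft thresholding: half-sum of Re/Im partials.
div = np.where(np.abs(r) > lam, 1.0 - lam / (2.0 * np.abs(r)), 0.0)
alpha = div.mean()
g_hat = (g - alpha * r) / (1.0 - alpha)     # merged lines 6-9

def cos_angle(a, b):
    return abs(np.vdot(a, b).real) / (np.linalg.norm(a) * np.linalg.norm(b))

print("pre-correction  |cos|:", cos_angle(r - w0, g - w0))
print("post-correction |cos|:", cos_angle(r - w0, g_hat - w0))
```

The raw denoiser output error is strongly correlated with the input error; after the correction the two error vectors are nearly orthogonal, as the Stein argument predicts.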

3 Numerical experiments

This section illustrates the performance of VDAMP compared with FISTA and SURE-IT and presents empirical evidence for VDAMP’s state evolution.

3.1 Complex soft threshold tuning with SURE

Selecting appropriate regularisation parameters such as $\lambda$ is a notable challenge in real-world compressed sensing applications. We present an approach to parameter-free compressed sensing reconstruction that leverages VDAMP's state evolution by applying Stein's Unbiased Risk Estimate (SURE) [Stein1981EstimationDistribution], building on work on AMP in [Mousavi2013ParameterlessPassing, Guo2015NearPassing].

A strength of automatic parameter tuning via SURE is that it is possible to have a richer regularizer than would usually be feasible for a hand-tuned $\lambda$. In the experiments presented in this section, the denoiser $g$ was the complex soft thresholding operator with a subband-dependent threshold that is tuned automatically with SURE. In other words, SURE with an effective noise model given by $\hat{\tau}_t$ was used to approximately solve

$$g(r_t; \hat{\tau}_t) = \arg\min_w \; \frac{1}{2}\bigl\|(r_t - w) \oslash \sqrt{\hat{\tau}_t}\bigr\|_2^2 + \|\lambda \odot w\|_1, \qquad (18)$$

where $\oslash$ denotes entry-wise division, $\sqrt{\hat{\tau}_t}$ is the entry-wise square root of $\hat{\tau}_t$ and $\lambda$ has the structure of Eq. 10.

Equation 18 was solved using a procedure related to SureShrink [Donoho1995AdaptingShrinkage] but for Gaussian noise that is complex and colored [Johnstone1997WaveletNoise]. Consider a vector corrupted by white complex Gaussian noise: $u = v + \mathcal{CN}(0, \tau I_N)$. Let $h(u)$ be an estimator of $v$. SURE for complex variables has the form

$$\mathrm{cSURE}\bigl(h(u); \tau\bigr) = \bigl\langle |h(u) - u|^2 \bigr\rangle - \tau + 2\tau \left\langle \frac{\partial h(u)}{\partial u} \right\rangle, \qquad (19)$$

where $\langle\cdot\rangle$ is the empirical averaging operator. cSURE is an unbiased estimate of the risk:

$$\mathbb{E}\bigl[\mathrm{cSURE}\bigl(h(u); \tau\bigr)\bigr] = \frac{1}{N}\,\mathbb{E}\bigl[\|h(u) - v\|_2^2\bigr]. \qquad (20)$$

Consider the case where $h$ is the complex soft thresholding operator with threshold $\lambda$, which acts component-wise as

$$\eta(u_i; \lambda) = \frac{u_i}{|u_i|}\max\bigl(|u_i| - \lambda, 0\bigr). \qquad (21)$$

In this case, Eq. 19 is

$$\mathrm{cSURE}\bigl(\eta(u;\lambda); \tau\bigr) = \bigl\langle \min(|u|, \lambda)^2 \bigr\rangle - \tau + 2\tau \left\langle \left(1 - \frac{\lambda}{2|u|}\right)\mathbb{1}\{|u| > \lambda\} \right\rangle. \qquad (22)$$

In the following experiments the optimal threshold for each subband was estimated with

$$\hat{\lambda}_b = \arg\min_{\lambda}\; \mathrm{cSURE}\bigl(\eta(u_b; \lambda); \tau_b\bigr) \qquad (23)$$

by evaluating Eq. 22 for a set of trial thresholds. For large dimension $N_b$ one would expect by the law of large numbers that cSURE is close to the true risk, and the threshold to be almost optimal. Since a larger number of decomposition scales $s$ gives subbands with lower dimensionality, there is a trade-off between the size of $s$ and the quality of threshold selection with cSURE.

SURE has previously been employed for parameter-free compressed sensing MRI in [Khare2012AcceleratedParameters], where the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) [Beck2009AProblems] was used with Eq. 18 in place of the usual shrinkage step. This algorithm is herein referred to as SURE-IT. The iteration-by-iteration error of SURE-IT is highly non-Gaussian, so it deviates from a proper theoretical basis for using SURE and does not give near-optimal threshold selection.
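The unbiasedness of cSURE for complex soft thresholding can be checked by comparing it against the true risk on synthetic data. In this sketch (our own construction; the sparse ground truth and noise level are arbitrary), the divergence term uses the half-sum of real and imaginary partial derivatives of the thresholding operator:

```python
import numpy as np

rng = np.random.default_rng(7)

N, tau = 100000, 0.3
v = np.where(rng.random(N) < 0.15, 2.0 * rng.standard_normal(N), 0.0)
u = v + np.sqrt(tau / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

def soft(u, lam):
    mag = np.abs(u)
    return np.maximum(1.0 - lam / np.maximum(mag, 1e-300), 0.0) * u

def csure_soft(u, lam, tau):
    """cSURE of Eq. 19 specialised to complex soft thresholding (Eq. 21)."""
    mag = np.abs(u)
    residual_sq = np.minimum(mag, lam) ** 2            # |eta(u) - u|^2, entry-wise
    div = np.where(mag > lam, 1.0 - lam / (2.0 * mag), 0.0)
    return float(residual_sq.mean() - tau + 2.0 * tau * div.mean())

for lam in [0.2, 0.5, 1.0]:
    true_risk = float(np.mean(np.abs(soft(u, lam) - v) ** 2))
    print(f"lam={lam}: cSURE={csure_soft(u, lam, tau):.4f}, "
          f"true risk={true_risk:.4f}")
```

cSURE tracks the true risk without access to the ground truth $v$, which is what makes sweeping Eq. 22 over trial thresholds a practical parameter-free tuning rule.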

3.2 Updating $\alpha_t$

In the experiments in this section two possibilities for the update of $\alpha_t$ in line 8 of Algorithm 1 are considered. First,

$$\alpha_t = \left\langle \frac{\partial g(r_t; \hat{\tau}_t)}{\partial r_t} \right\rangle. \qquad (24)$$

In this case lines 6-9 of Algorithm 1 are a colored version of the 'denoising' phase of VAMP when written in LMMSE form (see [Rangan2019VectorPassing], Algorithm 3). VDAMP with $\alpha_t$ updated with Eq. 24 is herein referred to as VDAMP-$\alpha$.

Secondly, as in [Xue2016D-OAMP:Sensing], we suggest using SURE for a second time to estimate the $\alpha_t$ that minimises the mean squared error of $\hat{w}_{t+1}$:

$$\alpha_t^* = \arg\min_{\alpha_t}\; \mathrm{cSURE}\bigl(\hat{g}(r_t, \hat{\tau}_t); \hat{\tau}_t\bigr). \qquad (25)$$

Since $\hat{g}$ depends on $\alpha_t$ only through its per-subband entries, optimizing Eq. 19 reduces to a series of scalar minimisation problems with a closed-form solution, so that for the $b$th subband

$$\alpha_b^* = 1 - \frac{\bigl\langle |g(r_t; \hat{\tau}_t) - r_t|^2 \bigr\rangle_b}{\hat{\tau}_{t,b}\bigl(1 - \langle \partial g/\partial r \rangle_b\bigr)}, \qquad (26)$$

where $\langle\cdot\rangle_b$ denotes the empirical average over subband $b$. Vector $\alpha_t$ is formed from the $\alpha_b^*$ so that it has the structure of Eq. 10. VDAMP with $\alpha_t$ updated with Eq. 26 is herein referred to as VDAMP-S.

3.3 Experimental method

We considered the reconstruction of the Shepp-Logan shown in Fig. 1. The undersampled data was artificially corrupted with complex Gaussian noise. We assumed that $\sigma$ was known a priori. A Haar wavelet at $s = 4$ decomposition scales was used. All experiments were conducted in MATLAB 9.4 and can be reproduced with publicly available code.

(a) Uniform sampling
(b) Two-level sampling
(c) Polynomial sampling
Figure 3: Sampling set $\Omega$ for uniform, two-level and polynomial sampling. Low frequencies are at the centre of the image, and a white pixel indicates that the Fourier coefficient was sampled.

3.4 Comparison with FISTA and SURE-IT

To establish comparative reconstruction quality and convergence speed, VDAMP-$\alpha$ and VDAMP-S were compared with FISTA and SURE-IT. The scalar sparse weighting $\lambda$ of FISTA was tuned with an exhaustive search so that the mean-squared error of its final estimate was minimised. FISTA with a fine-tuned subband-dependent $\lambda$ was not considered as it is not feasible in practice. For SURE-IT the mean-squared error estimate was updated using the ground truth, and Eq. 18 with this estimate was employed as the denoiser. All algorithms were initialised with a vector of zeros.

Three variable density sampling schemes were considered:

  1. Uniform sampling, where $p_j$ is identical for all $j$.

  2. Two-level sampling, where $p_j = 1$ for a fully sampled central region and $p_j$ is constant elsewhere.

  3. Polynomial sampling, generated using the variable density sampling function from the Sparse MRI package.

A visualization of an instance of $\Omega$ in each case is shown in Fig. 3.

(a) Uniform
(b) Two-level
(c) Polynomial
Figure 4: NMSE of FISTA, SURE-IT, VDAMP-$\alpha$ and VDAMP-S for the three variable density undersampling schemes shown in Fig. 3. The NMSE at $t = 0$ differs between algorithms, as the image estimate at $t = 0$ is defined to be the estimate after the first thresholding is applied.

Fig. 4 shows the normalized mean-squared error (NMSE) versus iteration number for each algorithm for all three sampling sets. VDAMP-$\alpha$ and VDAMP-S were run for 50 iterations, and FISTA and SURE-IT were run for 1000 iterations, except in the uniformly sampled case, where FISTA was run for 10000 iterations. All NMSE values were calculated after an additional data consistency step, as in line 11 of Algorithm 1.

In all cases VDAMP-$\alpha$ and VDAMP-S converge within 50 iterations, illustrating the benefit of iteration-dependent, near-optimal regularization that is made feasible by SURE. The comparatively poor performance of SURE-IT highlights the need for zero-mean Gaussian effective noise for effective automatic parameter tuning.

Figure 5: Uniformly undersampled reconstruction with (b) FISTA (NMSE = -11.9dB), (c) SURE-IT (NMSE = -16.4dB), (d) VDAMP-$\alpha$ (NMSE = -41.0dB) and (e) VDAMP-S (NMSE = -41.3dB). In all cases the absolute value is shown. For visualization, (f) shows the logarithm of the ground truth Fourier coefficients. Figures (g)-(j) show the entry-wise absolute difference from the ground truth and are represented on the same scale. Note the visible undersampling artefacts of FISTA and SURE-IT. The VDAMP errors are biased because of the additional gradient step in line 11 of Algorithm 1.

VDAMP-$\alpha$ and VDAMP-S's reduction in the required number of iterations is particularly significant in the uniformly sampled case, Fig. 4(a), where FISTA did not converge even after 10000 iterations. This reflects the highly colored aliasing that arises when the sampling distribution does not compensate for the power spectrum of the signal, and highlights the benefit of modeling the aliasing as colored. Fig. 5 shows the unbiased estimate $\tilde{x}$, the ground truth Fourier coefficients, and the FISTA, SURE-IT, VDAMP-$\alpha$ and VDAMP-S reconstructions for uniform sampling, with entry-wise errors.

For two-level and polynomial sampling the variable density distribution partially compensates for the power spectrum of the signal, so the aliasing is less strongly colored. Figs. 4(b) and 4(c) show that this leads to a less substantial but still significant improvement in convergence speed for VDAMP-$\alpha$ and VDAMP-S. There is an NMSE improvement of VDAMP-S over VDAMP-$\alpha$, which indicates that VDAMP-S's $\alpha_t$ update successfully leads to a $\hat{w}_{t+1}$ with lower NMSE. The performance of VDAMP-$\alpha$ in these figures also highlights that VDAMP can converge to an estimate with a higher NMSE than FISTA, despite the richer regularization. This is due to the $P^{-1}$ scaling in the gradient step, line 4 of Algorithm 1, which effectively increases the measurement noise for coefficients sampled with low probability.

In our implementation the mean per-iteration compute time over all experiments was 0.15s for FISTA, 0.16s for SURE-IT, 0.18s for VDAMP-$\alpha$ and 0.20s for VDAMP-S. The small additional computational cost of SURE-IT over FISTA is due to the calculations required for threshold selection with SURE. The further per-iteration cost of VDAMP is due to the inner products required for the update in line 7 of Algorithm 1, and in the case of VDAMP-S, the inner products required for updating $\alpha_t$ by Eq. 26. Despite the additional per-iteration computational cost, the effectiveness of the denoiser significantly reduces the required computation time. For instance, the time required to reach an NMSE of -35dB in Fig. 4(a) was 406s for FISTA, 105s for SURE-IT, 3.6s for VDAMP-$\alpha$ and 3.4s for VDAMP-S, and in Fig. 4(b) was 12s for FISTA, 11s for SURE-IT, 2.5s for VDAMP-$\alpha$ and 2.0s for VDAMP-S.

3.5 Empirical evidence of state evolution

Figure 6: The magnitude of the effective noise $r_t - w_0$ of VDAMP-S for a Shepp-Logan undersampled with the polynomial mask of Fig. 3(c), shown at a selection of iterations $t$, where the colormaps are on the same scale. The logarithm is not shown, unlike Fig. 2(c).
Figure 7: Normalized quantile-quantile plots against a Gaussian for three subbands of VDAMP-S's effective noise for polynomial sampling, shown in blue, with points along a straight line in red. Scale 1 is the finest and scale 4 is the coarsest. The real part is plotted in the top and bottom rows and the imaginary part is plotted in the middle row. Linearity of the blue points indicates that the data comes from a Gaussian distribution, and the decreasing gradient shows that the variance decreases with increasing $t$. Finite dimensional effects causing small deviations from an exact Gaussian are more apparent at coarse scales, where the dimension is smaller.

This section presents empirical evidence that VDAMP obeys the colored state evolution given by Eq. 7, using the illustrative case of VDAMP-S with polynomial sampling, as in Fig. 3(c). In Fig. 6, the effective noise $r_t - w_0$ is shown at several iterations, indicating that the unbiased, subband-dependent effective noise structure shown at $t = 0$ in Fig. 2 is preserved. Fig. 7 shows quantile-quantile plots against a Gaussian of three illustrative subbands of $r_t - w_0$: the diagonal detail at scale 1, the horizontal detail at scale 2 and the vertical detail at scale 4, where scale 1 is the finest and scale 4 is the coarsest. The linearity of the blue points provides evidence that the effective noise is Gaussian, so that VDAMP has the effective noise given by Eqs. 7 and 8.

4 Conclusions

Based on the observation that Fourier sampling from a non-uniform spectral density leads to colored aliasing, we propose VDAMP, an algorithm based on OAMP that obeys a colored state evolution. State evolution provides an effective way to tune model parameters via SURE, implying that a single algorithm can be used for an arbitrary variable density scheme without the need for manual adjustment. More degrees of freedom are feasibly allowed in the model, enabling higher order prior information such as anisotropy, variability across scales and structured sparsity, without the need to estimate the structure a priori as in model-based compressed sensing.


VDAMP was motivated by the application of compressed sensing to accelerated MRI. Developments are required for VDAMP to be applicable to MRI data, which is acquired across a number of coils, each of which possesses a sensitivity profile [Liang2009AcceleratingSensing, Pruessmann1999SENSE:MRI.]. Further developments are required for VDAMP to be applicable to Fourier sampling with 1D readout curves, where elements of $\Omega$ are generated dependently.

It is known that the state evolution of OAMP holds for a wide range of denoisers [Metzler2016FromSensing, Xue2016D-OAMP:Sensing]. In [Ahmad2019PlugImaging], a number of standard compressed sensing algorithms that leverage image denoisers designed for Gaussian noise were shown to perform well on MRI reconstruction tasks, despite the mismatch between the aliasing and its model. A sophisticated denoiser equipped to deal with wavelet coefficients corrupted with known colored Gaussian noise would be expected to perform well in conjunction with VDAMP.