Infusing Learned Priors into Model-Based Multispectral Imaging

09/20/2019 ∙ by Jiaming Liu, et al. ∙ Washington University in St Louis 0

We introduce a new algorithm for regularized reconstruction of multispectral (MS) images from noisy linear measurements. Unlike traditional approaches, the proposed algorithm regularizes the recovery problem by using a prior specified only through a learned denoising function. More specifically, we propose a new accelerated gradient method (AGM) variant of regularization by denoising (RED) for model-based MS image reconstruction. The key ingredient of our approach is the three-dimensional (3D) deep neural net (DNN) denoiser that can fully leverage spationspectral correlations within MS images. Our results suggest the generalizability of our MS-RED algorithm, where a single trained DNN can be used to solve several different MS imaging problems.



There are no comments yet.


page 2

page 6

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Multispectral (MS) imaging systems acquire the response from an object or a scene over a wide range of frequency bands, including optical, infra-red, and short-wave infra-red [1, 2]

. Multi-band spectra provide rich information for detecting and classifying materials, especially those that have similar visible colors. Additionally, the higher transmission property of infra-red bands, when compared to optical bands, makes the MS imaging beneficial in imaging through haze or fog. As a result, multispectral or even hyperspectral imaging techniques have gained popularity in a number of important applications in astronomy, agriculture, and medical imaging 

[3, 4, 5, 6].

While MS imaging has a great potential, acquiring and processing corresponding data is challenging due to various hardware limitations and the high-dimensional nature of typical datasets. In particular, the resolution and signal-to-noise ratio (SNR) of MS images is often constrained by limitations on the size, weight, and power of sensors used for data acquisition. Image reconstruction methods have been developed for mitigating such hardware limitations by using advanced priors on the unknown MS image. Traditional priors used in MS image reconstruction include sparsity-based regularizers, low-rank models, and dictionary-learning-based priors [7, 8, 9, 10, 11]. Recently, however, the focus in the field has been shifting towards deep learning techniques that are based on learning the direct mapping from the measurements to the recovered MS image [12, 13, 14, 15]. However, the conceptual simplicity of end-to-end learning comes with the loss of modularity, characterized by the lack of an explicit separation between the measurement model and the prior. This limits the generalization and reuse of previously trained models.

In this paper, we develop a new MS image reconstruction method called multispectral regularization by denoising (MS-RED) that enables the infusion of deep neural net (DNN) priors while maintaining an explicit separation between the prior and the measurement model. By building on the recently developed RED framework [16, 17, 18, 19], our method specifies the learned prior only via a 3D spatiospectral denoising function, trained for the removal of additive white Gaussian noise (AWGN) from MS images. MS-RED naturally leverages both spatial and spectral correlations in the data, while also explicitly enforcing fidelity to the measured data. We discuss the convergence of MS-RED under a set of transparent assumptions on the data-fidelity and the denoiser, and develop its fast variant based on the accelerated gradient method (AGM). We finally demonstrate our MS-RED algortihm on MS superresolution using several denoising priors, including those based on DNNs. Our results illustrate that a single trained DNN denoiser can provide a state-of-the-art performance while solving multiple inverse problems without additional retraining.

Figure 1: We propose a new multispectral regularization by denoising (MS-RED) algorithm for recovering a spatiospectral data-cube from its noisy measurements. Our algorithm solves this ill-posed problem by combining the measurement model with a learned spatiospectral denoiser that maximally exploits all the available prior knowledge.

2 Proposed Method

2.1 Inverse Problem Formulation

We consider the problem of acquiring a MS data cube, consisting of two spatial dimensions and one spectral dimension. The cube is represented in a vectorized form as

. We additionally model the MS image acquisition process as


where represents the acquisition model, is a collection of measurements generated by our imaging system (where potentially ) and is noise.

Since the recovery of from in (1) is ill-posed, it is common to formulate it as a regularized optimization problem


where is the data-fidelity term that accounts for the measurement model and is the regularizer that infuses the prior information. For example, when the noise in (1) is AWGN, one often sets . On the other hand, sparsity-based regularization is obtained by setting , where is a some spatiospectral transform and is the regularization parameter [7, 11].

1:input: , , and
2:for  do
4:   where
6:end for
Algorithm 1
1:input: , , , and
2:for  do
4:   where
6:end for
Algorithm 2 -

Accelerated gradient method (AGM), summarized in Algorithm 1, is a well-studied iterative method, effective in large-scale setting, for solving (2) when is a differentiable function [20, 21]. It can be further extended to circumvent the need to differentiate by using proximal operators [22, 23, 24]. Note that the switch between the standard gradient method (GM) and AGM in Algorithm 1 is controlled via the sequence . When for all , we obtain GM with convergence rate; when


the algorithms becomes AGM with convergence. Here, we extend the traditional GM/AGM approach by including a learned denoising function that characterizes a MS image prior.

2.2 Multispectral Regularization by Denoising

In the spirit of AGM, we consider Algorithm 2

, which iteratively estimates

by combining the usual gradient of data-fidelity term with an operator , where is the regularization parameter and is a spactiospectral denoiser. When the algorithm converges, it converges to the vectors in the zero set of the operator

Consider the following two sets

where is the set of all critical points of the data-fidelity and is the set of all fixed points of the denoiser. Intuitively, the fixed points of correspond to all the vectors that are not denoised, and therefore can be interpreted as vectors that are noise-free according to the denoiser.

Note that if , then and is one of the solutions of RED. Hence, any vector that is consistent with the data for a convex and noiseless according to is in the solution set. On the other hand, when , then corresponds to a tradeoff between the two sets, explicitly controlled via . This explicit control in the strength of regularization is one of the key differences between RED and an alternative plug-and-play priors (PnP) framework [25, 26, 27].

The key benefits of our MS-RED approach is in its explicit separation of the measurement model from the prior, its ability to accommodate powerful spatiospectral denoisers (such as the ones based on DNNs) without differentiating them, and their excellent recovery performance as corroborated by our simulations in Section

2.3 Convergence Analysis

Under some conditions on the denoiser, it is possible to relate to some explicit regularization function . For example, when the denoiser is locally homogeneous and has a symmetric Jacobian [16, 17], the operator corresponds to the gradient of


An alternative justification of RED for another explicit function is possible by considering minimum mean squared error (MMSE) denoisers [28, 17]. In this paper, we perform a fixed point analysis of MS-RED for denoisers that do not necessarily correspond to any regularizer . Our analysis complements the analysis in [17] by developing an explicit worst-case convergence rate. We require three assumptions that together serve as sufficient conditions for convergence.

Assumption 1.

The operator is such that . There is a finite number such that the distance of the initial to the farthest element of is bounded, that is

This assumption is related to the existence of the minimizers in the traditional optimization literature [20]. The next two assumptions rely on Lipschitz continuity of the data-fidelity and the denoiser.

Assumption 2.

The function is smooth and convex. It has a Lipschitz continuous gradient with constant .

This is a standard assumption used in gradient-based methods, including for the GM and its proximal extensions [20].

Assumption 3.

The denoiser is Lipschitz continuous with constant .

Since the proximal operator is nonexpansive [29], it automatically satisfies this assumption. Spectral-normalization can be used to make DNNs satisfy this assumption [30].

Theorem 1.

Run MS-RED for iterations under Assumptions 1-3 using a fixed step-size and with . Then, we have


A proof of the theorem is provided in Section 5. Theorem 1 establishes that the iterates of MS-RED can get arbitrarily close to with rate. The proof relies on the monotone operator theory [31, 32], widely used in the context of convex optimization [29].

The analysis in Theorem 1 only provides sufficient conditions for the convergence of MS-RED. As corroborated by our numerical studies in Section, the actual convergence of MS-RED is more general. It often holds with the Nesterov’s acceleration in eq. (3) and beyond nonexpansive denoisers. This suggests that some denoisers might be locally nonexpansive over the set of input vectors used in testing. On the other hand, DNNs can be made globally nonexpansive via spectral-normalization [30], which results in provable convergence.

Figure 2: Illustration of the convergence of MS-RED for three supperresolution factors, namely , , and . The normalized distance to is averaged over all test images and plotted against the iteration number. The DNN used in this experiment is the residual 3D-DnCNN with Lipschitz constant LC = 2. We observed the general stability of MS-RED for our denoiser across all simulations.

3 Numerical Experiments

In this section, we provide several simulations validating our theoretical analysis on multispectral image supper-resolution (MSISR). Generally, the low-resolution (LR) image can be modeled by a blurring and subsequent down-sampling operation on a high resolution one. We test several image denoisers, including 3DTV, BM3D, and our own DnCNN on MS images from TokyoTech database [33]. Originally, those images are of size 50050031 with the wavelengths in the range of 420720 nm at an interval of 10nm. However, for convenience, we restrict our simulations to 6 of 31 bands to build 500500

6 tensors. We implement two typical image degradation settings for MSISR, namely blurring by Gaussian kernel of size

with and blurring by motion kernel of size  [34], both followed by down-sampling with scale factors 2, 3 and 4. For both kernels, the measurements are corrupted with AWGN corresponding to input signal-to-noise ratio (SNR) of 40 dB.

To take advantage of the ability of DNNs to automatically learn the prior information on a specific image set, we train a residual DNN denoiser denoted as DnCNN, which is a simplified version of the popular DnCNN denoiser [35]. DnCNN

consists of seven layers with three different blocks. The first block is a composite convolutional layer, consisting of a normal convolutional layer and a rectified linear units (ReLU) layer. It convolves the

input to features maps by using 64 filters of size . The second block is a sequence of 5 composite convolutional layers, each having 64 filters of size . Those composite layers further processes the feature maps generated by the first part. The third block of the network, a single convolutional layer, generates the final output image with six channels by convolving the feature maps with a

filter. Every convolution is performed with a stride

, so that the intermediate feature maps share the same spatial size of the input image. Moreover, the global Lipschitz constant of the DNN is controlled to be LC = 2 via spectral-normalization  [30], which provides a necessary condition for nonexpansiveness of the denoiser (i.e., for ). We choose four MS images for testing, namely Butterfly, Fan, Cloth, and Doll. Three MS images are used for validation, and the rest twenty eight MS images are used for training. We further expand the training set by adopting patch-wise training and using standard data augmentation strategies. We train three variants of DnCNN for the removal of AWGN, each using a separate noise level from . For each experiment, we select the denoiser providing the highest SNR performance. The BM3D denoiser was applied in a separable fashion along each MS band. The parameter of BM3D was also fine-tuned for each experiment from the set .

Fig. 2 illustrates the convergence of MS-RED by plotting the per-iteration normalized distance to averaged over all the test images. Table 1 summarizes the SNR values for different denoisers averaged over all the images. Fig. 3 visually illustrates the performance by forming pseudo-color images (composed of 22nd, 2nd, and 17th bands) of the results with the scale factor 2 (top) and 3 (bottom). The RED-DnCNN achieves the best SNR performance while also delivering reliable convergence across all instances.

Kernel Scale 3DTV BM3D DnCNN*
24.01 24.67 25.20
22.74 23.33 23.43
20.41 20.94 21.03
Motion blur
23.30 24.49 25.21
20.17 20.85 21.01
18.86 19.34 19.43
Table 1:

Summary of average SNR performance of different methods for multispectral image super-resolution.

Figure 3: Visual illustration of the reconstructed Doll and Butterfly images from TokyoTech dataset, obtained using MS-RED with 3DTV, BM3D, and denoisers. The DnCNN denoiser achieves the best performance by maximally exploiting spatiospectral features in the data.

4 Conclusion

We have presented the MS-RED algorithms for reconstructing MS images from their noisy measurements. The algorithm relies on the integration of a 3D spatiospectral denoiser with the measurement model for maximally exploiting all the available information. In the case of convex data-fidelity terms and nonexpansive denoisers, MS-RED is provably convergent with the rate . However, our experiments suggest that the convergence of MS-RED extends to AGM and beyond nonexpansive denoisers. Therefore, MS-RED is applicable to a large class of MS problems and our future work will include possible extensions to different inverse problems.

5 Proof of the Theorem

A fixed-point convergence of averaged operators is well-known under the name of Krasnosel’skii-Mann theorem (see Section 5.2 in [31]) and was recently applied to the analysis of PnP [27] and several RED algorithms in [17, 19]. Our analysis here extends these results to MS-RED by providing explicit worst-case convergence rates.

In the proof below, we use the shorthand notation and . We consider the following operators

and proceed in several steps by relying on fundamental results from monotone operator theory (Section 5.2 in [31]).

  • [leftmargin=*]

  • Since is -Lipschitz continuous, we know that it is -cocoercive. Hence, the operator is nonexpansive.

  • From the definition of and the fact that is nonexpansive, we know that is nonexpansive.

  • We know that a convex combination of nonexpansive operators is also nonexpansive, hence we conclude that


    is nonexpansive. This implies that that is -cocoercive.

  • Consider any and a single iteration of MS-RED . We then have


    where we used , the cocoercivity of , and the fact that .

  • By rearranging the terms and averaging over iterations, we obtain

  • From eq. (6), we know that the operation

    is nonexpansive for any . Hence, by using this nonexpansiveness, we conclude that

    which together with (8) directly leads to the result in (5).


  • [1] R. M. Willett, M. F. Duarte, M. A. Davenport, and R. G. Baraniuk, “Sparsity and structure in hyperspectral imaging,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 116–126, Jan. 2014.
  • [2] G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. S. Kittle, “Compressive coded aperture spectral imaging,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 105–115, Jan. 2014.
  • [3] J. Solomon and B. Rock, “Imaging spectrometry for earch remote sensing,” Science, vol. 228, no. 4704, pp. 1147–1152, 1985.
  • [4] Y.-R. Chen, K. Chao, and M. S. Kim, “Machine vision technology for agricultural applications,” Comput. Electron. Agr., vol. 36, no. 2-3, pp. 173–191, Nov. 2002.
  • [5] E. Hege, D. O’Connell, W. Johnson, S. Basty, and E. Dereniak, “Hyperspectral imaging for astronomy and space surviellance,” Proc. Imaging Spectrometry IX, vol. 5159, pp. 380–391, 2004.
  • [6] G. Lu and B. Fei, “Medical hyperspectral imaging: a review,” J. Biomed. Opt., vol. 19, no. 1, 2014.
  • [7] H. Othman and S.-E. Qian, “Noise reduction of hyperspectral imagery using hybrid spatial-spectral derivative-domain wavelet shrinkage,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 2, pp. 397–408, Feb. 2006.
  • [8] A. S. Charles, B. A. Olshausen, and C. J. Rozell, “Learning sparse codes for hyperspectral imagery,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 5, pp. 963–978, Sept. 2011.
  • [9] C. Lanaras, E. Baltsavias, and K. Schindler, “Hyperspectral super-resolution by coupled spectral unmixing,” in Proc. IEEE Int. Conf. Comp. Vis. (ICCV), Santiago, Chile, Dec. 2015, pp. 3586–3594.
  • [10] C. I. Kanatsoulis, X. Fu, N. D. Sidiropoulos, and W.-K. Ma, “Hyperspectral super-resolution: A coupled tensor factorization approach,” IEEE Trans Signal Process, vol. 66, no. 24, pp. 6503–6517, Dec. 2018.
  • [11] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X.-L. Zhao, and D. Meng, “Hyperspectral image restoration via total variation regularized low-rank tensor decomposition,” IEEE J. Sel. Top. Appl. Earth. Obs. Remote Sens., vol. 11, no. 4, pp. 1227–1243, Apr. 2018.
  • [12] J. Hu, Y. Li, X. Zhao, and W. Xie, “A spatial constraint and deep learning based hyperspectral image super-resolution method,” in Proc. IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, July 2017, pp. 5129–5132.
  • [13] Z. Xiong, Z. Shi, H. Li, L. Wang, D. Liu, and F. Wu, “HSCNN: CNN-based hyperspectral image recovery from spectrally undersampled projections,” in Proc. IEEE Int. Conf. Comp. Vis. Workshops (ICCV WKSHP), Venice, Italy, Oct. 2017, pp. 518–525.
  • [14] B. Wen, U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos, “DeepCASD: An end-to-end approach for multi-spectral image super-resolution,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP 2018), Calgary, Canada, March 15-20, 2018, pp. 6503–6507.
  • [15] W. Liu and J. Lee,

    “An efficient residual learning neural network for hyperspectral image superresolution,”

    IEEE J. Sel. Top. Appl. Earth. Obs. Remote. Sens., vol. 12, no. 4, pp. 1240–1253, Apr. 2019.
  • [16] Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1804–1844, 2017.
  • [17] E. T. Reehorst and P. Schniter, “Regularization by denoising: Clarifications and new interpretations,” IEEE Trans. Comput. Imag., vol. 5, no. 1, pp. 52–67, Mar. 2019.
  • [18] G. Mataev and P. Elad, M. Milanfar, “DeepRED: Deep image prior powered by RED,” 2019, arXiv:1903.10176 [cs.CV].
  • [19] Y. Sun, J. Liu, and U. S. Kamilov, “Block coordinate regularization by denoising,” May 2019, arXiv:1905.05113.
  • [20] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, 2004.
  • [21] A. Beck and M. Teboulle, Convex Optimization in Signal Processing and Communications, chapter Gradient-Based Algorithms with Applications to Signal Recovery Problems, pp. 42–88, Cambridge, 2009.
  • [22] J. M. Bioucas-Dias and M. A. T. Figueiredo, “A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, December 2007.
  • [23] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
  • [24] U. S. Kamilov, “A parallel proximal algorithm for anisotropic total variation minimization,” IEEE Trans. Image Process., vol. 26, no. 2, pp. 539–548, February 2017.
  • [25] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Proc. IEEE Global Conf. Signal Process. and INf. Process. (GlobalSIP), Austin, TX, USA, December 3-5, 2013, pp. 945–948.
  • [26] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman,

    “Plug-and-play priors for bright field electron tomography and sparse interpolation,”

    IEEE Trans. Comp. Imag., vol. 2, no. 4, pp. 408–423, December 2016.
  • [27] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” IEEE Trans. Comput. Imaging, 2019.
  • [28] S. A. Bigdeli, M. Jin, P. Favaro, and M. Zwicker, “Deep mean-shift priors for image restoration,” in Proc. Advances in Neural Information Processing Systems 31, Long Beach, CA, USA, Dec. 2017.
  • [29] N. Parikh and S. Boyd, “Proximal algorithms,” Foundations and Trends in Optimization, vol. 1, no. 3, pp. 123–231, 2014.
  • [30] H. Sedghi, V. Gupta, and P. M. Long,

    “The singular values of convolutional layers,”

    in International Conference on Learning Representations (ICLR), New Orleans, LA, USA, May 2019.
  • [31] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2 edition, 2017.
  • [32] E. K. Ryu and S. Boyd, “A primer on monotone operator methods,” Appl. Comput. Math., vol. 15, no. 1, pp. 3–43, 2016.
  • [33] Y. Monno, S. Kikuchi, M. Tanaka, and M. Okutomi, “A practical one-shot multispectral imaging system using a single image sensor,” IEEE Transactions on Image Processing., vol. 24, no. 10, pp. 3048–3059, 2015.
  • [34] J. Liu, Y. Sun, X. Xu, and U. S. Kamilov, “Image restoration using total variation regularized deep image prior,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process., Brighton, UK, May 2019, pp. 7715–7719.
  • [35] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, July 2017.