1 Introduction
Multispectral (MS) imaging systems acquire the response from an object or a scene over a wide range of frequency bands, including optical, infrared, and shortwave infrared [1, 2]. Multiband spectra provide rich information for detecting and classifying materials, especially those that have similar visible colors. Additionally, the higher transmission of infrared bands, compared to optical bands, makes MS imaging beneficial for imaging through haze or fog. As a result, multispectral and even hyperspectral imaging techniques have gained popularity in a number of important applications in astronomy, agriculture, and medical imaging
[3, 4, 5, 6]. While MS imaging has great potential, acquiring and processing the corresponding data is challenging due to various hardware limitations and the high-dimensional nature of typical datasets. In particular, the resolution and signal-to-noise ratio (SNR) of MS images are often constrained by limitations on the size, weight, and power of the sensors used for data acquisition. Image reconstruction methods have been developed for mitigating such hardware limitations by using advanced priors on the unknown MS image. Traditional priors used in MS image reconstruction include sparsity-based regularizers, low-rank models, and dictionary-learning-based priors [7, 8, 9, 10, 11]. Recently, the focus in the field has been shifting towards deep learning techniques that learn a direct mapping from the measurements to the recovered MS image [12, 13, 14, 15]. However, the conceptual simplicity of end-to-end learning comes with a loss of modularity, characterized by the lack of an explicit separation between the measurement model and the prior. This limits the generalization and reuse of previously trained models.
In this paper, we develop a new MS image reconstruction method called multispectral regularization by denoising (MS-RED) that enables the infusion of deep neural net (DNN) priors while maintaining an explicit separation between the prior and the measurement model. Building on the recently developed RED framework [16, 17, 18, 19], our method specifies the learned prior only via a 3D spatiospectral denoising function, trained for the removal of additive white Gaussian noise (AWGN) from MS images. MS-RED naturally leverages both spatial and spectral correlations in the data, while also explicitly enforcing fidelity to the measured data. We discuss the convergence of MS-RED under a set of transparent assumptions on the data-fidelity term and the denoiser, and develop a fast variant based on the accelerated gradient method (AGM). We finally demonstrate our MS-RED algorithm on MS super-resolution using several denoising priors, including those based on DNNs. Our results illustrate that a single trained DNN denoiser can provide state-of-the-art performance while solving multiple inverse problems without additional retraining.
2 Proposed Method
2.1 Inverse Problem Formulation
We consider the problem of acquiring an MS data cube, consisting of two spatial dimensions and one spectral dimension. The cube is represented in a vectorized form as x ∈ R^n. We additionally model the MS image acquisition process as

(1) y = Ax + e,

where A ∈ R^{m×n} represents the acquisition model, y ∈ R^m is a collection of measurements generated by our imaging system (where potentially m < n), and e ∈ R^m is noise.
Since the recovery of x from y in (1) is ill-posed, it is common to formulate it as a regularized optimization problem

(2) x̂ = argmin_{x ∈ R^n} f(x), with f(x) = g(x) + h(x),

where g is the data-fidelity term that accounts for the measurement model and h is the regularizer that infuses the prior information. For example, when the noise in (1) is AWGN, one often sets g(x) = (1/2)||y − Ax||². On the other hand, sparsity-based regularization is obtained by setting h(x) = τ||Wx||₁, where W is some spatiospectral transform and τ > 0 is the regularization parameter [7, 11].
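With the least-squares data-fidelity above, the gradient used by the iterative methods below is ∇g(x) = Aᵀ(Ax − y). A minimal numpy sketch, where the small random A and y are stand-ins for a real acquisition model:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 50                      # fewer measurements than unknowns (m < n)
A = rng.standard_normal((m, n))    # stand-in acquisition model
y = rng.standard_normal(m)         # stand-in measurements

def g(x):
    """Least-squares data-fidelity g(x) = 0.5 * ||y - A x||^2."""
    r = y - A @ x
    return 0.5 * r @ r

def grad_g(x):
    """Gradient of g: A^T (A x - y)."""
    return A.T @ (A @ x - y)

# Sanity check: compare one coordinate against a finite difference.
x0 = rng.standard_normal(n)
eps, i = 1e-6, 3
e_i = np.zeros(n)
e_i[i] = 1.0
fd = (g(x0 + eps * e_i) - g(x0 - eps * e_i)) / (2 * eps)
```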
The accelerated gradient method (AGM), summarized in Algorithm 1, is a well-studied iterative method, effective in large-scale settings, for solving (2) when f is a differentiable function [20, 21]. It can be further extended to circumvent the need to differentiate h by using proximal operators [22, 23, 24]. Note that the switch between the standard gradient method (GM) and AGM in Algorithm 1 is controlled via the sequence {q_k}. When q_k = 0 for all k, we obtain GM with an O(1/t) convergence rate; when

(3) q_k = (t_{k−1} − 1)/t_k, with t_k = (1/2)(1 + sqrt(1 + 4t_{k−1}²)) and t_0 = 1,

the algorithm becomes AGM with O(1/t²) convergence. Here, we extend the traditional GM/AGM approach by including a learned denoising function D_σ that characterizes an MS image prior.
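The GM/AGM switch can be sketched as follows on the least-squares data-fidelity alone; the sequence t_k follows the Nesterov recursion in eq. (3), and the random problem instance is an illustrative stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
A = rng.standard_normal((60, n))
y = rng.standard_normal(60)
L = np.linalg.norm(A.T @ A, 2)          # Lipschitz constant of the gradient

def grad_g(x):
    return A.T @ (A @ x - y)

def agm(num_iter, accelerate=True):
    """Gradient method; with accelerate=True, uses Nesterov's sequence (AGM)."""
    x_prev = s = np.zeros(n)
    t_prev = 1.0
    for _ in range(num_iter):
        x = s - (1.0 / L) * grad_g(s)                       # gradient step
        t = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t_prev**2))    # Nesterov sequence
        q = (t_prev - 1.0) / t if accelerate else 0.0       # q_k = 0 gives GM
        s = x + q * (x - x_prev)                            # momentum step
        x_prev, t_prev = x, t
    return x_prev

x_star = np.linalg.lstsq(A, y, rcond=None)[0]
err_gm = np.linalg.norm(agm(200, accelerate=False) - x_star)
err_agm = np.linalg.norm(agm(200, accelerate=True) - x_star)
```

At an equal iteration budget, the accelerated variant is markedly closer to the least-squares solution than plain GM.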
2.2 Multispectral Regularization by Denoising
In the spirit of AGM, we consider Algorithm 2, which iteratively estimates x by combining the usual gradient of the data-fidelity term with an operator

G(x) := ∇g(x) + τ(x − D_σ(x)),

where τ > 0 is the regularization parameter and D_σ is a spatiospectral denoiser. When the algorithm converges, it converges to the vectors in the zero set of the operator, Zer(G) := {x ∈ R^n : G(x) = 0}. Consider the following two sets

Zer(∇g) := {x ∈ R^n : ∇g(x) = 0} and Fix(D_σ) := {x ∈ R^n : D_σ(x) = x},

where Zer(∇g) is the set of all critical points of the data-fidelity term and Fix(D_σ) is the set of all fixed points of the denoiser. Intuitively, the fixed points of D_σ correspond to all the vectors that are not denoised, and can therefore be interpreted as vectors that are noise-free according to the denoiser.
Note that if x* ∈ Zer(∇g) ∩ Fix(D_σ), then G(x*) = 0 and x* is one of the solutions of RED. Hence, any vector that is consistent with the data for a convex g and noiseless according to D_σ is in the solution set. On the other hand, when Zer(∇g) ∩ Fix(D_σ) = ∅, Zer(G) corresponds to a tradeoff between the two sets, explicitly controlled via τ. This explicit control of the strength of regularization is one of the key differences between RED and the alternative plug-and-play priors (PnP) framework [25, 26, 27].
The key benefits of our MS-RED approach are its explicit separation of the measurement model from the prior, its ability to accommodate powerful spatiospectral denoisers (such as the ones based on DNNs) without differentiating them, and its excellent recovery performance, as corroborated by our simulations in Section 3.
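Each unaccelerated MS-RED iteration is simply x^k = x^{k−1} − γ(∇g(x^{k−1}) + τ(x^{k−1} − D(x^{k−1}))). A minimal sketch built entirely from stand-ins: a random least-squares forward model and a toy nonexpansive averaging filter in place of a learned spatiospectral denoiser:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
A = rng.standard_normal((60, n))        # stand-in forward model
y = rng.standard_normal(60)
tau = 0.5
L = np.linalg.norm(A.T @ A, 2)          # Lipschitz constant of grad g
gamma = 1.0 / (L + 4.0 * tau)           # conservative step size

def grad_g(x):
    return A.T @ (A @ x - y)

def denoise(x):
    """Toy nonexpansive denoiser: averages each entry with its neighbors.
    Constant vectors are its fixed points (they are 'noise-free')."""
    return 0.5 * x + 0.25 * (np.roll(x, 1) + np.roll(x, -1))

def red_operator(x):
    """G(x) = grad g(x) + tau * (x - D(x))."""
    return grad_g(x) + tau * (x - denoise(x))

x = np.zeros(n)
for _ in range(2000):
    x = x - gamma * red_operator(x)     # fixed-point iteration toward Zer(G)

residual = np.linalg.norm(red_operator(x))
```

The iterates drive ||G(x)|| toward zero, i.e., the limit balances data consistency against the denoiser's notion of a clean signal.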
2.3 Convergence Analysis
Under some conditions on the denoiser, it is possible to relate G to an explicit regularization function h. For example, when the denoiser is locally homogeneous and has a symmetric Jacobian [16, 17], the operator τ(x − D_σ(x)) corresponds to the gradient of

(4) h(x) = (τ/2) xᵀ(x − D_σ(x)).
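For a linear denoiser D(x) = Wx with a symmetric W, eq. (4) reduces to the quadratic h(x) = (τ/2)xᵀ(I − W)x, whose gradient is exactly τ(x − D(x)). The check below uses a small symmetric W as a stand-in:

```python
import numpy as np

rng = np.random.default_rng(3)
n, tau = 8, 0.7
B = rng.standard_normal((n, n))
W = 0.5 * (B + B.T) / n            # small symmetric "denoiser" matrix (stand-in)

def h(x):
    """RED regularizer for a linear symmetric denoiser: (tau/2) x^T (x - W x)."""
    return 0.5 * tau * x @ (x - W @ x)

def red_term(x):
    """tau * (x - D(x)) with D(x) = W x."""
    return tau * (x - W @ x)

# Finite-difference check that red_term is indeed the gradient of h.
x0 = rng.standard_normal(n)
eps = 1e-6
fd = np.array([(h(x0 + eps * e) - h(x0 - eps * e)) / (2 * eps)
               for e in np.eye(n)])
```

The symmetry of W is essential here; for a nonsymmetric Jacobian, τ(x − D(x)) is generally not a gradient field, which motivates the fixed-point analysis below.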
An alternative justification of RED for another explicit function h is possible by considering minimum mean squared error (MMSE) denoisers [28, 17]. In this paper, we perform a fixed-point analysis of MS-RED for denoisers that do not necessarily correspond to any regularizer h. Our analysis complements the one in [17] by developing an explicit worst-case convergence rate. We require three assumptions that together serve as sufficient conditions for convergence.
Assumption 1.
The operator G is such that Zer(G) ≠ ∅. There is a finite number R₀ > 0 such that the distance of the initial x⁰ to the farthest element of Zer(G) is bounded, that is, ||x⁰ − x*|| ≤ R₀ for all x* ∈ Zer(G).
This assumption is related to the existence of minimizers in the traditional optimization literature [20]. The next two assumptions rely on Lipschitz continuity of the data-fidelity term and the denoiser.
Assumption 2.
The function g is smooth and convex. Its gradient ∇g is Lipschitz continuous with constant L > 0.
This is a standard assumption used in gradient-based methods, including GM and its proximal extensions [20].
Assumption 3.
The denoiser D_σ is Lipschitz continuous with constant λ ≤ 1; that is, it is nonexpansive.
Since the proximal operator is nonexpansive [29], it automatically satisfies this assumption. Spectral normalization can be used to make DNNs satisfy this assumption [30].
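For a fully-connected layer, spectral normalization amounts to dividing the weight matrix by its largest singular value, which can be estimated cheaply by power iteration. A minimal sketch on a stand-in dense-layer weight (convolutional layers require the method of [30]):

```python
import numpy as np

rng = np.random.default_rng(4)
W = rng.standard_normal((64, 128))   # stand-in weight matrix of a dense layer

def spectral_norm(W, num_iter=500):
    """Estimate the largest singular value of W by power iteration."""
    u = rng.standard_normal(W.shape[0])
    for _ in range(num_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    return u @ W @ v                 # Rayleigh-quotient estimate of sigma_max

def spectrally_normalize(W, target=1.0):
    """Rescale W so its spectral norm (Lipschitz constant) is at most target."""
    s = spectral_norm(W)
    return W if s <= target else W * (target / s)

W_sn = spectrally_normalize(W, target=1.0)
```

Since the Lipschitz constant of a composition is bounded by the product of the layer-wise constants, normalizing every layer this way bounds the Lipschitz constant of the whole network.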
Theorem 1. Run MS-RED for t ≥ 1 iterations under Assumptions 1–3 with q_k = 0 for all k and the step size γ = 1/(L + 4τ). Then

(5) min_{1≤k≤t} ||G(x^{k−1})||² ≤ R₀²/(γ²t).
A proof of the theorem is provided in Section 5. Theorem 1 establishes that the iterates of MS-RED can get arbitrarily close to Zer(G) with an O(1/t) rate. The proof relies on monotone operator theory [31, 32], widely used in the context of convex optimization [29].
The analysis in Theorem 1 only provides sufficient conditions for the convergence of MS-RED. As corroborated by our numerical studies in Section 3, the actual convergence of MS-RED is more general: it often holds with Nesterov's acceleration in eq. (3) and beyond nonexpansive denoisers. This suggests that some denoisers might be locally nonexpansive over the set of input vectors used in testing. On the other hand, DNNs can be made globally nonexpansive via spectral normalization [30], which results in provable convergence.
3 Numerical Experiments
In this section, we provide several simulations validating our theoretical analysis on multispectral image super-resolution (MSISR). Generally, the low-resolution (LR) image can be modeled by a blurring and subsequent downsampling operation on a high-resolution one. We test several image denoisers, including 3DTV, BM3D, and our own DnCNN*, on MS images from the TokyoTech database [33]. Originally, those images are of size 500×500×31, with wavelengths in the range of 420–720 nm at an interval of 10 nm. However, for convenience, we restrict our simulations to 6 of the 31 bands to build 500×500×6 tensors. We implement two typical image degradation settings for MSISR, namely blurring with a Gaussian kernel and blurring with a motion kernel [34], both followed by downsampling with scale factors 2, 3, and 4. For both kernels, the measurements are corrupted with AWGN corresponding to an input signal-to-noise ratio (SNR) of 40 dB.

To take advantage of the ability of DNNs to automatically learn the prior information on a specific image set, we train a residual DNN denoiser denoted as DnCNN*, which is a simplified version of the popular DnCNN denoiser [35]. DnCNN* consists of seven layers organized into three blocks. The first block is a composite convolutional layer, consisting of a normal convolutional layer and a rectified linear unit (ReLU) layer; it convolves the input into feature maps by using 64 filters. The second block is a sequence of 5 composite convolutional layers, each having 64 filters; these composite layers further process the feature maps generated by the first block. The third block of the network, a single convolutional layer, generates the final output image with six channels by applying a final bank of filters to the feature maps. Every convolution is performed with a stride of one, so that the intermediate feature maps share the spatial size of the input image. Moreover, the global Lipschitz constant of the residual mapping is controlled to be 2 via spectral normalization [30], which provides a necessary condition for nonexpansiveness of the denoiser (i.e., for λ ≤ 1). We choose four MS images for testing, namely Butterfly, Fan, Cloth, and Doll. Three MS images are used for validation, and the remaining twenty-eight MS images are used for training. We further expand the training set by adopting patch-wise training and using standard data augmentation strategies. We train three variants of DnCNN* for the removal of AWGN, each using a separate noise level. For each experiment, we select the denoiser providing the highest SNR performance. The BM3D denoiser was applied in a separable fashion along each MS band, and its parameter σ was also fine-tuned for each experiment.

Fig. 2 illustrates the convergence of MS-RED by plotting the per-iteration normalized distance to the fixed point, averaged over all the test images. Table 1 summarizes the SNR values for different denoisers, averaged over all the images. Fig. 3 visually illustrates the performance by forming pseudo-color images (composed of the 22nd, 2nd, and 17th bands) of the results with the scale factor 2 (top) and 3 (bottom). RED with the DnCNN* denoiser achieves the best SNR performance while also delivering reliable convergence across all instances.
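The MSISR degradation applied to each band (blurring followed by downsampling) can be sketched as follows; the 5×5 kernel size and σ = 1 are illustrative placeholders, not the exact settings used in the experiments:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """2D Gaussian blur kernel, normalized to sum to one."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-ax**2 / (2.0 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()

def blur_and_downsample(band, kernel, scale=2):
    """Filter one band with the kernel ('same' size, zero padding),
    then keep every `scale`-th pixel in both directions."""
    H, W = band.shape
    kh, kw = kernel.shape
    padded = np.pad(band, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    blurred = np.zeros_like(band)
    for i in range(H):
        for j in range(W):
            blurred[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return blurred[::scale, ::scale]

# Apply the degradation band by band to a toy 16x16x6 MS cube.
cube = np.random.default_rng(5).random((16, 16, 6))
k = gaussian_kernel()
lr = np.stack([blur_and_downsample(cube[:, :, b], k, scale=2)
               for b in range(cube.shape[2])], axis=-1)
```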
Kernel       Scale  3DTV   BM3D   DnCNN*
Gaussian     2      24.01  24.67  25.20
Gaussian     3      22.74  23.33  23.43
Gaussian     4      20.41  20.94  21.03
Motion blur  2      23.30  24.49  25.21
Motion blur  3      20.17  20.85  21.01
Motion blur  4      18.86  19.34  19.43

Table 1: Summary of the average SNR (dB) performance of different methods for multispectral image super-resolution.
4 Conclusion
We have presented the MS-RED algorithm for reconstructing MS images from their noisy measurements. The algorithm relies on the integration of a 3D spatiospectral denoiser with the measurement model for maximally exploiting all the available information. In the case of convex data-fidelity terms and nonexpansive denoisers, MS-RED is provably convergent with the rate O(1/t). However, our experiments suggest that the convergence of MS-RED extends to AGM and beyond nonexpansive denoisers. MS-RED is therefore applicable to a large class of MS problems, and our future work will include possible extensions to different inverse problems.
5 Proof of the Theorem
The fixed-point convergence of averaged operators is well known under the name of the Krasnosel'skii–Mann theorem (see Section 5.2 in [31]) and was recently applied to the analysis of PnP [27] and several RED algorithms [17, 19]. Our analysis here extends these results to MS-RED by providing explicit worst-case convergence rates.
In the proof below, we use the shorthand notation D := D_σ and G := ∇g + τ(I − D). We consider the operators

P := I − (2/L)∇g and R := (1/2)(I + D),

and proceed in several steps by relying on fundamental results from monotone operator theory (Section 5.2 in [31]).


Since ∇g is Lipschitz continuous and g is convex, we know that ∇g is (1/L)-cocoercive. Hence, the operator P is nonexpansive.

From the definition of R and the fact that D is nonexpansive, we know that R is nonexpansive.

We know that a convex combination of nonexpansive operators is also nonexpansive; hence, we conclude that

(6) I − γ_max G = (γ_max L/2) P + (2γ_max τ) R, with γ_max := 2/(L + 4τ),

is nonexpansive. This implies that G is γ-cocoercive with γ = γ_max/2 = 1/(L + 4τ).

Consider any x* ∈ Zer(G) and a single iteration of MS-RED, x^k = x^{k−1} − γG(x^{k−1}). We then have

(7) ||x^k − x*||² = ||x^{k−1} − x*||² − 2γ⟨G(x^{k−1}), x^{k−1} − x*⟩ + γ²||G(x^{k−1})||² ≤ ||x^{k−1} − x*||² − γ²||G(x^{k−1})||²,

where we used G(x*) = 0, the γ-cocoercivity of G, and the fact that γ = 1/(L + 4τ).

By rearranging the terms and averaging over t iterations, we obtain

(8) min_{1≤k≤t} ||G(x^{k−1})||² ≤ (1/t) Σ_{k=1}^{t} ||G(x^{k−1})||² ≤ (||x⁰ − x*||² − ||x^t − x*||²)/(γ²t) ≤ R₀²/(γ²t),

which establishes the theorem.
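The convex-combination argument above can be checked numerically on a linear instance, where G(x) = (AᵀA + τ(I − W))x − Aᵀy and nonexpansiveness of I − γG is an eigenvalue condition on its symmetric Jacobian; W below is a symmetric nonexpansive stand-in for the denoiser, and γ = 2/(L + 4τ) is the largest step this argument allows:

```python
import numpy as np

rng = np.random.default_rng(6)
n, tau = 12, 0.8
A = rng.standard_normal((30, n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
W = Q @ np.diag(rng.uniform(-1, 1, n)) @ Q.T   # symmetric, ||W|| <= 1 (nonexpansive)

M = A.T @ A + tau * (np.eye(n) - W)            # Jacobian of the affine operator G
L = np.linalg.norm(A.T @ A, 2)                 # Lipschitz constant of grad g
gamma_max = 2.0 / (L + 4.0 * tau)

# Nonexpansiveness of I - gamma_max * G holds iff all eigenvalues of
# I - gamma_max * M lie in [-1, 1].
eigs = np.linalg.eigvalsh(np.eye(n) - gamma_max * M)
```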
References
 [1] R. M. Willett, M. F. Duarte, M. A. Davenport, and R. G. Baraniuk, “Sparsity and structure in hyperspectral imaging,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 116–126, Jan. 2014.
 [2] G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. S. Kittle, “Compressive coded aperture spectral imaging,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 105–115, Jan. 2014.
 [3] J. Solomon and B. Rock, “Imaging spectrometry for earth remote sensing,” Science, vol. 228, no. 4704, pp. 1147–1152, 1985.
 [4] Y.-R. Chen, K. Chao, and M. S. Kim, “Machine vision technology for agricultural applications,” Comput. Electron. Agr., vol. 36, no. 2–3, pp. 173–191, Nov. 2002.
 [5] E. Hege, D. O’Connell, W. Johnson, S. Basty, and E. Dereniak, “Hyperspectral imaging for astronomy and space surveillance,” Proc. Imaging Spectrometry IX, vol. 5159, pp. 380–391, 2004.
 [6] G. Lu and B. Fei, “Medical hyperspectral imaging: a review,” J. Biomed. Opt., vol. 19, no. 1, 2014.
 [7] H. Othman and S.-E. Qian, “Noise reduction of hyperspectral imagery using hybrid spatial-spectral derivative-domain wavelet shrinkage,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 2, pp. 397–408, Feb. 2006.
 [8] A. S. Charles, B. A. Olshausen, and C. J. Rozell, “Learning sparse codes for hyperspectral imagery,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 5, pp. 963–978, Sept. 2011.
 [9] C. Lanaras, E. Baltsavias, and K. Schindler, “Hyperspectral super-resolution by coupled spectral unmixing,” in Proc. IEEE Int. Conf. Comp. Vis. (ICCV), Santiago, Chile, Dec. 2015, pp. 3586–3594.
 [10] C. I. Kanatsoulis, X. Fu, N. D. Sidiropoulos, and W.-K. Ma, “Hyperspectral super-resolution: A coupled tensor factorization approach,” IEEE Trans. Signal Process., vol. 66, no. 24, pp. 6503–6517, Dec. 2018.
 [11] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X.-L. Zhao, and D. Meng, “Hyperspectral image restoration via total variation regularized low-rank tensor decomposition,” IEEE J. Sel. Top. Appl. Earth. Obs. Remote Sens., vol. 11, no. 4, pp. 1227–1243, Apr. 2018.
 [12] J. Hu, Y. Li, X. Zhao, and W. Xie, “A spatial constraint and deep learning based hyperspectral image super-resolution method,” in Proc. IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, July 2017, pp. 5129–5132.
 [13] Z. Xiong, Z. Shi, H. Li, L. Wang, D. Liu, and F. Wu, “HSCNN: CNN-based hyperspectral image recovery from spectrally undersampled projections,” in Proc. IEEE Int. Conf. Comp. Vis. Workshops (ICCV WKSHP), Venice, Italy, Oct. 2017, pp. 518–525.
 [14] B. Wen, U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos, “DeepCASD: An end-to-end approach for multispectral image super-resolution,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP 2018), Calgary, Canada, March 15–20, 2018, pp. 6503–6507.

 [15] W. Liu and J. Lee, “An efficient residual learning neural network for hyperspectral image super-resolution,” IEEE J. Sel. Top. Appl. Earth. Obs. Remote Sens., vol. 12, no. 4, pp. 1240–1253, Apr. 2019.
 [16] Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1804–1844, 2017.
 [17] E. T. Reehorst and P. Schniter, “Regularization by denoising: Clarifications and new interpretations,” IEEE Trans. Comput. Imag., vol. 5, no. 1, pp. 52–67, Mar. 2019.
 [18] G. Mataev, P. Milanfar, and M. Elad, “DeepRED: Deep image prior powered by RED,” 2019, arXiv:1903.10176 [cs.CV].
 [19] Y. Sun, J. Liu, and U. S. Kamilov, “Block coordinate regularization by denoising,” May 2019, arXiv:1905.05113.
 [20] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, 2004.
 [21] A. Beck and M. Teboulle, Convex Optimization in Signal Processing and Communications, chapter Gradient-Based Algorithms with Applications to Signal Recovery Problems, pp. 42–88, Cambridge, 2009.
 [22] J. M. Bioucas-Dias and M. A. T. Figueiredo, “A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, December 2007.
 [23] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
 [24] U. S. Kamilov, “A parallel proximal algorithm for anisotropic total variation minimization,” IEEE Trans. Image Process., vol. 26, no. 2, pp. 539–548, February 2017.
 [25] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Proc. IEEE Global Conf. Signal Process. and Inf. Process. (GlobalSIP), Austin, TX, USA, December 3–5, 2013, pp. 945–948.

 [26] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Trans. Comp. Imag., vol. 2, no. 4, pp. 408–423, December 2016.
 [27] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” IEEE Trans. Comput. Imaging, 2019.
 [28] S. A. Bigdeli, M. Jin, P. Favaro, and M. Zwicker, “Deep mean-shift priors for image restoration,” in Proc. Advances in Neural Information Processing Systems 31, Long Beach, CA, USA, Dec. 2017.
 [29] N. Parikh and S. Boyd, “Proximal algorithms,” Foundations and Trends in Optimization, vol. 1, no. 3, pp. 123–231, 2014.

 [30] H. Sedghi, V. Gupta, and P. M. Long, “The singular values of convolutional layers,” in International Conference on Learning Representations (ICLR), New Orleans, LA, USA, May 2019.
 [31] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2nd edition, 2017.
 [32] E. K. Ryu and S. Boyd, “A primer on monotone operator methods,” Appl. Comput. Math., vol. 15, no. 1, pp. 3–43, 2016.
 [33] Y. Monno, S. Kikuchi, M. Tanaka, and M. Okutomi, “A practical one-shot multispectral imaging system using a single image sensor,” IEEE Trans. Image Process., vol. 24, no. 10, pp. 3048–3059, 2015.
 [34] J. Liu, Y. Sun, X. Xu, and U. S. Kamilov, “Image restoration using total variation regularized deep image prior,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process., Brighton, UK, May 2019, pp. 7715–7719.
 [35] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, July 2017.