1 Introduction
Multispectral imaging is the process of recording 2D arrays of information at multiple spectra (frequencies). Having access to such a rich, threedimensional data cube allows different materials to be distinguished due to their differing spectral emission profiles. As a result, multispectral imaging is used in applications ranging from landmine detection, precision agriculture, and medical diagnosis to name but a few of its application domains; for a partial survey see the January 2014 special issue of IEEE Signal Processing Magazine [1]. The increased sensor size and acquisition time are some of the central obstacles to the more widespread use of multispectral imagery.
Snapshot mosaic multispectral sensors allow for a compact video rate multispectral imaging by acquiring only a fraction of the multispectral cube. For example^{1}^{1}1We illustrate the architecture through the IMEC sensor, but note there are numerous similar sensors such as the S 137 system by CUBERT., the IMEC SNm4x4 records 16 bands at a rate of 340 frames per second on a spatial twodimensional pixel domain by only acquiring a single spectrum per pixel; specifically this is achieved by tiling the twodimensional domain by pixel supercells where each supercell acquires the spectra independently.
Herein we demonstrate the efficacy of multiple methods for interpolating the missing values in the above snapshot mosaic data cube by simulating the undersampling from complete threedimensional data cubes provided by DSTL as well as AVIRIS
[2], Stanford SCIEN [3], Nascimento [4], Foster [5], IEEE GRSS Data Fusion Contest [6]. In addition to reviewing the existing stateoftheart interpolation methods in Sec. 2.1, we demonstrate sparse approximation and matrix completion methods in Sec. 2.2 and 2.3 respectively, which we observe to substantially outperform the prior stateoftheart. Over the above diverse data sets, we observe that nonconvex compressed sensing and matrix completion methods initialised with traditional interpolations methods typically improve the peak signaltonoise ratio by to , see Table 1.2 Algorithms for demosaicing
Demosaicing is the process by which the undersampled threedimensional multispectral data cube has the missing entries approximated so as to simulate a full data acquisition. While most threedimensional interpolations methods would be directly applicable, we consider a few methods previously used by the multispectral community, such as direct interpolation as described in Sec. 2.1 as well as sparse approximation regularisation methods in Sec. 2.2 and lowrank structure as presented in Sec. 2.3.
2.1 Direct interpolation methods
Brauers et al. [7]
developed methods to estimate the missing values in the multispectral cube based on extending a spatial bilinear interpolation of the measured values to include any spectral correlation. The weighted bilinear interpolation (WB) for the
pixel regular mosaic filter follows by padding the missing entries with zeros and convolving with the cartesian product of a discrete sevenpixel width filter
. Then, the spectral correlation is included in the spectral difference (SD) method by a) taking the output of WB to independently compute, for each band, say , the difference between the values of the measured pixels for spectrum and the WB interpolated values of every other band restricted to the support of the measured pixels of spectrum , then b) applying WB to these spectral differences c) to form an approximation of the full spectrum by adding the output of step (b) to the difference with at the location of the measured pixels for spectrum .Mihoubi et al. [8] extended the SD approach to consider alternative ways to build correlations between the bands. In intensity difference (ID) they build spectral correlation by first constructing a spatial intensity map whose value at a pixel is the measurement for whichever spectra was measured at that spatial location, then this intensity map is averaged using a weighting based on the number of pixels per spectra contained in the averaging width. See Sec.3.2 for details on the choice of averaging used here and Mihoubi et al. [8] for a more general discussion. Hence, the difference between this averaged intensity map and the measurements for each spectrum is computed, the unknown values zero padded, and each band averaged such as in WB.
Interpolation methods designed in transform domains have been considered by Miao et al. [9] in the binary treebased edgesensing (BTES) method, which has the additional benefit of allowing for variable sampling densities per frequency band. However, we observe it is inferior to SD and ID described above in the setting of snapshot imaging. Pseudopanchromatic image difference (PPID) [10] builds upon BTES and ID. However, due to the applicability of PPID to only some specific mosaic arrangements we leave comparison with our algorithms for a later time.
2.2 Sparse approximation and compressed sensing inpainting
Sparse approximation inpainting allows one to easily consider the interpolation of the undercomplete snapshot data cube in transforms more general than the linear interpolation of (WB). In particular, one can assume that the image is well approximated by a sparse representation in a suitable image domain and exploit this structure to reconstruct it from undersampled measurements [11, 12, 13], e.g. by solving
(1) 
where represents the transform in which the data is known to be compressible and is the full data cube projected by to the undersampled locations. Degraux et al. [14] apply this model to a reconstruction of multispectral imagery acquired by mosaic snapshot cameras.
The primary challenge lies in two aspects: (i) the significant subsampling ratio of , where is the number of spectral bands, and (ii) the selection of the suitable transform . The first problem can be overcome by initialising the stateoftheart sparse approximation algorithms for solving (1) with sufficiently accurate initial estimates, such as those from the classical interpolation methods described in Sec. 2.1. As it was pointed out in [14], we find that, for natural scenes captured by snapshot multispectral imaging, a Kronecker product of 2D wavelet transform spatially and the discrete cosine transform for the spectral dimension is an effective choice for the representation . In particular, the 2D wavelet transform spatially includes elements of nearly global support to allow broad correlations as well as local elements to express fine detail and the discrete cosine transform models the slowly varying values in the spectral dimension.
2.3 Matrix completion
Rather than using local correlations, matrix (and tensor) completion exploit the correlation in the data cube through a lowrank structure, e.g. by solving
(2) 
where is the observed data, is a matrix corresponding to an unfolding of the complete threedimensional data cube, and is a restriction to the measured values as before.
Although the lowrank matrix completion problem is NPhard in general (see [15] for a recent survey) there is a number of computationally fast solvers for the problem with provable convergence guarantees [16, 17, 18]. In fact, matrix completion has been previously applied to the reconstruction of subsampled multispectral imagery [19] by the Coded Aperture Snapshot Spectral Imager (CASSI) [20]. Here we show that matrix completion can be used also in the case of a more severe subsampling by snapshot mosaic camera designs if provided with suitable initialisation.
While there are many nonconvex methods for matrix completion, we showcase two exemplary cases but expect that other nonconvex methods would perform similarly well. We apply conjugate gradient iterative hard thresholding (CGIHT) [18] and alternating steepest descent (ASD) [17] to solve (2), providing them with an initial guess from either SD or ID. We show that both CGIHT and ASD can improve on the classical interpolation methods. This differs substantially from prior work both in terms of using more recent algorithms for matrix completion which have been shown to be more effective and initialising them with prior stateoftheart interpolation methods, and moreover in that unlike [19] which treat each spectral band separately with undersampling, we vectorise the spatial dimensions to create a matrix of size by with undersampling. We observe that this unfolding which allows full correlation between the spectral information is particularly effective, often resulting in reconstructions which are visually indistinguishable from fully acquired data.
3 Numerical simulations
In this section, we show and explain the numerical results obtained by applying the methods discussed above.
3.1 Data sets
We consider the efficacy of the algorithms for demosaicing on the following data sets:

Low altitute airborne images acquired at DSTL Porton Down, in August 2014, from which we selected representative radiance images of fields from a HySpex VNIR1600 linescan camera in the range to .
We processed these data sets to simulate the spectrum measures by the IMEC SNm4x4 snapshot sensor with access to the complete data cube. Then, we undersampled the data cube following the sensor sampling pattern and the following simulations conducted.
3.2 Simulation setup
We implement and test recovery by two interpolation methods ID and SD, two matrix completion methods ASD and CGIHT and a compressed sensing version of CGIHT with a sparsifying transform as a Kronecker products of 2D Daubechies wavelets (W2) in the spatial and 1D Discrete Cosine Transform (DCT) in the spectral domain which we reference as W2DCT.
The iterative algorithms are terminated once the error in iteration is or at the iteration.
We report the quality of an image approximated by demosaicing by the peak signaltonoise ratio (PSNR), defined as the log of the ratio between the maximum possible power of (the slide of) an image and the power of corrupting noise that affects the fidelity of its representation, computed in terms of the average squared difference (or mean squared error, MSE) between the reference image and its reconstruction:
(3) 
where and are the th band slices of the reference cube and the reconstruction, respectively, and denotes the set of all pixels.
We also employ the structural similarity (SSIM) index [21], which is a decimal value between and , with being reachable only in the case of two identical sets of data. SSIM is a perceptionbased model that considers image degradation as perceived change in structural information, while also incorporating important perceptual phenomena, including both luminance masking and contrast masking terms.
3.3 Results
Figure 0(a) shows the PSNR of each spectrum given its band centre, for a sample image from Nascimento [4], for the classical interpolation methods SD and ID as well as the compressed sensing (CS) and matrix completion techniques initialised with SD. Notice that, except for the first band, the matrix completion algorithms outperform SD and ID. On the other hand, the compressed sensing approach does not improve on the interpolation methods. In particular, note the overall incorrect contrast level resulting in yellowing of Figure 0(b). Moreover, we lose the sharpness of the edges in the balcony when employing CS W2DCT (Figure 0(b)), while we recover it with the matrix completion variant of CGIHT (Figure 0(c)).
To further emphasise how the different algorithms differ from each other we show the reconstructions PSNR from the corresponding reference images in Figure 2. Note in Figure 1(c) how ID accurately reconstructs the field image, taking into account the spectral correlation between the bands, but smooths the sharp details. The compressed sensing approach does a better job in identifying the edges (Figure 1(b)), but suffers from the same problem overall. On the other hand, the matrix completion CGIHT outperforms the other methods due to the presence of fieldlike uniformities.
IMAGE  INIT  ASD  CGIHT  W2DCT  

D0201  ID  
SD  
D0301  ID  
SD  
D0303  ID  
SD  
D0307  ID  
SD  
D2301  ID  
SD  
AvLF  ID  
SD  
StCh  ID  
SD  
N04  ID  
SD  
N06  ID  
SD  
N08  ID  
SD  
F05  ID  
SD  
F06  ID  
SD  
F07  ID  
SD  
GB  ID  
SD  
GD  ID  
SD  
GR  ID  
SD 
Average PSNR over the 16 bands, with standard deviations. The best results for each image are highlighted in bold.
Finally, as shown in Table 1, in the majority of cases the best performing algorithms are the matrix completion CGIHT and ASD initialised with SD, with improvements over both SD and ID from to . StCh from Stanford SCIEN [3]
seems to be the only outlier, with an improvement of just
.Being directly related to the number of bands, the rank of the spectral unfolding seems to be effective in capturing the spectral information of the analysed datasets. Our results suggest a high correlation between the frequency bands and a lowrank structure of the spectral unfolding of our images, in which most of the information is contained in the first 3 singular values of the spectral unfolding.
4 Conclusions
We provide a numerical comparison of multispectral demosaicing by traditional interpolation, sparse approximation and matrix completion methods. Our experiments demonstrate that nonconvex matrix completion typically improves reconstruction by to over the current stateoftheart methods. This differs substantially from prior work in terms of employing matrix completion on the spectral unfolding of the image in the context of demosaicing, initialising it with classical interpolation methods and using more recent nonconvex matrix completion algorithms.
References
 [1] W. Ma, J. M. BioucasDias, J. Chanussot, and P. Gader, “Signal and image processing in hyperspectral remote sensing,” IEEE Signal Processing Magazine, vol. 31, no. 1, pp. 22–23, 2014.
 [2] G. Vane, R. O. Green, T. G. Chrien, H. T. Enmark, E. G. Hansen, and W. M. Porter, “The airborne visible/infrared imaging spectrometer (aviris),” Remote Sensing of Environment, vol. 44, no. 23, pp. 127–143, 1993.
 [3] T. Skauli and J. Farrell, “A collection of hyperspectral images for imaging systems research,” in Proc. SPIE 8660, Digital Photography IX, 2013, p. 86600C.
 [4] S. M. C. Nascimento, F. P. Ferreira, and D. H. Foster, “Statistics of spatial coneexcitation ratios in natural scenes,” Journal of the Optical Society of America A, vol. 19, no. 8, pp. 1484, 2002.
 [5] D. H. Foster, S. M. C. Nascimento, and K. Amano, “Information limits on neural identification of colored surfaces in natural scenes,” Visual neuroscience, vol. 21, no. 3, pp. 331–6, 2004.
 [6] “IEEE GRSS Data Fusion Contest,” http://www.grssieee.org/community/technicalcommittees/datafusion/datafusioncontest/, 2018, [Online; accessed 29Oct2018].

[7]
J. Brauers and T. Aach,
“A color filter array based multispectral camera,”
Tech. Rep., Institute of Imaging and Computer Vision, RWTH Aachen University, 2006.
 [8] S. Mihoubi, O. Losson, B. Mathon, and L. Macaire, “Multispectral demosaicing using intensitybased spectral correlation,” in 2015 International Conference on Image Processing Theory, Tools and Applications (IPTA), 2015, pp. 461–466.
 [9] L. Miao, H. Qi, R. Ramanath, and W. E. Snyder, “Binary treebased generic demosaicking algorithm for multispectral filter arrays,” IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3550–3558, 2006.
 [10] S. Mihoubi, O. Losson, B. Mathon, and L. Macaire, “Multispectral demosaicing using pseudopanchromatic image,” IEEE Transactions on Computational Imaging, vol. 3, no. 4, pp. 982–995, 2017.

[11]
G. Kutyniok, E. King, and W. Lim,
“Image inpainting: Theoretical analysis and comparison of algorithms,”
in SPIE Proceedings, Wavelets and Sparsity, 2013, vol. XV(0), pp. 1–11.  [12] B. Dong, H. Ji, J. Li, Z. Shen, and Y. Xu, “Wavelet frame based blind image inpainting,” Applied and Computational Harmonic Analysis, vol. 32, no. 2, pp. 268 – 279, 2012.
 [13] M. Elad, J.L. Starck, P. Querre, and D. L. Donoho, “Simultaneous cartoon and texture image inpainting using morphological component analysis (mca),” Applied and Computational Harmonic Analysis, vol. 19, no. 3, pp. 340 – 358, 2005, Computational Harmonic Analysis  Part 1.
 [14] K. Degraux, V. Cambareri, L. Jacques, B. Geelen, C. Blanch, and G. Lafruit, “Generalized inpainting method for hyperspectral image acquisition,” in 2015 IEEE International Conference on Image Processing (ICIP), 2015, pp. 315–319.
 [15] M. A. Davenport and J. Romberg, “An overview of lowrank matrix recovery from incomplete observations,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 4, pp. 608–622, 2016.
 [16] Z. Wen, W. Yin, and Y. Zhang, “Solving a lowrank factorization model for matrix completion by a nonlinear successive overrelaxation algorithm,” Mathematical Programming Computation, vol. 4, no. 4, pp. 333–361, 2012.
 [17] J. Tanner and K. Wei, “Low rank matrix completion by alternating steepest descent methods,” Applied and Computational Harmonic Analysis, vol. 40, no. 2, pp. 417–429, 2016.
 [18] J. D. Blanchard, J. Tanner, and K. Wei, “Cgiht: Conjugate gradient iterative hard thresholding for compressed sensing and matrix completion,” Information and Inference: A Journal of the IMA, vol. 4, no. 4, pp. 289–327, 2015.
 [19] T. Gélvez, H. Arguello, and H. Rueda, “Coded aperture design for hyperspectral image recovery via matrix completion,” in 2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA), 2015, pp. 1–7.
 [20] M. E. Gehm, R. John, D. J. Brady, R. M. Willett, and T. J. Schulz, “Singleshot compressive spectral imaging with a dualdisperser architecture,” Opt. Express, vol. 15, no. 21, pp. 14013–14027, 2007.
 [21] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
Comments
There are no comments yet.