KRISM --- Krylov Subspace-based Optical Computing of Hyperspectral Images

by   Vishwanath Saragadam, et al.

Low-rank modeling of hyperspectral images has found extensive use in numerous inference tasks. In this paper, we present an adaptive imaging technique that optically computes a low-rank representation of the scene's hyperspectral image. We make significant contributions towards simultaneously highly resolvable spectral and spatial measurements by introducing pupil coding. The proposed imager, KRISM, provides optical implementation of two operators on the scene's hyperspectral image --- namely, a spectrally-coded spatial measurement and a spatially-coded spectral measurement. By iterating between the two operators, using the output of one as the input to the other, we show that the top singular vectors and singular values of a hyperspectral image can be computed in the optical domain with very few measurements. We present an optical setup and show several compelling real world examples that demonstrate the effectiveness of our proposed algorithm.


page 6

page 9

page 11

page 12

page 13

page 14

page 15

page 16


Fast Hyperspectral Image Recovery via Non-iterative Fusion of Dual-Camera Compressive Hyperspectral Imaging

Coded aperture snapshot spectral imaging (CASSI) is a promising techniqu...

Unsupervised Spatial-spectral Network Learning for Hyperspectral Compressive Snapshot Reconstruction

Hyperspectral compressive imaging takes advantage of compressive sensing...

Understanding Non-optical Remote-sensed Images: Needs, Challenges and Ways Forward

Non-optical remote-sensed images are going to be used more often in man-...

Programmable Spectral Filter Arrays for Hyperspectral Imaging

Modulating the spectral dimension of light has numerous applications in ...

Snapshot Hyperspectral Imaging Based on Weighted High-order Singular Value Regularization

Snapshot hyperspectral imaging can capture the 3D hyperspectral image (H...

Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging

Intra-operative measurements of tissue shape and multi/ hyperspectral in...

Hyperspectral recovery from RGB images using Gaussian Processes

Hyperspectral cameras preserve the fine spectral details of scenes that ...

1. Introduction

Hyperspectral images (HSIs) capture light intensity of a scene as a function of space and wavelength and have been used in numerous vision [Pan et al., 2003; Tarabalka et al., 2010; Kim et al., 2012], geo-science and remote sensing applications [Cloutis, 1996; Harsanyi and Chang, 1994]. Traditional approaches for hyperspectral imaging, including tunable spectral filters and pushbroom cameras, rely on sampling the HSI, i.e., measuring the amount of light in each spatio-spectral voxel. When imaging at high-spatial and spectral resolutions, the amount of light in a voxel can be quite small, requiring long exposures to mitigate the effect of noise.

HSIs are often endowed with rich structures that can be used to alleviate the challenges faced by traditional imagers. For example, natural scenes are often comprised of a few materials of distinct spectra and further, illumination of limited spectral complexity [Chakrabarti and Zickler, 2011; Finlayson et al., 1994]. This implies that collection of spectral signatures observed at various locations in a scene lies close to a low-dimensional subspace. Instead of sampling the HSI of the scene, one spatio-spectral voxel at a time, we can dramatically speed-up acquisition and light throughput by measuring only projections on this low-dimensional subspace. However, such a measurement scheme requires a priori knowledge of the scene since this subspace is entirely scene dependent. This paper introduces an optical computing technique that identifies this subspace from an iterative and adaptive sensing strategy and constructs a low-rank approximation to the scene’s HSI.

We propose the design of an imager that estimates a low-rank approximation of a HSI by optically implementing the so called Krylov subspace method. At its core, the proposed imager optically implements two operators: a spatially-coded spectrometer and a spectrally-coded spatial imager; when we interpret the HSI as a 2D matrix, these two operators correspond to left and right multiplication of the matrix with a vector. The two operators are subsequently used in an iterative and adaptive imaging procedure whose eventual output is a low-rank approximation to the HSI. The proposed imager is adaptive, i.e., the measurement operator used to probe the scene’s HSI at a given iteration depends on previously made measurements. This is a marked departure from current hyperspectral imaging strategies where the signal model is merely used as a prior for recovery from non-adaptive measurements.


We propose an optical architecture that we refer to as KRylov subspace-based Imaging and SpectroMetry (KRISM) and make the following three contributions:

  • [leftmargin=*]

  • Optical computation of HSIs. We show that optical computing of HSIs to estimate its dominant singular vectors provides significant advantages in terms of increased light throughput and reducing measurement time.

  • Coded apertures for resolving space and spectrum. While high-resolution imaging and spectrometry have been studied extensively before, architectures suitable for one are often undesirable for others. In particular, we show that the use of slits in spectrometry and large open apertures in conventional imaging are ill-suited for the alternate task. To mitigate this, we study the effect of aperture plane coding on HSI and propose to use a coded aperture design that is simultaneously capable of high spatial and spectral resolutions.

  • Optical setup. We design and validate a novel and versatile optical implementation for KRISM that uses a single camera and a single spatial light modulator to efficiently implement spatially-coded spectral and spectrally-coded spatial measurements.

The contributions above are supported via an extensive set of simulations as well as real experiments performed using a lab prototype.


The benefits and contributions described above come with a key limitation:

  • [leftmargin=*]

  • Limited effectiveness when sensing few spectral bands. Our method is only advantageous if there are sufficient number of spectral bands and the hyperspectral image is sufficiently low rank. If we only seek to image with very few spectral bands or if the scene is not well approximated by a low-rank model, then the proposed method performs poorly against traditional sensing methods.

2. Prior work

Figure 2. HSIs are very well modeled by a low-rank approximation. We validate this observation by plotting reconstruction error of a low-rank approximation, in terms of SNR, as a function of the rank of the approximation. We do this for many commonly used HSI datasets. We observe that, across all datasets, the SNR is much higher than 30dB for a rank 10 approximation.
Nyquist sampling of HSIs.

Classical designs for hyperspectral imaging based on Nyquist sampling include the tunable filter — which scans the scene, one narrow spectral band at a time, measuring the image associated with spectral bands at each instant — or using a pushbroom camera — which scans the scene one row at a time, measuring the entire spectrum associated with pixels on the row at each instant. Both approaches are time-consuming as well as light inefficient since each captured image wastes a large percentage of light incident on the camera.

Multiplexed sensing.

The problem of reduced light throughput can be mitigated by the use of multiplexing. One of the seminal results in computational imaging is that the use of multiplexing codes including the Hadamard transform can often lead to significant efficiencies either in terms of increased SNR or faster acquisition [Harwit and Sloane, 1979]. This can either be spectral multiplexing [Mohan et al., 2008] or spatial multiplexing [Sun and Kelly, 2009]. While multiplexing mitigates light throughput issues, it does not reduce the number of measurements required. Sensing at high spatial and/or spectral resolution still requires long acquisition times to maintain a high SNR. Fortunately, HSIs have concise signal models that can be exploited to reduce the number of measurements.

Low-rank models for HSIs.

There are many approaches to approximate HSIs using low-dimensional models; this includes group sparsity in transform domain[Rasti et al., 2013], low rank model [Li et al., 2012; Golbabaee and Vandergheynst, 2012], as well as low-rank and sparse model [Waters et al., 2011; Saragadam et al., 2017]. Of particular interest to this paper is the low-rank modeling of HSIs when they are represented as a 2D matrix (See Figure 2). Low-rank models of HSIs and their variants have found numerous uses in vision and graphics including color constancy [Finlayson et al., 1994], color displays [Kauvar et al., 2015], endmember detection [Winter, 1999], source separation [Hui et al., 2018]

, anomaly detection

[Saragadam et al., 2017], compressive imaging [Golbabaee and Vandergheynst, 2012] and denoising [Zhao and Yang, 2015]. Chakrabarti and Zickler [2011] also provide empirical justification that HSIs of natural scenes are well represented by low rank models.

Table 1. Various sensing strategies for hyperspectral imaging of spatial dimension and spectral bands. Noise in measurement is assumed to be AWGN with variance. While CS techniques require fewer measurements, it is not immune to noise. Our method outperforms CS techniques with higher reconstruction accuracy and needs far fewer measurements.
Compressive hyperspectral imaging.

The low-rank model has also been used for compressive sensing (CS) of HSIs. CS aims to recover a signal from a set of linear measurements that are fewer than its dimensionality [Baraniuk, 2007]. CS achieves this by modeling the sensed signal using lower dimensional representations — low-rank matrices being one such example. The technique that is most relevant to this paper is that of row/column projection [Fazel et al., 2008]. Here, the measurement model is restricted to obtaining row and column projections of a matrix. Given a matrix , and measurement operators , the measurements acquired are of the following form,

When the matrix has a rank , it can be shown that it is sufficient to acquire images and spectral profiles with . In contrast, the method proposed in this paper requires only a number of measurements proportional to the rank of the matrix; however, these measurements are adaptive to the scene. At an increased cost of optical complexity, adaptive sensing promises accurate results with far fewer measurements than CS.

Hyperspectral imaging architectures.

Several architectures have been proposed for CS acquisition of HSIs. The Dual-Disperser Coded Aperture Snapshot Spectral Imager (DD-CASSI) [Gehm et al., 2007] obtains a single image, multiplexed in both spatial and spectral domains by dispersing the image using a prism, passing it through a coded aperture and then recombining using a second prism. In contrast, the Single Disperser CASSI (SD-CASSI) [Wagadarikar et al., 2008] which relies on a single prism that does a spatial coding using a binary mask followed by a spectral dispersion with a prism. Baek et al. [2017] disperse the image by placing a prism right before an SLR camera. The HSI is then reconstructed by studying the dispersion of color at the edges in the obtained RGB image. Takatani et al. [2017] instead propose a snapshot imager that relies on a combination faced of reflectors with with filters. Various other snapshot techniques have been proposed which rely on the general idea of space-spectrum multiplexing [Cao et al., 2016; Lin et al., 2014a; Jeon et al., 2016]. While snapshot imagers require only a single image, they often produce HSIs with reduced spatial or spectral resolutions.

Significant improvements can be obtained by acquiring multiple measurements instead of a single snapshot image. Kittle et al. [2010] obtaining multiple SD-CASSI like measurements by moving the coded aperture. Li et al. [2012] relied on spatially-multiplexed spectral measurements of the scene to reconstruct the HSI. Lin et al. [2014b] improved upon spatially-multiplexed CS by separately coding spatial and spectral domains. To this end, all existing methods trade-off spatial resolution, or spectral resolution, or take long durations to capture. A key insight is that most of these methods are non-adaptive — a sharp contrast to the proposed approach. Table 1 provides an overview of various HS imaging strategies and their relative merits. We next discuss the concept of Krylov subspaces for low-rank approximation of matrices, which motivates iterative and adaptive techniques and paves way to the proposed method.

Krylov subspaces.

Central to the proposed method is a class of techniques, collectively referred to as Krylov subpaces, for estimating singular vectors of matrices. Recall that the Singular Value Decomposition (SVD) of a matrix

is given as , where and are orthonormal matrices, referred to as the singular vectors, and is a diagonal matrix of singular values. Krylov subspace methods allow for efficient estimation of the singular values and vectors of a matrix and enjoy two key properties. First, the singular values and vectors are computed using operators that probe the matrix via left and right multiplications with vectors, i.e., we do not need direct access to the elements of the matrix . Second, the top singular values and vectors of a low-rank matrix can be estimated using a small set of matrix-vector multiplications. These two properties are invaluable when the matrix is very large or when it is implicitly represented using operators or, as is the case in this paper, the matrix is the scene’s HSI and we only have access to optical implementations of the underlying matrix-vector multiplications.

There are many variants of Krylov subspace techniques which differ mainly on their robustness to noise and model mismatches. The techniques in this paper are based on an implementation called the Lanczos bidiagonalization with full orthogonalization [Golub and Kahan, 1965; Hernandez et al., 2007]. Algorithm 1 summarizes this technique.

Left and right multiplication operators to the matrix , target rank , and total iterations
Intialize as a non-zero unit-norm vector
for  to  do
      (right multiplication)
     Orthogonalize with respect to the set
      (left multiplication)
     Orthogonalize with respect to the set
end for
, where
return (truncated SVD)
Algorithm 1 Lanczos bidiagonalization with full orthogonalization
Optical computing of low-rank signals.

Matrix-vector and matrix-matrix multiplications can often be implemented as optical systems. Such systems have been used for matrix-matrix multiplication [Athale and Collins, 1982], matrix inversion [Rajbenbach et al., 1987]

, as well as computing eigenvectors

[Kumar and Casasent, 1981]. Of particular interest to our paper is the optical computing of the light transport operator using Krylov subspace methods [O’Toole and Kutulakos, 2010]. The light transport matrix represents the linear mapping between scene illumination and a camera observing the scene. Each column of the matrix is the image of the scene when only a single illuminant is turned on. Hence, given a vector that encodes the scene illumination, the image captured by the camera is given as . By Helmholtz reciprocity, if we replaced every pixel of the camera by a light source and every illuminant with a camera pixel, then the light transport associated with the reversed illumination/sensing setup is given as . Hence, by co-locating a projector with the camera and a camera with the scene’s illuminants, we have access to both left- and right-multiplication of the light transport operator with vectors; we can now apply Krylov subspace techniques for optically estimating a low-rank approximation to the light transport matrix. This delightful insight is one of the key results in [O’Toole and Kutulakos, 2010].

This paper proposes a translation of the ideas in [O’Toole and Kutulakos, 2010] to hyperspectral imaging. However, as we will see next, this translation is not straightforward and requires novel imaging architectures.

3. Optical Krylov Subspaces for Hyperspectral Imaging

In this section, we provide a high-level description of optical computing of HSIs using Krylov subspace methods.


We represent HSIs in three different ways:

  • [leftmargin=*]

  • — a real-valued function over 2D space and 1D spectrum ,

  • — a three-dimensional tensor with

    spatial “pixels” and spectral bins; this is a sampling of the three-dimensional function, and

  • — a matrix with rows and columns, such that each column corresponds to the vectorized image at a specific spectrum.

The goal is to optically build the following two operators:

  • [leftmargin=*]

  • Spectrally-coded imager — Given a spectral code , we seek to measure the image given as


    The image corresponds to a grayscale image of the scene with a camera whose spectral response is .

  • Spatially-coded spectrometer — Given a spatial code , we seek to measure a spectral measurement given as


    The measurement corresponds to the spectral measurement of the scene, where-in the spectral profile of each pixel is weighted by the corresponding entry in the spatial code .

Note that these two operators correspond to left and right multiplications in Algorithm 1; hence, we can optically implement Algorithm 1 to obtain top singular vectors and values of the HSI matrix .

Number of measurements required.

To obtain a rank- approximation of the matrix , we would require at least spatially-coded spectral measurements — each of dimensionality , and spectrally-coded images — each of dimensionality . Hence, the number of measurements required by the approach is proportional to and, over traditional Nyquist sampling, it represents a reduction in measurements by a factor of


Given that the complexity of the scene is encoded in its low-rank, we can envision dramatic reductions in measurements required over Nyquist-sampling techniques especially when sensing at high spatial and spectral resolutions (see Table 1).

Figure 3. Schematic diagram of simultaneous spatio-spectral measurements with a coded aperture. The diffraction grating disperses light along x-axis. The image of the scene is formed on plane P1. The coded aperture is placed in P2, which introduces a diffraction blur in spatial plane P3, and dictates the spectral profile formed on the plane P4. A slit or an open aperture on P2 is not a good choice for simultaneously high spatial and spectral resolution. Instead, we show design of novel pupil aperture design which enables simultaneous high spatial and spectral resolution.
Challenges in implementing operators and .

Spatially-coded spectral measurements have been implemented in the context of compressive hyperspectral imaging [Sun and Kelly, 2009]. Here, light from a scene is first focused onto a spatial light modulator, that performs spatial coding, and then directed into a spectrometer. For spectral coding at a high-resolution, we could replace the sensor in a spectrometer with a spatial light modulator; subsequently, we can form and measure an image of the coded light using a lens. However, high-resolution spectrometers invariably use a slit aperture that produces a large one-dimensional blur in the spatial image due to diffraction. We show in Section 4 that simultaneous spatio-spectral localization is not possible with either a slit or an open aperture. This leads our discussion to the design of optimal binary coded apertures which enable high spectral and spatial resolutions. Subsequently, in Section 6, we present the design of KRISM — a novel and versatile imaging system, and validate its performance in Section 7.

4. Coded apertures for simultaneous sensing of space and spectrum

In this section, we introduce an optical system capable of simultaneously resolving space and spectrum at high resolutions. Along the way, we explain why traditional systems for measuring images and spectrum are mutually incompatible.

4.1. Optical setup

The ideas proposed in this paper rely on the optical setup shown in Figure 15 which is a slight modification of a traditional spectrometer. An objective lens focuses a scene onto the its image plane, that we denote as P1. This is followed by two 4 relays with a coded aperture placed on the first pupil plane, P2, and a diffraction grating placed at the plane marked as P3. We are interested in the intensity images formed at the planes marked at the “rainbow plane” P4 and the “spatial plane” P5, and their relationship to the image formed on P1, the coded aperture, and the grating parameters.

We assume that the field formed on the plane P1 is incoherent and, hence, we only need to consider its intensity and how it propagates, and largely ignore its phase. Let be the intensity of the field as a function of spatial coordinates and wavelength . Let be the aperture code placed at the plane P2, be the groove density (measured in grooves per unit length) of the diffraction grating in P3, and be the focal length of the lenses that form the 4 relays. The hyperspectral field intensity at the plane P4 is given as


where is the scene’s overall spectral content defined as

The intensity field at the spatial plane P5 is given as



is the 2D spatial Fourier transform of the aperture code

, and denotes two-dimensional spatial convolution along and axes. These expressions arise from Fourier optics [Goodman, 2005] and their derivations are provided in the supplemental material.

Image formed at the rainbow plane P4.

A camera with spectral response placed at the rainbow plane would measure


where . Here, the dimensionless term provides a scaling of the spectrum of the scene and indicates the resolving power of the diffraction grating. For example, we used a focal length and a grating with groove density grooves/mm for the prototype discussed in Section 6; here, . This implies that the spectrum is stretched by a factor of . Therefore, a 1nm of the spectrum maps to 30 m, which is about 6-7 pixel-widths on the cameras that we used. The key insight this expression provides is that the image is the convolution of the scene’s spectrum — denoted as a 1D image — with the aperture code . This implies that we can measure the spectrum of the scene, albeit convolved with the aperture code on this plane; this motivates our naming of this plane as the rainbow plane.

Image at the spatial plane P5.

A camera with the spectral response placed at the spatial plane P5 would measure


is a “spatial image” in that spectral components of the HSI have been integrated out. Hence, we refer to P5 as the spatial plane.

Implementing KRISM operations.

The derivation above suggests that we get a spatial image of the scene formed at the spatial plane P5 and a spectral profile at the rainbow plane P4. We can therefore build the two operators central to KRISM by coding the light at one of the planes while measuring it at the other. For the spectrally-coded imager , we will place an SLM at the rainbow plane P4 while measuring the image, with a camera, at P5. For the spatially-coded spectrometer , we will place an SLM at P3 — which optically identical to P5 — while measuring the image form at P4.

Figure 4. We implemented the setup shown in Figure 15 to verify the effect of different pupil codes. An open aperture leads to sharp spatial images, but the spectrum is blurred, as is evident from the spectrum of LED + 532nm laser. On the other hand, a slit offers high spectral resolution, but the spatial image is blurred. Optimized codes offer invertible spectral blur, and at the same time, invertible spatial blur.
Effect of the aperture code on the scene’s HSI.

Introducing an aperture code on the plane P2 can be interpreted as distorting the scene’s HSI in two distinct way. First, a spectral blur is introduced whose point spread function (PSF) is a scaled copy of the aperture code . Second, a spatial blur is introduced for each spectral band whose PSF is the power spectral density (PSD) of the aperture code, suitably scaled. With this interpretation, the images formed on planes P4 and P5 are a spectral and spatial projection, respectively, of this new blurred HSI. Our proposed technique measures a low-rank approximation to this blurred HSI and we can, in principle, deblur it to obtain the true HSI of the scene. However, the spatial and spectral blur kernels may not always be invertible. As we show next, the choice of the aperture is critical and that traditional apertures such as a slit in spectrometry and an open aperture in imaging will not lead to invertible blur kernels.

4.2. Failure of slits and open apertures

We now consider the effect of the traditional apertures used in imaging and spectrometry — namely, an open aperture and a slit, respectively — on the images formed at the rainbow and the spatial planes. Suppose that the aperture code is a box function of width mm and height mm, i.e.,

Its Fourier transform is the product of two sincs

The spatial image is convolved with the PSD scaled by and so, the blur observed on it has a spatial extent of units. Suppose that and m, the observed blur is . The rainbow plane image , on the other hand, simply observes a box blur whose spatial extent is . Armed with these expressions, we can study the effect of an open and a slit apertures on the spatial and rainbow images.

(a) Optimized code and aperture for .
(b) Raw measurements.
(c) Deconvolved spectrum and image respectively.
Figure 5. Binary code and measured image for an optimally imperceptible code of length 32. Invertible codes ensure that the spectral as well as spatial blur can be deconvolved. Spectrum was deconvolved using Wiener deconvolution, and spatial images were deconvolved using TV prior. Optimized codes offer high spatial as well as spectral resolution.
Scenario #1 — An open aperture.

Suppose that mm, then we can calculate the spatial blur to be m in both height and width, and hence, we can expect a very sharp spatial image of the scene. The blur on the rainbow image has a spread of mm; for relay lenses with focal length mm and grating with groove density grooves/mm, this would be equivalent of a spectral blur of nm. Hence, we cannot hope to achieve high spectral resolution with an open aperture.

Scenario #2 — A slit.

A slit is commonly used in spectrometers; suppose that we use a slit of width m and height mm. Then, we expect to see a spectral blur of nm. The spatial image is blurred along the y-axis by a m blur and along the x-axis by a m blur; effectively, with a m pixel pitch, this would correspond to a 1D blur of 100 pixels. In essence, the use of a slit leads to severe loss in spatial resolution.

Figure 4 shows images formed at the rainbow and spatial planes for various aperture codes. This validates our claim that conventional imagers are unable to simultaneously achieve high spatial and spectral resolutions due to the nature of the apertures used. We next design apertures with carefully engineered spectral and spatial blurs, which can be deblurred in post-processing.

4.3. Design of aperture codes

We now design an aperture code that is capable of resolving both space and spectrum at high-resolutions. Our use of coded apertures is inspired by seminal works in coded photography for motion and defocus deblurring [Raskar et al., 2006; Veeraraghavan et al., 2007; Levin et al., 2007].


Recall that the rainbow plane image is a convolution between a 1D spectral profile and a 2D aperture code . This convolution is one dimensional, i.e., along the -axis; hence, we can significantly simplify the code design problem by choosing an aperture of the form

with being as large as possible. The choice of the rect function along the -axis leads to a high light throughput. In addition to this, from the separability of Fourier transformations as well as convolutions, we can show that the resulting spatial blur along direction is compact.

For ease of fabrication, we further restrict the aperture code to be binary and of the form


where when and zero otherwise. Hence, the mask design reduces to finding an -bit codeword . The term , with units in lengths, specifies the physical dimension of each bit in the code. We fix its value based on the desired spectral resolution. For example, for mm and grooves/mm, a desired spectral resolution of nm would require m.

Our goal is to design masks that enable the following:

  • [leftmargin=*]

  • High light throughput. For a given code length , we seek codes with large light throughput which is equal to the number of ones in the code word

  • Invertibility of the spatial and spectral blur. The code is designed such that the resulting spatial and spectral blur are both invertible.

An invertible blur can be achieved by engineering its PSD to be flat. Given that the spectrum is linearly convolved with , a -point DFT of the code word captures all the relevant components of the PSD of . Denoting this -point DFT of as , we aim to maximize its minimum value in magnitude. Recall from (7) that the spatial PSF is the power spectral density (PSD) of , with suitable scaling. Specifically, the Fourier transform of spatial blur is given by , where is the linear autocorrelation of and represents spatial frequencies. From (8), we get,


where is the discrete linear autocorrelation of . Thus, it is sufficient to maximize to obtain an invertible spatial blur.

We select an aperture code that leads to invertible blurs for both space and spectrum by solving the following optimization problem:


under the constraint that the elements of are binary-valued, and is a constant. For code length sufficiently small, we can simply solve for the optimal code via exhaustive search of all code words. For our optical implementation, we used and an exhaustive search for the optimal code took over a day. The resulting code and its performance in delivering high spatial and spectral resolutions is shown in Figure 5 and 6; we used m and mm for this result. However, such a brute force optimization is not scalable for larger codes. Instead of searching for optimal codes, we can search for approximately optimal codes by iterating over a few candidate solutions. This strategy has previously been explored in [Raskar et al., 2006], where 6 million candidate solutions are searched for a 52-dimensional code. Another alternative would be to start with optimal codes for other applications, such as M-sequences and MURA codes, and perturb them by flipping bits and searching for the best solution.

Figure 6 shows the frequency response of both spectral and spatial blurs for the 32-dimensional optimized code. The advantages of optimized codes are immediately evident — an open aperture has several nulls in spectral domain, while a slit attenuates all high spatial frequencies. The optimized code retains all frequencies in both domains, while increasing light throughput throughput.

(a) Frequency response of spectral blur
(b) Frequency response of spatial blur
Figure 6. Frequency response of spatial and spectral blur for various coded apertures. Width of the slit was set to m, while that of open aperture was set to mm. The length of optimized code was 32-bits, with each bit being m wide, which gives rise to a mm wide aperture. We assume that a slit can resolve up to 1nm. In the graph, cycles/nm corresponds to a spectral resolution of 1nm, and hence the frequency response of the slit falls off after cycles/nm. Similarly, for the given set of parameters, the maximum spatial resolution is and hence is shown till cycles/mm. For spectral measurements, a slit has a flat frequency response, while an open aperture has lot of nulls. On the other hand, for spatial measurements, an open aperture has no nulls, whereas a slit attenuates almost all high frequencies. Our optimized codes have a fairly flat frequency response for spectral blur, while no nulls for spatial blur, which is ideal for high resolution spatio-spectral measurements.

5. Synthetic experiments

We tested KRISM via simulations on four different datasets and compared it against alternate approaches for hyperspectral imaging.


We used the hyperspectral data set in [Arad and Ben-Shahar, 2016], which consists of several high spatial and spectral resolution hyperspectral images covering 519 bands in visible and near IR wavelengths. We downsampled the HSI to to keep computation with CASSI-type simulations tractable. We also used datasets from [Choi et al., 2017] with 31 spectral bands to compare with learning-based techniques. Finally, we present one example from [SpecTIR, [n. d.]] to compare KRISM against Row/Column CS proposed in [Fazel et al., 2008]. Several more simulation results have been shown in supplementary material.

Competing methods.

We compared KRISM against four competing CS hyperspectral imaging techniques. All methods were simulated with 60dB readout and photon noise and 12-bit quantization. Specifics of each simulation model are given below:

  1. [leftmargin=*]

  2. KRISM: We performed a rank-4 approximation of the HSI with 6 spatial and 6 spectral measurements. Diffraction blur due to coded aperture was introduced both in spectral and spatial profiles. Deconvolution was then done using Wiener deconvolution in both spectral and spatial domains.

  3. Row/Col CS [Fazel et al., 2008]: As with KRISM, we performed a rank-4 approximation of the HSI by computing random Gaussian projections with 6 spatial and 6 spectral measurements. Diffraction blur due to coded aperture was introduced as well.

  4. CASSI [Choi et al., 2017]: We used the Single Disperser CASSI (SD-CASSI) architecture from [Wagadarikar et al., 2008] for obtaining a single coded image and recovered the HSI using spectral prior from [Choi et al., 2017]. For data from [Arad and Ben-Shahar, 2016], we the number of spectral bands to 31, as recovery with more than 31 bands lead to highly inaccurate results.

  5. CASSI++ [Kittle et al., 2010]: We used the multi-frame CASSI architecture for obtaining coded images, and recovered the HSI with sparsity prior in wavelet domain. As with CASSI, we reduced the number of spectral bands to 31.

  6. SPC++ [Sun and Kelly, 2009]: We obtained spatially-multiplexed spectral measurements with random permuted Hadamard matrix and recovered the HSI with sparsity in wavelet domain.

(a) Arad and
SNR: 8.9dB,
(c) CASSI++
SNR: 17.3dB,
(d) SPC++
SNR: 20.7dB,
(e) Row/Col CS
SNR: 20.7dB,
SNR: 29.0dB,
(g) Spectrum at marked point
(h) Choi et al.
SNR: 14.22dB,
(j) CASSI++
SNR: 15.9dB,
(k) SPC++
SNR: 14.9dB,
(l) Row/Col CS
SNR: 23.1dB,
SNR: 31.2dB,
(n) Spectrum at marked point
Figure 7. Comparison of reconstructed images for two datasets. “CASSI” represents Single Disperser CASSI, recovered using spectral prior [Choi et al., 2017]. CASSI++ uses multiple spatio-spectral images [Kittle et al., 2010] and was reconstructed with sparsity in wavelet domain. SPC++ represents spatially-multiplexed measurements [Li et al., 2012] and was reconstructed with sparsity in wavelet domain. Row/Col CS represents random row and column projections [Fazel et al., 2008], and KRISM is the proposed method. All experiments were performed with 60dB readout noise and poisson noise.
Simulation results.

Figure 7 shows simulation results with various CS techniques for two datasets. We define reconstruction SNR as

where is the Frobenius norm and is the recovered version of . While SD-CASSI with spectral prior shows promising results with small number of spectral bands, performance quickly degrades once the bands are increased. CASSI++ and SPC++ perform better under varying conditions, but the gains are only visible under low compression ratios (). For large number of spectral bands, only KRISM and Row/Col CS have good RSNR scores. We hence compared the two methods on the SpecTIR dataset [SpecTIR, [n. d.]] that has 178 spectral bands. As seen in Figure 8, KRISM outperforms Row/Col CS by about 8dB, thus establishing the superiority of the proposed adaptive imaging technique. Based on these simulations, we can conclude that KRISM is indeed a compelling methodology when the spatial or spectral resolution is very high — a key requirement for several hyperspectral image processing algorithms. When the number of spectral bands are smaller, the gains are modest but, none-the-less, present. In the next section, we provide an optical schematic for implementing KRISM.

(a) Ground truth ()
(b) Row/col CS
(17.6dB, )
(25.8dB, )
(d) Spectrum at the marked location.
Figure 8. Comparision of Row/Col CS vs KRISM for large number of spectral bands. We took one of SpecTIR datasets[SpecTIR, [n. d.]], which consists of 178 spectral bands, making a good candidate for KRISM. Simulations were done with 60dB readout noise, photon noise, and diffraction blur on spatial images and spectra. For the same compression ratio, KRISM outperforms Row/Col CS by 10dB.

6. The KRISM Optical setup

We now present an optical design for implementing the two operators presented in Section 3 and analyzed in Section 4. For efficiencies in implementation, we propose a novel design that combines both operators into one compact setup.

Figure 9 shows a schematic that uses polarization to achieve both operators with a single SLM and a single camera. First, in Figure 9(a), an SLM is placed 2 away from the grating, and an image sensor 2 away from the SLM, implementing spectrally coded spatial measurement operator . In Figure 9(b), light follows an alternate path where in the SLM is 4 away from the grating; the camera is still 2 away from the SLM. This light path allows us to achieve the spatial-coded spectral measurement operator . The two light pathways are combined using a combination of polarizing beam splitters (PBS) and liquid crystal rotators (LC). The input light is pre-polarized to be either S-polarized or P-polarized. When the light is P-polarized, the SLM is effectively 2 units away from the grating leading to implementation of , the spectrally-coded imager. When the light is S-polarized, the SLM is 4 units away, provided the polarizing beamsplitter, PBS 3 was absent. To counter this, an LC rotator is placed before PBS 3 that rotates S-polarization to P-polarization when switched on. Hence, when S-light is input, in conjugation with the rotator being switched on, we achieve the operator , spatially-coded spectrometer. By simultaneously controlling the polarization of input light and the LC rotator, we can implement both and operators with a single camera and SLM.

Figure 10 shows our lab prototype with the entire light pathway including the coded aperture placed in the relay system between the objective lens and diffraction grating. The input polarization is controlled by using a second LC rotator with a polarizer, placed before the diffraction grating. Finally, an auxiliary camera is used to image the pattern displayed on the SLM. This camera is used purely for alignment of the pattern displayed on the SLM. A detailed list of components can be found in the supplemental material.


Our optical setup requires three broad calibration processes. The first one is camera to SLM calibration. We used an auxiliary camera (Component 12 in Figure 10) that is directly focused on the SLM for this purpose. The second one is calibration of wavelengths. We used several narrowband filters to figure out the location of wavelengths. Finally, third one is radiometric calibration. We used a calibrated Tungsten-Halogen light source to estimate the spectral response of the setup. A detailed description of the calibration procedure can be found in the supplementary material.

Figure 9. Proposed optical setup in spectral coding (a) and spatial coding (b) mode. The optical method relies on polarization to switch between the two types of coding. When the input light is S-polarized, the LC rotator is switched off, enabling spectrally coded spatial measurements. When the input light instead is P-polarized, the LC rotator is turned on, which enables spatially coded spectral measurements. The input light polarization is controlled by a second LC rotator placed before the grating. With novel use of LC rotators, our optical setup enables dual coding of hyperspectral scenes with a single camera-SLM pair.
Figure 10. Photograph of the optical setup we built. (a) is the relay optics which includes the coded aperture. (b) shows the optics from diffraction grating to measurement camera. Components have been marked, grouped and labeled for convenience. The optical paths for spectral as well as spatial coding shown in Fig. 9 have been overlaid in (b) for easy understanding.
System characterization.

Spectral resolution (FWHM) of the setup was computed using several 1nm narrowband filters across visible wavelengths. Our optical setup provided an FWHM of 2.9nm. Spatial resolution was computed by capturing photo of a Siemen star, and then deconvolving with a pointspread function, obtained by capturing image of a pinhole. The frequency at 30% of Modulation Transfer Function, MTF30 was found to be an average of 0.4 line pairs/pixel. All computation details, as well as relevant figures, can be found in the supplementary material.

Diffraction due to LCoS pattern.

Since the SLM is placed at the conjugate plane of either spectral or spatial measurements, the displayed pattern introduces diffraction blur, which potentially makes the measurement model non-linear. To counter this, we add a constant offset to both positive and negative patterns, which makes the diffraction blur compact enough that the non-linearities can be neglected. Since the model is linear, the added offset does not change the final answer.

Figure 11. Data capture during measurement process for a rank-4 approximation of the “Color checker” scene, with iteration number shown in parenthesis. Positive part of data is shown in red and the negative part is shown in blue. KRISM alternates between acquiring spectral and spatial measurements to compute both spatial and spectral singular vectors. The first four iterations involve capturing the middle wavelengths, since they have the highest magnitude. The next set of iterations capture the blue side and the red side. The same is reflected in the spatial images as well.
Spectral deconvolution.

Measurements by our optical system return spectra at each point, convolved by the aperture code. To get the true spectrum, we deconvolved the measured singular vector using a smoothness prior. The specific objective function we used:


where is the true spectrum, is the measured spectrum, is the aperture code, is the first order difference of , and is weight of penalty term. Solution to 11 was computing using conjugate gradient descent. Higher favors smoother spectra, and hence is preferred for illuminants with smooth spectra, such as tungsten-halogen bulb or white LED. For peaky spectra such as Compact Fluorescent Lamp (CFL), a lower value of is preferred. In our experiments, we found to be appropriate for peaky spectra, whereas, was appropriate for experiments with tungsten-halogen illumination. We provided performance of deconvolution with some other algorithms in the supplementary section.

Spatial deconvolution.

Equation (7) suggests that the spatial blur kernel varies across different spectral bands. More specifically, the blur kernels at two different spectra are scaled versions of each other. However, we observed that the variations in blur kernels were not significant when we image over a small waveband — for example, the visible waveband of nm. Given this, we approximate the spatial blur as being spectrally independent, which leads to the following expression:

where is the spatial blur. We estimated the spatial blur kernel by imaging a pinhole and subsequently deconvolved the spatial singular vectors. We used a TV prior based deconvolution using the technique in [Bioucas-Dias and Figueiredo, 2007] using the image of a pinhole as the PSF. Details of the deconvolution procedure are in the supplementary section.

7. Real Experiments

We present several results from real experiments which show the effectiveness of our method. Specifically, we evaluate the ability to measure singular vectors with high accuracy, ability to get high spatial resolution and high spectral resolution. Unless specified, experiments involved a capture of a rank-4 approximation of the HSI, with 6 spectral and 6 spatial measurements. Apart from reconstruction SNR, we present Spectral Angular Mapper (SAM) [Yuhas et al., 1992] similarity between spectra measured by our optical setup and that measured with a spectrometer. SAM between two vectors and is defined as . Since this can be treated as high-dimensional approximation of angle between two vectors, we use SAM to evaluate closeness of spatial and singular vectors as well. We computed the singular vectors with our own implementation of Algorithm 1 in Matlab, with and implemented as function handles. The routine was initialized with all-ones spatial image to speed up convergence. The spatial resolution was pixels and the spectral resolution was bands between 400nm to 700nm, with 3 nm FWHM. The PSF for image blur was estimated by placing a pinhole in front of the camera. Deconvolution was then done using TV-penalty on spatial singular vectors. For verifying spectroradiometric capabilities, we obtained spectral measurements at a small set of spatial points using an Ocean Optics FLAME spectrometer.

Visualization of Lanczos iterations

Figure 11 shows iterations for the “Color checker” scene in Figure 13. The algorithm starts with capture of the brightest parts of the image, corresponding to the spectralon, and the white and yellow patches. Consequently, by iteration 5, the blue and red parts of the image are isolated. The iterations are representative of the signal energy in various wavelengths. Maximum energy is concentrated in yellow wavelengths, due to tungsten-halogen illuminant and spectral response of the camera. This is then followed by the red wavelengths, and finally the blue wavelengths.

(a) Setup with RGB image in inset
(b) First singular vector
(c) Second singular vector
(d) Third singular vector
Figure 12. Comparison of singular vectors captured via spectrally Hadamard-multiplexed sensing and KRISM for the dice scene. the left image singular vector is from Hadamard multiplexed data and the right one is from KRISM. Blue represents negative values and red represents positive values. KRISM method required capturing a total of 6 spectral and 6 spatial measurements to construct 4 singular vectors. While the Nyquist sampling method took a total of 59 minutes, KRISM took under 5 minutes. The SAM value between the singular vectors was less than .
Comparison of measured singular vectors.

We obtain the complete hyperspectral image through a permuted Hadamard multiplexed sampling in the spectral domain for comparison with ground-truth singular vectors. We chose a scene with four colored dice for this purpose, shown in Figure 12 (a). We then computed 4 singular vectors of spectrally Hadamard-multiplexed data. Figure 12 shows a comparison of the spatial and spectral singular vectors. The singular vectors obtained via Krylov subspace technique are close to the ones obtained through Nyquist sampling. On an average, the reconstruction accuracy between KRISM and Hadamard multiplexing was found to be greater than 30dB, while the angle between the singular vectors was no worse than , with the top three singular vector having an error of or small. While Hadamard sampling method took 49 minutes for 256 measurements, KRISM took under 2 minutes for 6 spatial and 6 spectral measurements, thus offering a speedup of . Notice that Hadamard multiplexing gives a increase in speed up, and hence, KRISM offers a speed up of over Nyquist methods.

Color checker.

Since our setup is optimized for viewing in 400nm-700nm, we evaluated our system for color reconstruction of the 24-color Macbeth color chart. The Macbeth color chart consists of a wide gamut of colors in visible spectrum that are spectrally well separated, and forms a good test bench for visible spectrometry We placed the “Color passport” and spectralon plug in front of our camera and illuminated it with a tungsten-halogen DC light source. The spectralon has a spectrally flat response, and hence helps estimate the spectral response of the illuminant+spectrometer system. This enables measurement of true radiance of the color swatches. Since the spectra is smooth, we used least squares recovery of the spectrum, with penalty on the first difference of spectral singular vectors. The captured data was then normalized by dividing spectrum of all points with the spectrum of the spectralon. Figure 13 shows the captured image against reference color chart. Also shown are spectra at select locations plotted along with ground truth spectra. On an average, the RSNR between spectra measured by KRISM and that measured by spectrometer is greater than 19dB, while the SAM is less than .

(a) RGB image
(b) Reference RGB image
(c) Spectra at marked points
Figure 13. Macbeth color chart. Spectra is shown at four locations, compared with spectrometer readings. The SNR is 24dB or higher and the SAM between KRISM spectra and spectrometer readings is less than .
Peaky spectrum illumination

We imaged a small toy figurine of “Chopper”, placed under compact fluorescent lamp illumination (CFL), which has a very peaky illumination, to test high spatio-spectral resolving capability. Figure 1 shows the rendered RGB image and spectra at a representative location. Spectra at a selected spatial point, as measured by KRISM, and Flame spectrometer have been shown as well. The SAM between spectrum measured by KRISM and that measured by spectrometer was found to be . Notice that the location of the peaks, as well as the heights match very well. Indeed, the chopper example establishes the high spatio-spectral resolution capabilities of KRISM.

Other real experiments.

Figure 14 shows several real world examples captured with our optical setup, with a diverse set of objects. For verification with groudtruth, we captured spectral data at select spatial locations. The “Dice” and “Objects” scene captures several more colorful objects with high texture. The zoomed-in pictures show the spatial resolution, while the comparison of spectra highlights the fidelity of our system as a spectral measurement tool. “Ace” scene was captured by placing the toy figurine under CFL illuminant, which is peaky. We could not obtain groundtruth with a spectrometer, as the toy was too small to reliably probe with a spectrometer. The peaks are located within 2nm of groundtruth peaks, and the relative heights of the peaks match the underlying color. “Crayons” scene consists of numerous colorful wax crayons illuminated with a tungsten-halogen lamp. The closeness of spectra w.r.t spectrometer readings shows the spectral performance of our setup. Finally, “Feathers” consists of several colorful feathers illuminated by tungsten-halogen lamp. The fine structure of feathers is well captured by our setup. Across the board, our setup promises high spatial as well as spectral resolution. Most importantly, all this data was obtained with only 6 spatial and 6 spectral measurements, a compression over Nyquist method.

Figure 14. Real data captured with our optical setup. We show the physical setup used for capturing the data, rendered RGB image with some interesting patches zoomed in, and spectra at some points, compared with a spectrometer. The results are promising, as the spectra is very close to spectrometer readings (SNR ¿ 20dB), and the spatial images are captured in high resolution.

8. Discussion and conclusion

We presented a novel hyperspectral imaging methodology called KRSIM, and provided an associated novel optical system for enabling optical computation of hyperspectral scenes to acquire the top few singular vectors in a fast and efficient manner. Through several real experiments, we establish the strength of KRISM in three important aspects: 1) the ability to capture singular vectors of the hyperspectral image with high fidelity, 2) the ability to capture an approximation of the hyperspectral image with or fast acquisition rate compared to Nyquist sampling, and 3) the ability to measure simultaneously at high spatial and spectral resolution. We believe that our setup will trigger several new experiments in adaptive imaging for fast and high resolution hyperspectral imaging.

Added advantages.

There are two more advantages to KRISM. One, since we capture the top few singular vectors directly, there is a data compression from the acquisition itself. Two, the only recovery time involves deconvolution of the spectra, which is far less than the time required for recovery of hyperspectral images from CS measurements.

Effect of photon noise

Although Krylov subspace based methods are very robust to noise [Simoncini and Szyld, 2003], the quality of the singular vectors degrades as the rank of acquisition is increased. This is primarily due to photon noise, as we progressively block most of the energy contained in initial singular vectors. This can be mitigated by increasing the exposure time of measurements for higher singular vectors. All said, the problem of noisy higher singular vectors exists with any kind of sampling scheme and hence needs separate attention via a good noise model.

9. Acknowledgement

The authors would like to acknowledge Prof. Ioannis Gkioulekas (Robotics Institute, Carnegie Mellon University) for the valuable feedback. The authors acknowledge support via the NSF CAREER grant CCF-1652569, the NGIA grant HM0476-17-1-2000, and the Intel ISRA on compressive sensing.


  • [1]
  • Arad and Ben-Shahar [2016] Boaz Arad and Ohad Ben-Shahar. 2016. Sparse Recovery of Hyperspectral Signal from Natural RGB Images. In

    European Conf. Computer Vision

  • Athale and Collins [1982] Ravindra A Athale and William C Collins. 1982. Optical matrix–matrix multiplier based on outer product decomposition. Appl. optics 21, 12 (1982), 2089–2090.
  • Baek et al. [2017] Seung-Hwan Baek, Incheol Kim, Diego Gutierrez, and Min H Kim. 2017. Compact single-shot hyperspectral imaging using a prism. ACM Trans. Graphics 36, 6 (2017), 217.
  • Baraniuk [2007] Richard G Baraniuk. 2007. Compressive sensing. IEEE Signal Processing Magazine 24, 4 (2007), 118–121.
  • Bioucas-Dias and Figueiredo [2007] José M Bioucas-Dias and Mário AT Figueiredo. 2007. A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image processing 16, 12 (2007), 2992–3004.
  • Cao et al. [2016] Xun Cao, Tao Yue, Xing Lin, Stephen Lin, Xin Yuan, Qionghai Dai, Lawrence Carin, and David J Brady. 2016. Computational snapshot multispectral cameras: toward dynamic capture of the spectral world. IEEE Signal Processing Magazine 33, 5 (2016), 95–108.
  • Chakrabarti and Zickler [2011] A. Chakrabarti and T. Zickler. 2011. Statistics of Real-World Hyperspectral Images. In

    IEEE Conf. Computer Vision and Pattern Recognition

  • Choi et al. [2017] Inchang Choi, Daniel S. Jeon, Giljoo Nam, Diego Gutierrez, and Min H. Kim. 2017. High-Quality Hyperspectral Reconstruction Using a Spectral Prior. ACM Trans. Graphics 36, 6 (2017), 218:1–13.
  • Cloutis [1996] EA Cloutis. 1996. Review Article Hyperspectral geological remote sensing: evaluation of analytical techniques. International J. Remote Sensing 17, 12 (1996), 2215–2242.
  • Fazel et al. [2008] M Fazel, E Candes, B Recht, and P Parrilo. 2008. Compressed sensing and robust recovery of low rank matrices. In Asilomar Conf. Signals, Systems and Computers.
  • Finlayson et al. [1994] Graham D Finlayson, Mark S Drew, and Brian V Funt. 1994. Color constancy: generalized diagonal transforms suffice. JOSA A 11, 11 (1994), 3011–3019.
  • Gehm et al. [2007] ME Gehm, R John, DJ Brady, RM Willett, and TJ Schulz. 2007. Single-shot compressive spectral imaging with a dual-disperser architecture. Optics Express 15, 21 (2007), 14013–14027.
  • Golbabaee and Vandergheynst [2012] Mohammad Golbabaee and Pierre Vandergheynst. 2012. Hyperspectral image compressed sensing via low-rank and joint-sparse matrix recovery. In ICASSP.
  • Golub and Kahan [1965] Gene Golub and William Kahan. 1965. Calculating the singular values and pseudo-inverse of a matrix. J. of the Society for Industrial and Appl. Mathematics, Series B: Numerical Analysis 2, 2 (1965), 205–224.
  • Goodman [2005] Joseph W Goodman. 2005. Introduction to Fourier optics. Roberts and Company Publishers.
  • Harsanyi and Chang [1994] Joseph C Harsanyi and C-I Chang. 1994. Hyperspectral image classification and dimensionality reduction: An orthogonal subspace projection approach. IEEE Trans. Geoscience and Remote Sensing 32, 4 (1994), 779–785.
  • Harwit and Sloane [1979] Martin Harwit and Neil J Sloane. 1979. Hadamard transform optics.
  • Hernandez et al. [2007] V Hernandez, JE Roman, A Tomas, and V Vidal. 2007. Restarted Lanczos bidiagonalization for the SVD in SLEPc. STR-8, Tech. Rep. (2007).
  • Hui et al. [2018] Zhuo Hui, Kalyan Sunkavalli, Sunil Hadap, and Aswin C. Sankaranarayanan. 2018. Illuminant Spectra-based Source Separation using Flash Photography. In IEEE Intl. Conf. Computer Vision and Pattern Recognition (CVPR).
  • Jeon et al. [2016] Daniel S Jeon, Inchang Choi, and Min H Kim. 2016. Multisampling compressive video spectroscopy. In Computer Graphics Forum.
  • Kauvar et al. [2015] Isaac Kauvar, Samuel J Yang, Liang Shi, Ian McDowall, and Gordon Wetzstein. 2015. Adaptive color display via perceptually-driven factored spectral projection. ACM Trans. Graphics (TOG) 34, 6 (2015), 165–1.
  • Kim et al. [2012] Min H Kim, Todd Alan Harvey, David S Kittle, Holly Rushmeier, Julie Dorsey, Richard O Prum, and David J Brady. 2012. 3D imaging spectroscopy for measuring hyperspectral patterns on solid objects. ACM Trans. Graphics (TOG) 31, 4 (2012), 38.
  • Kittle et al. [2010] David Kittle, Kerkil Choi, Ashwin Wagadarikar, and David J Brady. 2010. Multiframe image estimation for coded aperture snapshot spectral imagers. Appl. optics 49, 36 (2010), 6824–6833.
  • Kumar and Casasent [1981] BVK Vijaya Kumar and David Casasent. 1981. Eigenvector determination by iterative optical methods. Applied optics 20, 21 (1981), 3707–3710.
  • Levin et al. [2007] Anat Levin, Rob Fergus, Frédo Durand, and William T Freeman. 2007. Image and depth from a conventional camera with a coded aperture. ACM transactions on graphics (TOG) 26, 3 (2007), 70.
  • Li et al. [2012] Chengbo Li, Ting Sun, Kevin F Kelly, and Yin Zhang. 2012. A compressive sensing and unmixing scheme for hyperspectral data processing. IEEE Trans. Image Processing 21, 3 (2012), 1200–1210.
  • Lin et al. [2014a] Xing Lin, Yebin Liu, Jiamin Wu, and Qionghai Dai. 2014a. Spatial-spectral encoded compressive hyperspectral imaging. ACM Trans. Graphics 33, 6 (2014), 233.
  • Lin et al. [2014b] Xing Lin, Gordon Wetzstein, Yebin Liu, and Qionghai Dai. 2014b. Dual-coded compressive hyperspectral imaging. Optics letters 39, 7 (2014), 2044–2047.
  • Mohan et al. [2008] Ankit Mohan, Ramesh Raskar, and Jack Tumblin. 2008. Agile spectrum imaging: Programmable wavelength modulation for cameras and projectors. In Computer Graphics Forum.
  • O’Toole and Kutulakos [2010] Matthew O’Toole and Kiriakos N Kutulakos. 2010. Optical computing for fast light transport analysis. ACM Trans. Graph. 29, 6 (2010), 164.
  • Pan et al. [2003] Zhihong Pan, Glenn Healey, Manish Prasad, and Bruce Tromberg. 2003. Face recognition in hyperspectral images. IEEE Trans. Pattern Analysis and Machine Intelligence 25, 12 (2003), 1552–1560.
  • Rajbenbach et al. [1987] Henri Rajbenbach, Yeshayahu Fainman, and Sing H Lee. 1987. Optical implementation of an iterative algorithm for matrix inversion. Appl. optics 26, 6 (1987), 1024–1031.
  • Raskar et al. [2006] Ramesh Raskar, Amit Agrawal, and Jack Tumblin. 2006. Coded exposure photography: motion deblurring using fluttered shutter. In ACM Transactions on Graphics (TOG), Vol. 25. 795–804.
  • Rasti et al. [2013] Behnood Rasti, Johannes R Sveinsson, Magnus O Ulfarsson, and Jon Atli Benediktsson. 2013. Hyperspectral image denoising using a new linear model and sparse regularization. In Geoscience and Remote Sensing Symposium, IEEE Intl.
  • Saragadam et al. [2017] Vishwanath Saragadam, Jian Wang, Xin Li, and Aswin Sankaranarayanan. 2017. Compressive spectral anomaly detection. In Intl. Conf. Comp. Photography.
  • Simoncini and Szyld [2003] Valeria Simoncini and Daniel B Szyld. 2003. Theory of inexact Krylov subspace methods and applications to scientific computing. SIAM J. Scientific Computing 25, 2 (2003), 454–477.
  • SpecTIR [[n. d.]] SpecTIR. [n. d.]. SpecTIR, Advanced Hyperspectral and Geospatial Solutions. ([n. d.]).
  • Sun and Kelly [2009] Ting Sun and Kevin Kelly. 2009. Compressive sensing hyperspectral imager. In Computational Optical Sensing and Imaging.
  • Takatani et al. [2017] Tsuyoshi Takatani, Takahito Aoto, and Yasuhiro Mukaigawa. 2017. One-shot Hyperspectral Imaging using Faced Reflectors. In CVPR.
  • Tarabalka et al. [2010] Yuliya Tarabalka, Jocelyn Chanussot, and Jon Atli Benediktsson. 2010. Segmentation and classification of hyperspectral images using watershed transformation. Pattern Recognition 43, 7 (2010), 2367–2379.
  • Veeraraghavan et al. [2007] Ashok Veeraraghavan, Ramesh Raskar, Amit Agrawal, Ankit Mohan, and Jack Tumblin. 2007. Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing. ACM Trans. Graph. 26, 3 (2007), 69.
  • Wagadarikar et al. [2008] Ashwin Wagadarikar, Renu John, Rebecca Willett, and David Brady. 2008. Single disperser design for coded aperture snapshot spectral imaging. Appl. Optics 47, 10 (2008), B44–B51.
  • Waters et al. [2011] Andrew E Waters, Aswin C Sankaranarayanan, and Richard Baraniuk. 2011. SpaRCS: Recovering low-rank and sparse matrices from compressive measurements. In Adv. Neural Info. Processing Systems.
  • Winter [1999] Michael E. Winter. 1999. N-FINDR: An algorithm for fast autonomous spectral end-member determination in hyperspectral data. (1999).
  • Yasuma et al. [2010] Fumihito Yasuma, Tomoo Mitsunaga, Daisuke Iso, and Shree K Nayar. 2010. Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum. IEEE trans. on image processing 19, 9 (2010), 2241–2253.
  • Yuhas et al. [1992] Roberta H Yuhas, Alexander FH Goetz, and Joe W Boardman. 1992. Discrimination among semi-arid landscape endmembers using the spectral angle mapper (SAM) algorithm. (1992).
  • Zhao and Yang [2015] Yong-Qiang Zhao and Jingxiang Yang. 2015. Hyperspectral image denoising via sparse representation and low-rank constraint. IEEE Trans. Geoscience and Remote Sensing 53, 1 (2015), 296–308.

Appendix A Appendix

The appendix is organized as follows.

  • [leftmargin=*]

  • Section B — Derivation of spatio-spectral blur. We provide an in-depth derivation of the spatial and spectral blur relationship due to a coded aperture mentioned in Section 4.

  • Sections C — Code design. We provide details on the design of coded aperture, comparisons with alternate codes such as M-sequences, and specifications of the deconvolution technique used for both space and spectrum.

  • Section D — Details of optical implementation. We provide a comprehensive list of components used for building the setup.

  • Section E —Explanation of design choices. We discuss some of the design considerations for implementing our optical setup.

  • Section F — Calibration. This section serves as a guide to calibrate the proposed optical setup.

  • Section G — Real results. We provide additional visualization of the real data shown in Section 7.

  • Section H — Additional simulation results. We present some more simulations with emphasis on performance across a diverse set of datasets.

Appendix B Coded apertures for simultaneous sensing of space and spectrum

Figure 15. Schematic of an optical setup that simultaneously captures spatial images and spectral profiles. P1 is the image plane of the objective lens, P2 contains spatial frequencies of the image. We place a coded aperture, at this plane. P3 contains the image plane again, but this time blurred by the coded aperture. We place a diffraction grating at this plane to disperse light into different wavelengths. P4 then contains the resultant spectrum and P5 once again contains the spatial image, which is simply a copy of P3. We use this schematic to explain the derivation of measurements on planes P4 and P5.

Section 4 provided a brief derivation of the effect of coded aperture on spatial and spectral blur. We do a more rigorous proof here. Figure 15 explains the optical setup that we will consider for all our derivations. In particular, we place a coded aperture at plane P2, which introduces diffraction blur in both spatial and spectral measurements. We obtain spectral measurements on P4 and spatial measurements on P5. The goal of this section is to derive the relationship between measurements on P4 and P5 to the coded aperture and the scene’s HSI.

b.1. Assumptions

The first assumption is that the image is spatially incoherent, an assumption that is realistic for most real world settings. This implies that the spatial frequencies add up in intensities and not amplitudes. Next, the diffraction grating is assumed to disperse light along x-axis, which means that there is no dispersion along the y-axis. Finally, we assume an ideal thin lens model for all the relay lenses . This implies that the Fourier transform property of ideal thin lenses holds for all computations.

b.2. Basics

The derivation in the sequel relies on the so called Fourier transform property of lenses [Goodman, 2005]. Suppose that the complex-valued phase field at the plane is given as , where denote spatial coordinates on the plane and denotes the wavelength of light. Lets place an ideal thin lens with focal length at and whose optical axis is aligned to the axis. The Fourier transforming property says that the complex phasor field that is formed at the plane is given as

where is the 2D Fourier transform of along the first two dimensions.

We can compute the 2D Fourier transform of using the scaling property.

The negative signs in the argument of comes from the Fourier transform being the Hermitian of the inverse Fourier transform. If we now placed a second ideal thin lens of focal length at , then the field at can be computed as

The assembly above, with two lenses at and is referred to as a system. As we see above, the system replicates the field at at , barring a flip of the coordinate axis; this property is useful for the following discussion.

b.3. Propagation of signal

We will use Figure 15 as a guide for the derivation. An objective lens focuses a scene onto its image plane, denoted as P1. Assuming that all light is incoherent, let the complex phasor at P1 be denoted as ; note that intensity of this complex field is the hyperspectral image that we seek to measure, i.e.,

Since we assume an incoherent model, we analyze the system for a point light source and then extend it to a generic image by adding up only intensities.

Field at Plane P1.

Consider a point light source at with complex amplitude . The overall phasor field at P1 is given as

Field at Plane P2.

Using Fourier transform property of lens, we get the field on plane P2 to be,


where is the continuous 2D Fourier transform of along the first two dimensions. The field just after the coded aperture is given by,

Field at Plane P3.

Using Fourier transform property of lens a second time, the field just before the diffraction grating is


where is the continuous 2D FT of . Since the diffraction grating is assumed to disperse along x-axis, we model it as a series of infinite slits, given by,


where is the grove density of the diffraction grating, measured in grooves per unit length. The factor ensures that light does not get amplified as it propagates through the setup. The field just after the diffraction grating is hence given by,

Field at the Rainbow Plane P4.

To calculate the field at P4, we first need an expression for , the 2D Fourier transform of .


The field on plane P4 is given as:


where ”” represents continuous 2D convolution. Since we are only interested in the first order of diffraction, we set k = 1, giving us,

Field at the Spatial Plane P5.

Finally, the field on plane P5 is given by,


b.4. Measurement by camera

Note that a camera can only measure intensity of the field. Assuming a camera with spectral response , the measurement on plane P4 is


where is a 2D convolution and

Extending to a generic incoherent image case, we get the following expression,




We can observe from (22) that the image formed at the rainbow plane P4 is a convolution of the scene’s spectrum modified with the camera response as well as with the square of the aperture code.

Figure 16. Simulation of spatial blur at nm and nm for the optimized code. Simulation was done with a lens of focal length mm, and a camera pixel width of . Our system was designed only for a narrow range of nm, over which the blur was almost the same. Hence we assumed the spatial blur to be spectrally invariant.

Similarly, the intensity on plane P5 is given by,


where can be seen as the spatial PSF. Extending to a generic incoherent image case, we get the following expression,


The expression above suggests that the image associated with each spectral channel is convolved with a different blur kernel; further, the blur kernel for different wavelengths are simple scaled versions of each other. This implies that we need to design codes and deconvolve them for each channel separately. However, this can be avoided for the following two reasons. First, since the kernels are scaled versions of each other, if one of them is invertible then so are the rest. Second, since our optical setup imaged within a narrow spectral band of nm, the variance in spatial blur is not significant. Since the blur of coded aperture is compact, the pixellation makes the differences in blur sizes insignificant. Figure 16 shows a comparison of spatial blur of optimized code at two representative wavelengths of nm and nm as seen by a camera with pixel width. As is evident, the blur size is largely invariant to wavelength, and hence we assumed that the spatial blur is spectrally invariant, and hence, the image formed at plane P5 is given as

For optimizing the spatial blur kernel, we chose a design wavelength of nm. However, if we were to image over a larger span of wavelengths, such as nm, spectral bands have to be deconvolved individually.

Appendix C Code selection

(a) Optimized code and aperture for .
(b) Raw measurements.
(c) Deconvolved spectrum and image respectively
(Spectrum SNR: 21.5dB; Spatial RSNR: 25.3)
(d) Spatially-compact code and aperture for .
(e) Raw measurements.
(f) Deconvolved spectrum and image respectively
(Spectrum SNR: 15.3dB; Spatial RSNR: 25.4)
(g) M-sequence code and aperture for .
(h) Raw measurements.
(i) Deconvolved spectrum and image respectively
(Spectrum SNR: 20.1dB; Spatial RSNR: 25.3)
Figure 17. Comparison of performance of various codes. We compare optimized codes, spatially compact codes and M-sequences. Simulations were performed with added readout and poisson noise. Spectral deconvolution was done with Wiener deconvolution, while spatial deconvolution was done with TV-prior. While spatially-compact codes (row 2) offer better performance in spatial images, spectral deconvolution accuracy is very low. M-sequences (row 3) perform moderately well for both spectral and spatial deconvolution. However, optimized codes perform the best overall (row 1).

We characterize performance of some coded aperture for spatio-spectral resolution, and then explain deconvolution of spectrum and space for the real optical setup.

c.1. Choice of codes

We briefed a way of obtaining doubly invertible codes for high spatial and spectral resolution in Section 4.3. In this section, we show an alternate design, and evaluate their performance for spatial deconvolution and spectral deconvolution.

c.1.1. Spatially compact codes

Instead of pursuing spatially invertible codes, we can optimize for codes which introduce compact spatial blur. In such a case, the goal would be to suppress side lobes of the PSF of spatial PSF. Let be the spatial PSF created by . If be the first and second maximum peak heights of , then maximizing the ratio leads to spatially imperceptible blur. Combined with an invertible spectral blur, we formulate the overall objective function as:


where is a constant. As with doubly invertible codes, we brute forced the optimal solution.

c.1.2. M-sequences

Maximal length sequences, or M-sequences for short, are optimal codes when using circular convolution. Their PSD is flat and hence is desirable as blur functions. However, since our convolution is linear, M-sequences are not necessarily the optimal choice.

(a) Spectra of narrowband filters.
(b) Central wavelength and FWHM of the filters
Figure 18. Spectra of some narrowband filters with datasheet central wavelength and FWHM (in parenthesis) provided in legend. We used Weiner deconvolution to obtain the true spectra. (b) tabulates the estimated central wavelength and FWHM, along with datasheet values. The accuracy of central wavelength and FWHM establishes the accuracy as well as high-resolution capabilities of our optical setup.
(a) White Light Emitting Diode light source
(b) White Compact Fluorescent Lamp
(c) Tungsten-halogen light source..
Figure 19. Spectra of commonly found light sources: (a) Tungsten-halogen, (b) Light Emitting Diode (LED), and (c) Compact Fluorescent Lamp (CFL). Results are shown for Wiener deconvolution, penalized deconvolution and positive-constrained deconvolution. CFL shows some error in blue wavelengths, as the machine vision camera we used was not reliable for deep blue wavelengths. The deconvolved results are robust to choice of algorithm, as the aperture code was designed to be invertible.
(a) Before deconvolution
(b) TV-prior deconvolution
(c) Wiener deconvolution
(d) Richardson-Lucy deconvolution
Figure 20. Deconvolution results of a Siemen star with various deconvolution algorithms. Due to invertible nature of PSF, deconvolution is robust to choice of method. However, TV-prior gave best results in terms of MTF.
Figure 21. MTF plot before and after deconvolution. PSF was estimated by capturing image of a pinhole. Deconvolution was then done using a TV prior on the image gradients. There is a marked improvement in contrast ratio after deconvolution. The MTF30 value for both sectors jumps from 0.15 line pairs/mm to 0.5 line pairs/mm.

c.2. Performance comparison

We compare doubly invertible codes, spatially compact codes and M-sequences for their performance in spatial deconvolution and spectral deconvolution. To test spectral deconvolution, we created a spectrum with two closely spaced narrowband peaks and a broadband peak, and blurred them with various codes. Readout noise and shot noise were added to adhere to real world measurements. Finally, deconvolution was done with wiener filter.

To test spatial deconvolution, we used Airforce target and blurred with the scaled PSD of the pupil codes, and added noise. Deconvolution was done with a TV prior in all cases.

Figure 17 shows a comparison of performance for spectral and spatial deconvolution. As expected, doubly invertible codes perform the best for spectral deconvolution, while spatially compact codes perform worse. Spatially compact codes perform the best in this case, while double invertible codes come close. Since hyperspectral imaging requires good spatial as well as spectral resolution, we chose doubly invertible codes.

c.3. Spectral deconvolution

Recall that our optical setup measures a blurred version of the true spectrum. Specifically, if the aperture code is and the spectrum to be measured is , our optical setup measures , where is additive white gaussian noise. The addition of noise prevents us from simply dividing in Fourier domain. Fortunately, since the aperture code was designed to be invertible, it is fairly robust to noise. A naive solution, such as Wiener deconvolution, hence, works very well. If the noise is too high, or the spectra is known to be smooth, we can impose an penalty on the difference and solve the following optimization problem:


where is the measured spectrum, is the aperture code, is the spectrum to be recovered, and is the first difference of . Further priors, such as positivity constrains give better results as well. Figure 19 shows a comparison of spectra of various commonly available light sources, as well as a comparison with spectrometric measurements. We showed results for three forms of deconvolution, namely, Wiener deconvolution, regularized deconvolution, and positivity constrained deconvolution. Figure 18 shows results for some narrowband filters. We computed the central wavelength and Full Width Half Max (FWHM) for each filter and compared it against the numbers provided by the company. As expected, the FWHM of 1nm filters is between 2nm and 3nm, as the FWHM of our optical setup is 3nm. FWHM for 10nm filters and 40nm filters is close to the ground truth values.

c.4. Spatial deconvolution

The presence of a coded aperture introduces a blur in spatial domain, which is equal to the scaled power spectral density of the coded aperture. Our optimization procedure accounts for invertible spectral blur as well as invertible spatial blur. Hence, deconvolution is stable even in the presence of noise. While naive deconvolution procedures such as Wiener deconvolution or Richardson-Lucy work moderately well, we imposed total variance (TV) penalty on the edges to get more accurate results. Figure 20 shows blurred image of Siemen star, and deconvolution with TV-prior, Wiener and Richardson-Lucy algorithms. As with spectral deconvolution, spatial deconvolution is fairly robust to choice of algorithm. We chose TV-prior, as it returned the sharpest results. Figure 21 shows MTF before and after deconvolution, with TV-prior. Deconvolution signficantly improves the MTF30 value, which jumps from 0.15 line pairs/pixel to 0.5 line pairs/pixel.

Appendix D List of components

Figure 22 shows an annotated image of the optical setup we built along with a list of components along with their company and item number. The system was optimized for a central wavelength of 580nm and hence the relay arm till the diffraction grating has been tilted at with respect to the diffraction grating to correct for schiempflug. Lenses in the relay arm are tilted by with respect to the diffraction grating so that the objective can be aligned with the relay arm without any further tilt. The first beamsplitter (component 8) and the second turning mirror (component 10) have been placed on a kinematic platform to correct for misalignments in the cage system. It is of importance that we chose an LCoS instead of a DMD for spatial light modulation. The reasons:

  • [leftmargin=*]

  • Since the output after modulation by DMD is not rectilinear to the DMD plane, it introduces further scheimpflug, which is hard to correct.

  • DMD acts as a diffraction grating with Littrow configuration, as it is formed of extremely small mirror facets. This will introduce artifacts in measurements which are non-linear.

Figure 22. List of components for the KRISM optical setup. All the important components, their company and item number have been listed for reference. Construction component names such as cage plates, rods, posts and breadboards have been omitted for brevity.

Some more design considerations are enumerated below:

  1. [leftmargin=*]

  2. Lenses. We used 100mm achromats for all lenses except the last lens before cameras. Achromats were the most compact and economical choice for our optical setup, while offering low spatial and spectral distortion.

  3. Polarizing beam splitters. We used wire grid polarizing beamsplitters everywhere to ensure low dependence of spectral distortion on angle of incidence, and increase the contrast ratio.

  4. Using an objective lens for measurement camera. Note that a lens is placed between the LCoS and measurement sensor which converts spatially-coded image to spectrum and coded spectrum to spatial image. Instead of using another achromat, we used an objective lens set to focus at infinity. Since objective lenses are free of any distortions, and are optimized to focus at infinity, this significantly improves resolution of measurements.

  5. Diffraction grating. We used an off-the-shelf transmissive diffraction grating with 300 groves/mm, which offered most compact spectral dispersion without any overlap with higher orders. This ensured that there would be no spectral vignetting at any point in the setup. Further discussion about the choice of grove density is provided in E

  6. Polarization rotators. We bought off-the-shelf Liquid Crystal (Lc) shutters and peeled off the polarizers on either sides to construct polarization shutters. This is the most economic option, while offering contrast ratios as high as 400:1. However, the drawback is that the settling time is 330ms, which prevents their usage at very high rate. A natural workaround is to incorporate binary Ferroelectric shutters which have a low latency rate of 1ms. However, since ours was only a lab prototype, we decided to go with the cheaper option.

Appendix E Design considerations

We outline some design choices we made and the rationale behind them in this section.

e.1. Choice of code size

The pupil code has two free parameters, the length of the code and the pitch size, . The two parameters control the invertibility of spectrum and imperceptibility of spatial images. To understand our design choices, we present constrains and physical dimensions of various measurements. Let each lens in the optical setup have a focal length and aperture diameter . Let pixel pitch of measurement camera be . This implies that the camera can capture all spatial frequencies up to . Let the size of grating be in each dimension and its grove density be .

We capture wavelengths from to . The grating equation is given by, , where is the groves spacing and is the order of diffraction, 1 in our case. Solving for angular spread of spectrum, we get, The size of spectrum then is . The minimum resolvable wavelength is To avoid vignetting in a system, we require that the pupil plane be no larger than , giving us .