Target-based Hyperspectral Demixing via Generalized Robust PCA

02/26/2019 ∙ by Sirisha Rambhatla, et al. ∙ University of Minnesota 0

Localizing targets of interest in a given hyperspectral (HS) image has applications ranging from remote sensing to surveillance. This task of target detection leverages the fact that each material/object possesses its own characteristic spectral response, depending upon its composition. As signatures of different materials are often correlated, matched filtering based approaches may not be appropriate in this case. In this work, we present a technique to localize targets of interest based on their spectral signatures. We also present the corresponding recovery guarantees, leveraging our recent theoretical results. To this end, we model a HS image as a superposition of a low-rank component and a dictionary sparse component, wherein the dictionary consists of the a priori known characteristic spectral responses of the target we wish to localize. Finally, we analyze the performance of the proposed approach via experimental validation on real HS data for a classification task, and compare it with related techniques.



There are no comments yet.


page 1

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Hyperspectral (HS) imaging is an imaging modality which senses the intensities of the reflected electromagnetic waves corresponding to different wavelengths of the electromagnetic spectra, often invisible to the human eye. As the spectral response associated with an object/material is dependent on its composition, HS imaging lends itself very useful in identifying target objects/materials via their characteristic spectra or signature responses, often referred to as "endmembers" in the literature. Typical applications of HS imaging range from monitoring agricultural use of land, catchment areas of rivers and water bodies, food processing and surveillance, to detecting various minerals, chemicals, and even presence of life sustaining compounds on distant planets. However, often, these spectral signatures are highly correlated, making it difficult to detect region of interest based on these endmembers. Analysis of these characteristic responses to localize a target material/object serves as the primary motivation of this work.

I-a Model

We begin by formalizing the problem. Each HS image , is a stack of 2-D images each of size , as shown in Fig. 1(a). Here, is determined by the number of frequencies or frequency bands across which we measure the reflectances. Overall, each volumetric element or voxel

, of a HS image is a vector of length

, and represents the response of the material in a pixel of the 2-D grid to frequencies.

As a particular scene is composed of only a small number of objects/materials, the corresponding characteristic responses of each voxel are low-rank [1]. For example, while imaging an agricultural area, we would expect to record responses from materials like biomass, farm vehicles, roads, houses and water bodies and so on. Further, the spectra of complex materials can be assumed to be a linear mixture of the constituent materials [1, 2], i.e. HS image voxels can be viewed as being generated by a linear mixture model [3]. Since, a particular scene does not contain a large number of distinct materials, we can decompose the responses into a low-rank part and a component which is sparse in a known dictionary – a dictionary sparse part.

Fig. 1: The Hyperspectral image data-cube corresponding to the Indian Pines dataset. The top layer is depicted in RGB for visualization, however each slice is a grayscale image of reflectances.

Formally, let be formed by unfolding the HS image , such that, each column of corresponds to a voxel. Now, let arise as a result of a superposition of a low-rank component with rank , and a dictionary sparse component expressed here as , i.e.,


Here, is an a priori known dictionary composed of normalized characteristic responses of the material/object, or of the constituents of the material we wish to localize, and is the sparse coefficient matrix (also referred to as "abundances" in the literature) with non-zero elements globally, i.e. . Note that, can also be constructed by learning a dictionary based on the known spectral signatures of a target[4, 5, 6].

I-B Prior Art

The model shown in (1) is a close counterpart of a number of well-known problems. To start, in the absence of the dictionary sparse part, (1

) reduces to the popular problem of principal component analysis (PCA)

[7]. This problem also shares its structure with variants of PCA such as robust-PCA[8, 9] where

is absent, outlier pursuit

[10] where and is column sparse, and others [11, 12, 13, 14, 15, 16, 17, 18, 19]. On the other hand, the problem can be identified as that of sparse recovery [20, 21, 22], in the absence of the low-rank part . Following which, sparse recovery methods for analysis of HS images have been explored in [23, 24, 25, 26]. In addition, in a recent work [27], the authors further impose a low-rank constraint on the coefficient matrix . Further, applications of compressive sampling have been explored in [28], while [3] analyzes the case where HS images are noisy and incomplete.

The model described in (1) was introduced in [29] as a means to detect traffic anomalies in a network, wherein, the authors focus on a case where the dictionary is overcomplete, i.e., fat, and the rows of are orthogonal, e.g., . Further, the coefficient matrix is assumed to possess at most nonzero elements per row and column and nonzero elements overall.

In a recent work [30], we analyze the extension of [29] to a case where the dictionary has more rows than columns, i.e., is thin, while removing the orthogonality constraint for both the thin and the fat dictionary cases, when is small. The thin dictionary case is amenable to a number of applications, as often, we are looking for only a small number of signatures. For example, in our motivating application of HS imaging, we are interested in detecting the presence of a small number of spectral signatures corresponding to the target of interest. To this end, we focus our attention on the thin case, although a similar analysis applies for the fat case [30].

Note that, when the known dictionary is thin, we can alternatively multiply (1) on the left by the pseudo-inverse of , say , in which case, the model shown in (1) reduces to that of robust PCA, i.e.,


where and . Therefore, in this case, we can adopt a two-step procedure: (1) recover the sparse matrix

by robust PCA and, (2) estimate the low-rank part using the estimate of

. 111We denote this method of estimating as RPCA in our experiments. Models considered in (1) and (2) are two alternate methods of recovering the components for thin dictionaries. These two methods inherently solve different optimization problems. The proposed method is a one-shot procedure with guarantees to solve the problem, while the conditions under which the pseudo-inverse based procedure succeeds will depend on the interaction between and the low-rank part. Further, our experiments indicate that solving for model in (1) is more robust as compared to RPCA for the classification problem at hand, across different choice of dictionaries.

I-C Our Contribution

In this work, we present a one-step method for target detection in a hyperspectral image with the underlying model of (1). This is done by building upon the results of [30] for the underlying application. In particular, we show the conditions under which target detection is possible with thin dictionaries. In general, the method is also applicable to fat dictionaries and for other applications, which follow the underlying model. The rest of the paper is organized as follows. We formulate the problem, introduce the notation and present the theoretical result for the case at hand, i.e., the thin case, in section II. Next, we present the description of the dataset along with the experimental evaluations in section III. Finally, we conclude this discussion in section IV.

Ii Problem Formulation

Our aim is to recover the low-rank component , and the sparse coefficient matrix , given the dictionary , and samples generated according to the model described in (1). We utilize the structure of the components, and , to arrive at the following convex problem for ,


where, denotes the nuclear norm, and refers to the - norm, which serve as convex relaxations of rank and sparsity (i.e. -norm), respectively. For the application at hand, we present the theoretical result for the case when the number of dictionary elements . Analysis of a more general case can be found in [30]. For the this case, we assume the dictionary to be a frame, i.e. for any vector , we have


where and are the lower and upper frame bounds, respectively, with .

We begin by defining a few relevant subspaces, similar to those used in [29].

Method Threshold Performance at best operating point AUC
4 0.01 XpRA 0.30 0.979 0.023 0.989
RPCA 0.65 0.957 0.049 0.974
MF N/A 0.957 0.036 0.994
MF N/A 0.914 0.104 0.946
0.1 XpRA 0.8 0.989 0.017 0.997
RPCA 0.8 0.989 0.014 0.997
MF N/A 0.989 0.016 0.998
MF N/A 0.989 0.010 0.998
0.5 XpRA 0.6 0.968 0.031 0.991
RPCA 0.6 0.935 0.067 0.988
MF N/A 0.548 0.474 0.555
MF N/A 0.849 0.119 0.939
(a) Learned dictionary,
Method Threshold Performance at best operating point AUC
10 0.01 XpRA 0.6 0.935 0.060 0.972
RPCA 0.7 0.978 0.023 0.990
MF N/A 0.624 0.415 0.681
MF N/A 0.569 0.421 0.619
0.1 XpRA 0.5 0.968 0.029 0.993
RPCA 0.5 0.871 0.144 0.961
MF N/A 0.688 0.302 0.713
MF N/A 0.527 0.469 0.523
0.5 XpRA 1 0.978 0.031 0.996
RPCA 2.2 0.849 0.113 0.908
MF N/A 0.807 0.309 0.781
MF N/A 0.527 0.465 0.539
(b) Learned dictionary,
Method Threshold Performance at best operating point AUC
15 XpRA 0.3 0.989 0.021 0.998
RPCA 3 0.849 0.146 0.900
MF N/A 0.957 0.085 0.978
MF N/A 0.796 0.217 0.857
(d) Average performance
Mean St.Dev. Mean St.Dev. Mean St.Dev.
XpRA 0.972 0.019 0.030 0.014 0.991 0.009
RPCA 0.919 0.061 0.079 0.055 0.959 0.040
MF 0.796 0.179 0.234 0.187 0.814 0.178
MF 0.739 0.195 0.258 0.192 0.775 0.207
(c) Dictionary by sampling voxels,
TABLE I: Simulation results for the proposed approach (), robust-PCA based approach (RPCA), matched filtering (MF) on original data , and matched filtering on transformed data (MF), across dictionary elements , and the regularization parameter for initial dictionary learning step . Threshold selects columns with column-norm greater than threshold such that AUC is maximized. Further,

denotes the case where ROC curve was “flipped” (i.e. classifier output was inverted to improve performance).

Let the pair be the solution to the problem shown in (3). We define as the space of matrices spanning either the row or the column space of the low-rank component . Specifically, let

denote the singular value decomposition of

, then the space is defined as

Next, let be the space spanned by the matrices which have the same support (locations of non-zero elements) as , and let be defined as

Next, let , and be the orthogonal projection operator(s) onto the space of matrices defined above. In addition, we will use and to denote the projection matrices for the column and row spaces of , respectively, i.e., implying the following for any matrix ,

Indeed, there are situations under which we cannot hope to recover the matrices and . To characterize these scenarios, we employ various notions of incoherence, out of which, the first is the incoherence between the low-rank part, , and the dictionary sparse part, ,

where is small when these components are incoherent (which is good for recovery). The next two measures of incoherence can be interpreted as a way to avoid the cases where for , (a) resembles the dictionary , and (b) as the sparse coefficient matrix . To this end, we define respectively, the following to measure these properties,

where . Also, we define . Finally, for a matrix , we use for the spectral norm, where denotes the maximum singular value of the matrix , and .

Ii-a Theoretical Result

We now present our main theoretical result for the thin case, i.e., when , which was developed in [30]. We begin by introducing the following definitions and assumptions.

Definition D.1.

, where is defined as


Definition D.2.


Assumption A.1.

Assumption A.2.

Let , then

The following theorem establishes the conditions under which we can recover the pair successfully.

Theorem 1.

Consider a superposition of a low-rank matrix of rank , and a dictionary sparse component , wherein the dictionary , with , obeys the frame condition with frame bounds , the sparse coefficient matrix has at most non-zeros, i.e., , where , and , with parameters , , and . Then, solving the formulation shown in (3) will recover matrices and if assumptions A.1 and A.2 hold for any , as defined in D.1 and D.2, respectively.

In other words, Theorem 1 establishes the sufficient conditions for the existence of s to guarantee recovery of . We see that these conditions are closely related to the various incoherence measures , and , between , and . Note that, we have here an upper-bound on the global sparsity, i.e., , however, our simulation results in [30] on synthetic data reveal that the recovery is indeed possible even for higher ; See [30] for the details of the analysis. To this end, we propose a randomized analysis of the problem in (3) to improve these results as a future work.

Iii Experimental Evaluation

In this section, we evaluate the performance of the proposed technique on real HS data. We begin by describing the dataset used for the simulations, following which we describe the experimental set-up and conclude by presenting the results.

Iii-a Data

The Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) [31] sensor is one of the popular choices for collecting HS images for various remote sensing applications. In this present exposition, we consider the “Indian Pines” dataset [32], which was collected over the Indian Pines test site in North-western Indiana in the June of 1992. This dataset consists of spectral reflectances across bands in wavelength of ranges nm from a scene which is composed mostly of agricultural land, along with two major dual lane highways, a rail line and some built structures. This dataset is further processed by removing the bands corresponding to those of water absorption, and the resulting HS data-cube with dimensions is as visualized in Fig. 1. This modified dataset is available as “corrected Indian Pines” dataset [32], with the ground-truth containing classes; Henceforth, referred to as the “Dataset".

We will analyze the performance of the proposed technique for the identification of the stone-steel towers (class in the Dataset), shown as ground-truth in Fig. 2(b), which constitutes about voxels in the Dataset. We consider two cases for choosing the known dictionary – (1) when a (thin) dictionary is learned[4, 5, 6] based on the voxels (class-), and (2) when the dictionary is formed by randomly sampling voxels (from class-). This is to emulate the two ways in which we can arrive at the characteristics of a target – (1) the case where the exact signatures are not available, and/or there is noise, and (2) where we have access to the exact signatures we are looking for. Next, we form the data matrix by stacking each voxel of the image side-by-side, which results in a data matrix .

Normalization: For normalization, we divide the data matrix and the dictionary by to preserve the inter-voxel scaling. Next, we normalize the columns of , such that they are unit-norm. In practice, we can store the un-normalized dictionary , consisting of actual signatures of the target material, and can normalize it, as described above, after the HS image has been acquired.

Iii-B Experimental Setup

We evaluate and compare the performance of the proposed method (XpRA) with RPCA using transformed data (as described in section I-B), matched filtering (MF) based approach on data ( denoted by MF), and a MF based approach on , denoted by MF, via receiver operating characteristic (ROC), shown in Table I(a)-(d).

For the proposed technique, we employ the accelerated proximal gradient (APG) algorithm presented in [29] to solve the optimization problem shown in (3). Similarly, for RPCA we employ the APG with transformed data matrix , while setting ; See (2). With reference to selection of tuning parameters for the APG solver for XpRA (RPCA, respectively), we choose , (), , and scan through s in the range (), to generate the ROCs. The output of XpRA and RPCA is a data-cube corresponding to the low-rank part and the dictionary sparse part (which contains the region of interest). We threshold the resulting estimates of the sparse part , based on their column norms. We choose the threshold such that the area under the curve (AUC) is maximized for both cases (XpRA and RPCA).

In case of MF, we form the inner-product of the column-normalized data matrix , say , with the dictionary , i.e., , and select the maximum absolute inner-product per column. For MF, matched filtering corresponds to finding maximum absolute entry for each column of the column-normalized . Next, in both cases we scan through threshold values between to generate the ROC results shown in Table I.

Table I(a)-(b) show the ROC characteristics for the aforementioned methods for a dictionary learned using all voxels of class-. We evaluate the performance of the aforementioned techniques for two dictionary sizes, and , for three values of the regularization parameter , used for the initial dictionary learning step, i.e., and . (The parameter controls the sparsity during the initial dictionary learning step.)

Table I(c) shows the ROC results for the case where we randomly select voxel from class-

to form the known dictionary. Further, we show the overall performance in terms of the mean and standard deviation of ROC metrics (TPR, FPR and AUC), for each algorithm in Table 

I(d), i.e., across different dictionaries considered in Table I(a)-(c).

Iii-C Analysis

As described above, Table I shows the ROCs for the classification performance of the proposed method as compared to RPCA, MF and MF. We note that when we use which has been learned based on samples of class-, matched filtering based approaches under perform (with an exception for and ). RPCA performs well for and . For all other choices the proposed method performs the best. Even for cases in which it is not the best, it still is comparable to the best performer. Following this, we conclude that the performance of XpRA is most reliable across different the choices of the dictionaries. This is also evident by Table I(d) which shows the mean and and standard deviation of ROC metrics (TPR, FPR and AUC). We observe that the mean and standard deviation metrics for the proposed method are superior to other methods across the different dictionary choices.



(a) (c) (d)


(b) (e) (f)
Fig. 2: Recovery of the low-rank component and the dictionary sparse component for different values of for the proposed technique at slice of the data (a), (b) corresponding ground truth for class-. Panel (c) and (d),and (e) and (f) show the recovery of the the low-rank part and the dictionary part, for the at the best operating point and for smaller , respectively, dictionary formed from data voxels(Table I(c)).

There are other interesting recovery results which warrant our attention. For example, Fig. 2 shows the low-rank component and the dictionary sparse component recovered by the proposed method for two different values of , for the case where we form the dictionary by randomly sampling the voxels (Table I(c)). Interestingly, we recover the rail tracks/roads running diagonally on the top-right corner, along with some low-density housing; See Fig 2 (f). This is because the signatures we seek (stone-steel towers) are similar to those of the materials used in these structures. This further corroborates the applicability of the proposed approach in detecting the presence of a particular spectral signature in a HS image. However, this also highlights potential drawback of this technique. As XpRA is based on identifying materials with similar composition, it may not be effective in distinguishing between very closely related classes, say two agricultural crops.

Iv Conclusions

We present a generalized robust PCA-based technique to localize a target in a HS image, based on the a priori known spectral signature of the material we wish to localize. We model the data as being composed of a low-rank component and a dictionary sparse component. Here, the dictionary contains the a priori known spectral signatures of the target. In this work, we adapt the theoretical results of [30] and present the conditions under which such a decomposition recovers the two components for the HS demixing task. Further, we evaluate and compare the performance of the proposed method via experimental evaluations on real HS image data for a classification task.

V Acknowledgement

The authors graciously acknowledge support from the DARPA YFA, Grant N66001-14-1-4047.


  • [1] N. Keshava and J. F. Mustard, “Spectral unmixing,” IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 44–57, Jan 2002.
  • [2] J. B. Greer, “Sparse demixing of hyperspectral images,” IEEE Transactions on Image Processing, vol. 21, no. 1, pp. 219–228, Jan 2012.
  • [3] Z. Xing, M. Zhou, A. Castrodad, G. Sapiro, and L. Carin, “Dictionary learning for noisy and incomplete hyperspectral images,” SIAM Journal on Imaging Sciences, vol. 5, no. 1, pp. 33–56, 2012.
  • [4] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online Learning for Matrix Factorization and Sparse Coding,”

    Journal of Machine Learning Research

    , vol. 11, pp. 19–60, 2010.
  • [5] B. A. Olshausen and D. J. Field, “Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by V1?,” Vision Research, vol. 37, no. 23, pp. 3311–3325, 1997.
  • [6] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: Design of Dictionaries for Sparse Representation,” In Proceedings of SPARS, pp. 9–12, 2005.
  • [7] I. Jolliffe, Principal component analysis, Wiley Online Library, 2002.
  • [8] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?,” Journal of the ACM (JACM), vol. 58, no. 3, pp. 11, 2011.
  • [9] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, “Rank-sparsity incoherence for matrix decomposition,” SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
  • [10] H. Xu, C. Caramanis, and S. Sanghavi, “Robust pca via outlier pursuit,” in Advances in Neural Information Processing Systems, 2010, pp. 2496–2504.
  • [11] Z. Zhou, X. Li, J. Wright, E. J. Candès, and Y. Ma, “Stable principal component pursuit,” in Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on. IEEE, 2010, pp. 1518–1522.
  • [12] X. Ding, L. He, and L. Carin, “Bayesian robust principal component analysis,” IEEE Transactions on Image Processing, vol. 20, no. 12, pp. 3419–3430, 2011.
  • [13] J. Wright, A. Ganesh, K. Min, and Y. Ma, “Compressive principal component pursuit,” Information and Inference, vol. 2, no. 1, pp. 32–68, 2013.
  • [14] Y. Chen, A. Jalali, S. Sanghavi, and C. Caramanis, “Low-rank matrix recovery from errors and erasures,” IEEE Transactions on Information Theory, vol. 59, no. 7, pp. 4324–4337, 2013.
  • [15] X. Li and J. Haupt, “Identifying outliers in large matrices via randomized adaptive compressive sampling,” Trans. Signal Processing, vol. 63, no. 7, pp. 1792–1807, 2015.
  • [16] X. Li and J. Haupt, “Locating salient group-structured image features via adaptive compressive sensing,” in IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2015.
  • [17] X. Li and J. Haupt, “Outlier identification via randomized adaptive compressive sampling,” in IEEE International Conference on Acoustic, Speech and Signal Processing, 2015.
  • [18] X. Li and J. Haupt, “A refined analysis for the sample complexity of adaptive compressive outlier sensing,” in IEEE Workshop on Statistical Signal Processing, 2016.
  • [19] X. Li, J. Ren, Y. Xu, and J. Haupt, “An efficient dictionary based robust pca via sketching,” Technical Report, 2016.
  • [20] B. K. Natarajan, “Sparse approximate solutions to linear systems,” SIAM Journal on Computing, vol. 24, no. 2, pp. 227–234, 1995.
  • [21] D. L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition,” IEEE Transactions on Information Theory, vol. 47, no. 7, pp. 2845–2862, 2001.
  • [22] E. J. Candès and T. Tao,

    “Decoding by linear programming,”

    IEEE Transactions on Information Theory, vol. 51, no. 12, pp. 4203–4215, 2005.
  • [23] Y. Moudden, J. Bobin, J. L. Starck, and J. M. Fadili, “Dictionary learning with spatio-spectral sparsity constraints,” in Signal Processing with Adaptive Sparse Structured Representations(SPARS), 2009.
  • [24] J. Bobin, Y. Moudden, J. L. Starck, and J. Fadili, “Sparsity constraints for hyperspectral data analysis: Linear mixture model and beyond,” 2009.
  • [25] R. Kawakami, Y. Matsushita, J. Wright, M. Ben-Ezra, Y. W. Tai, and K. Ikeuchi, “High-resolution hyperspectral imaging via matrix factorization,” in

    IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

    , June 2011, pp. 2329–2336.
  • [26] A. S. Charles, B. A. Olshausen, and C. J. Rozell, “Learning sparse codes for hyperspectral imagery,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 5, pp. 963–978, Sept 2011.
  • [27] P. V. Giampouras, K. E. Themelis, A. A. Rontogiannis, and K. D. Koutroumbas, “Simultaneously sparse and low-rank abundance matrix estimation for hyperspectral image unmixing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4775–4789, Aug 2016.
  • [28] M. Golbabaee, S. Arberet, and P. Vandergheynst, “Distributed compressed sensing of hyperspectral images via blind source separation,” in Forty Fourth Asilomar Conference on Signals, Systems and Computers, Nov 2010, pp. 196–198.
  • [29] M. Mardani, G. Mateos, and G. B. Giannakis, “Recovery of low-rank plus compressed sparse matrices with application to unveiling traffic anomalies,” IEEE Transactions on Information Theory, vol. 59, no. 8, pp. 5186–5205, 2013.
  • [30] S. Rambhatla, X. Li, and J. Haupt, “A dictionary based generalization of robust PCA,” in IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 2016.
  • [31] Jet Propulsion Laboratory, NASA and California Institute of Technology, “Airborne Visible/Infrared Imaging Spectrometer,” 1987, Available at
  • [32] M. F. Baumgardner, L. L. Biehl, and D. A. Landgrebe, “220 Band AVIRIS Hyperspectral Image Data Set: June 12, 1992 Indian Pine Test Site 3, dataset available via,” Sept 2015.