The estimation and detection of constituent materials, i.e. endmembers (EMs), in multi- or hyperspectral imagery (we use multispectral as a synonym for both), and the unmixing of the given dataset with respect to the extracted EMs, are important steps for classification and structural analysis in fields such as remote sensing and spectral microscopy. In spectral mixture analysis, a linear mixture model is assumed, and spectral unmixing with respect to the constituent EMs provides their abundances at a per-pixel level, representing the material fractions [1, 2]. Commonly, fully constrained linear unmixing is applied, yielding non-negative abundance values that sum to one.
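As a point of reference, the fully constrained linear mixture model is commonly written as follows. The symbols ($x$ for a pixel spectrum, $E$ for the matrix whose $p$ columns are the EMs, $a$ for the abundance vector of that pixel, $\varepsilon$ for the residual) are notation chosen here for illustration.

```latex
x = E a + \varepsilon, \qquad a_i \ge 0 \;\; (i = 1, \dots, p), \qquad \sum_{i=1}^{p} a_i = 1
```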
Endmember extraction (EE) algorithms extract EMs directly from the multispectral image. Without a-priori information, EE algorithms need to determine, at acceptable computational cost, a minimal set of “pure” endmembers whose linear unmixing results in a proper reconstruction of the initial spectral image. The size of the EM set can be estimated by simultaneous extraction and unmixing approaches (guided by manual or statistical thresholds) or by automatic data-driven decisions, such as virtual dimensionality methods.
In this paper, we propose condition-residuum diagrams that relate the EM matrix’s condition number to the root mean square error (RMSE), which serves as residuum measure of the reconstruction after unmixing. Our approach is inspired by the observation that larger EM sets lead to a better reconstruction after unmixing, but at the cost of spectral redundancy, which makes the unmixing, and thus the reconstruction result, numerically unstable. Condition-residuum diagrams provide deeper insight into the relation between redundancy (or instability) and the residuum after unmixing for various EM sets of a given multispectral image. Based on condition-residuum diagrams, we propose an EM reduction algorithm that is applied to a given, over-complete EM set in order to semi-automatically identify the “best” subset of EMs. Here, “best” means that the desired EM set exhibits a low residual error (after unmixing and reconstruction) and a low condition number (indicating numerical stability). Our greedy EM reduction approach determines a nested sequence of EM subsets yielding maximum stability at minimal residuals. Together, the condition-residuum diagram and the reduction algorithm provide a quantified means of selecting a proper EM set and insight into the general composition of the multispectral image with respect to the unambiguity of its endmembers.
We evaluate our procedure using different multispectral datasets with three EE algorithms, showing the usability of our visual condition-residuum diagram and our EM reduction scheme with respect to the quality of the deduced EM sets.
II Related Work
The Pixel Purity Index (PPI) of Boardman [6] projects spectra from the dataset onto randomly selected vectors in order to find vertices of a convex hull of the multispectral data. Orthogonal Subspace Projection (OSP) by Harsanyi and Chang [7] recursively selects the maximum projection of the spectra in the subspace orthogonal to the span of the current EM set. The N-FINDR algorithm of Winter [8] is a simplex growing approach that selects and refines the EM set by maximizing the simplex’s volume. Similar to OSP, the Vertex Component Analysis (VCA) algorithm uses a subspace projection scheme, but generates an intermediate simplex that is used to identify the EMs via projection [9]. The Iterative Error Analysis (IEA) of Neville et al. [10] is an iterative EE process that selects, as new EM, the pixel (or an averaged pixel set) within the image that exhibits the maximal residuum after unmixing.
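The orthogonal-projection selection rule described for OSP can be sketched as follows. This is a minimal illustration, not the implementation used in the evaluation: the function name `osp_extract`, the layout of `X` (bands x pixels) and the initialization with the largest-norm pixel are assumptions made here.

```python
import numpy as np

def osp_extract(X, n_ems):
    """Pick n_ems pixel indices from X (bands x pixels) by orthogonal projection.

    Assumption: the first EM is the pixel with the largest norm.
    """
    selected = [int(np.argmax(np.linalg.norm(X, axis=0)))]
    while len(selected) < n_ems:
        E = X[:, selected]
        # Projector onto the orthogonal complement of span(E).
        P = np.eye(X.shape[0]) - E @ np.linalg.pinv(E)
        # The pixel with the largest projected norm becomes the next EM.
        residual = np.linalg.norm(P @ X, axis=0)
        selected.append(int(np.argmax(residual)))
    return selected

# Toy image: three pure pixels (columns 0..2) followed by three mixtures.
pure = np.array([[2.0, 0.0, 0.0],
                 [0.0, 1.5, 0.0],
                 [0.0, 0.0, 1.0]])
abundances = np.array([[0.5, 0.2, 1 / 3],
                       [0.3, 0.2, 1 / 3],
                       [0.2, 0.6, 1 / 3]])
X = np.hstack([pure, pure @ abundances])
ems = osp_extract(X, 3)
print(ems)  # [0, 1, 2]: the three pure pixels
```

On this toy image every mixture lies strictly inside the simplex spanned by the pure pixels, so the pure pixels are the ones with the largest projections at each step.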
Other approaches iteratively optimize EM sets using spectral unmixing. Based on a direct (iterative) EE algorithm, they use manual residuum thresholds, (in)stability thresholds, or data-driven statistical thresholds to optimize the EM set. Van der Meer [11] presents an iterative spectral unmixing approach that optimizes the EM set generated by PPI by iteratively exchanging EMs according to the residual error in their pixel neighborhood. Song et al. [12] present an EM optimization based on IEA, which excludes EMs with a low residuum gain in their IEA order and EMs with a small spectral angle to the first three EMs.
Plaza and Chang [13] investigate the influence of termination rules applied to EE algorithms with respect to the EM quality. They demonstrate that if the number of extracted EMs is too small, relevant spectra are not extracted, and if the number is too high, interfering substances, i.e. very similar spectra, are selected.
Berman et al. [14] introduced the statistical iterated constrained endmember (ICE) algorithm. This approach solves all tasks in parallel, i.e. endmember selection, unmixing and the determination of the number of endmembers, by combining statistical analysis with the attempt to optimally cover the simplex formed by the scene pixels’ spectra. Zare and Gader [15] extend the ICE algorithm by adding a sparsity promotion scheme. Both approaches generate “synthetic” endmembers that are in most cases not contained in the given data. In contrast, our method focuses on selecting endmembers that are explicitly given in the data to be analyzed.
In general, even if the “correct” EM set size is known, both EE algorithms and EM set optimization approaches often do not extract all relevant EMs. To the best of our knowledge, no EE or optimization algorithm delivers a reliability measure that involves both reconstruction quality (RMSE) and unmixing stability. Our approach, as shown in this paper, is less sensitive to the initial parameter setting (we have only one parameter, which we fixed), provided that the over-complete set is large enough.
Our method comprises two components. The condition-residuum diagram, which provides insight into the relation between the stability of spectral unmixing and the unmixing residuum, is described in Sec. III-A. In Sec. III-B we introduce our EM reduction scheme applied to over-complete EM sets.
Condition-residuum diagrams visualize the relation between the error measure of the image reconstruction after spectral unmixing and the condition number of the EM set, which is a measure for the instability of the unmixing. Given an endmember set $\mathcal{E}$ consisting of $p$ EMs $e_1, \dots, e_p$, $E = [e_1, \dots, e_p]$ is the EM matrix, in which the EMs are arranged as columns. We choose the Root Mean Square Error (RMSE) as residuum measure of $\mathcal{E}$ with respect to the underlying multispectral image $X$:
$$\mathrm{RMSE}(\mathcal{E}) = \frac{1}{\sqrt{N}} \, \lVert E A - X \rVert_F \qquad (1)$$
Here, $\lVert \cdot \rVert_F$ is the Frobenius norm that is applied to the difference between the image reconstruction $E A$, using the abundance matrix $A$ resulting from the spectral unmixing, and the multispectral image $X$. $N$ denotes the number of pixels in image $X$.
To measure the (in)stability of a given EM set $\mathcal{E}$, we choose the matrix condition number $\kappa$, which measures the stability of the linear transformation given by a matrix, i.e. how much the output value of the linear function can change for a small change in the input argument. According to van der Meer and Jia [16], the condition number is a direct measure for the collinearity of a given EM set. The matrix condition number of an EM set $\mathcal{E}$,
$$\kappa(\mathcal{E}) = \frac{\sigma_{\max}(E)}{\sigma_{\min}(E)},$$
is computed as the ratio between the largest and the smallest singular value of the matrix $E$ composed of the EMs in $\mathcal{E}$.
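Both measures can be computed in a few lines of NumPy. The sketch below is illustrative only: the function names, the matrix layout (bands x EMs for `E`, EMs x pixels for `A`, bands x pixels for `X`) and the toy data are assumptions made here, and the RMSE is normalized by the number of pixels as described in the text.

```python
import numpy as np

def condition_number(E):
    """Ratio of the largest to the smallest singular value of the EM matrix."""
    s = np.linalg.svd(E, compute_uv=False)  # singular values, descending
    return s[0] / s[-1]

def rmse(E, A, X):
    """Frobenius norm of the reconstruction error, normalized by the pixel count."""
    n_pixels = X.shape[1]
    return np.linalg.norm(E @ A - X, ord="fro") / np.sqrt(n_pixels)

# Two orthonormal EMs and two pixels that they reconstruct exactly.
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
A = np.array([[0.25, 1.0],
              [0.75, 0.0]])
X = E @ A
print(condition_number(E), rmse(E, A, X))  # 1.0 0.0

# Two nearly collinear EMs yield a large condition number.
E_redundant = np.array([[1.0, 1.0],
                        [0.0, 1e-3],
                        [0.0, 0.0]])
print(condition_number(E_redundant))
```

The first EM set sits at the point with condition number 1 and residuum 0, while the nearly collinear set has a condition number of roughly 2000.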
The condition-residuum diagram plots the condition numbers and residuum values of several EM sets in order to assess their individual numerical instability in the unmixing process in relation to their resulting reconstruction residual error (see Eq. (1)) after unmixing. The diagram supports the simultaneous evaluation of both quality criteria and is thus a revealing means for comparing different endmember sets. The “ideal” EM set exhibits a condition number of 1 and a residuum of 0. In practice, there are no ideal EM sets; an EM set either exhibits a significant residual error, i.e. it does not reconstruct the image very well, or it is partially redundant, which restricts the numerical stability of the unmixing. Finding a “good” EM set can visually be interpreted as finding an EM set close to the ideal point in the condition-residuum diagram (see discussion in Sec. IV).
III-B Reduction of Over-Complete EM Sets
Based on the condition-residuum diagram, we propose an EM reduction scheme that is applied to an over-complete EM set $\mathcal{E}_0$ and that results in a sequence of nested sets $\mathcal{E}_0 \supset \mathcal{E}_1 \supset \dots$ by iteratively removing endmembers.
The main idea in selecting an EM for removal from the current set $\mathcal{E}_k$ is to optimize the remaining set to have as low as possible a residual error (see Eq. (1)) and condition number. This approach follows the basic principle that an “optimal” EM set should describe the spectral variability of the dataset with a minimal EM set size. We solve this multi-criteria optimization problem by combining both measures, weighted by a mixture parameter $\alpha$. Thus, $\mathcal{E}_{k+1}$ results from $\mathcal{E}_k$ by removing the EM determined by Eq. (2).
Reducing an EM set naturally results in a descending condition number and in an ascending RMSE. Thus, our optimization approach maximizes the gain in condition number and minimizes the loss in RMSE. Our scheme works on normalized measures, as the absolute values of the two measures are not comparable.
In our empirical evaluation we found a good choice for the $\alpha$-parameter. In Sec. IV, we use $\alpha \in \{0, 1, 0.5\}$, which leads to EMs selected for removal that depend purely on the condition number $\kappa$, purely on the RMSE, and equally on both measures, respectively. Considering the combined reduction with $\alpha = 0.5$, our procedure selects the EM to be removed such that the residual error stays small while the condition number decreases as much as possible, resulting in a more stable EM set.
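A greedy reduction loop of this kind can be sketched as follows. Several parts are assumptions made for illustration and are not taken from the paper: the score is a weighted sum of min-max normalized RMSE and condition number (a stand-in for the exact form of Eq. (2)), unconstrained least squares replaces fully constrained unmixing to keep the example self-contained, and all function names and the toy data are hypothetical.

```python
import numpy as np

def condition_number(E):
    s = np.linalg.svd(E, compute_uv=False)
    return s[0] / s[-1]

def rmse(E, X):
    # Assumption: unconstrained least-squares unmixing instead of fully constrained.
    A, *_ = np.linalg.lstsq(E, X, rcond=None)
    return np.linalg.norm(E @ A - X) / np.sqrt(X.shape[1])

def min_max(values):
    # Assumed normalization: rescale the candidates' values to [0, 1].
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 1e-12 else np.zeros_like(v)

def greedy_reduce(E, X, alpha=0.5, min_size=1):
    """Return a nested sequence of EM index tuples, removing one EM per step."""
    current = list(range(E.shape[1]))
    sequence = [tuple(current)]
    while len(current) > min_size:
        # One candidate subset (and one unmixing) per EM of the current set.
        candidates = [[j for j in current if j != i] for i in current]
        kappas = [condition_number(E[:, c]) for c in candidates]
        errors = [rmse(E[:, c], X) for c in candidates]
        # alpha = 0: condition number only; alpha = 1: RMSE only.
        score = alpha * min_max(errors) + (1.0 - alpha) * min_max(kappas)
        current = candidates[int(np.argmin(score))]
        sequence.append(tuple(current))
    return sequence

# Toy EM matrix (columns are EMs); column 3 nearly duplicates column 2.
E = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 0.05]])
rng = np.random.default_rng(0)
A_true = rng.dirichlet(np.ones(3), size=50).T  # 3 x 50 abundances
X = E[:, :3] @ A_true                          # image built from EMs 0..2 only
sequence = greedy_reduce(E, X, alpha=0.5, min_size=2)
print(sequence)  # the first removal drops the redundant column 3
```

In this toy case, removing the near-duplicate EM gives both the lowest condition number and the lowest RMSE among the four candidates, so it is removed first regardless of the weighting.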
We deliberately do not propose an “optimal” EM set size based on our reduction approach. The “optimal” sizes of the EM sets used in our evaluation (Sec. IV) have slightly varying positions in the condition-residuum diagram, i.e. application-specific considerations play an important role. Furthermore, even sophisticated automatic EM set size estimators such as HySime [17] are often far off the reference size (see Tab. I).
We evaluate the condition-residuum-diagrams and our EM reduction scheme using various sample datasets (see Sec. IV-A). In Sec. IV-B we present the main properties of the diagram and the reduction scheme and compare the different reduction schemes based on Eq. (2), i.e. using solely the condition number or RMSE, or the combined version.
IV-A Datasets and Endmember Extraction Algorithms
Table I lists the parameters of the datasets used for evaluation. We use datasets for which reference numbers for the “best” EM set size are known in the literature. We give the EM set sizes as estimated by the HySime algorithm [17] as a further reference. The Cuprite dataset is available online at [18] and the other datasets at [19].
We choose OSP, N-FINDR and VCA as EE algorithms for our evaluation, where we use our own implementation of OSP and the N-FINDR and VCA implementations of the Hyper Spectral Toolbox [20]. As suggested by Plaza et al. [21], we deactivated the noise reduction stage of VCA for a fair comparison. Besides this, we use the implementation [22], available online, of the constrained least squares unmixing of Chouzenoux et al. [23]. For every EE algorithm we compute an over-complete endmember set with twice the reference size and reduce it using our greedy reduction algorithm (see Sec. III-B).
IV-B Evaluation Scheme
To evaluate both our condition-residuum diagram and our reduction algorithm, we plot additional information in the diagram that would not be determined in a practical use case (see Fig. 1). We plot the reduction curves of the reduction schemes for the different values of $\alpha$ (see Sec. III-B). Additionally, we display EM sets resulting from EE algorithms that directly generate the reference number of EMs and twice that number. For the non-deterministic algorithms, i.e. N-FINDR and VCA, we generate 10 EM sets; for the deterministic OSP algorithm, only one EM set per set size. For the Salinas-A dataset in Fig. 1, we additionally generate all possible subsets of the over-complete EM set, denoted as “Bruteforce”.
Note that we crop the diagram to the area close to the ideal condition-residuum point and thus discard examples far off the region of good EM sets. Therefore, in some of the diagrams, direct EM set extractions of either set size are cropped as well if their condition-residuum values lie outside the area of interest.
IV-C Quality of Reduction Schemes
Considering the general shape of the reduction curves, all of them are nearly L-shaped, where the most interesting region is in the kink of the L, close to the optimal point ($\kappa = 1$, RMSE $= 0$). The $\alpha$-parameter properly controls the impact of both measures, i.e. $\kappa$ and RMSE, on the reduction process. If the reduction completely relies on $\kappa$ ($\alpha = 0$), it leads to a fast increase in RMSE, while the condition number stays at a high level. Thus, it is advisable to use higher values of $\alpha$, leading to a higher influence of the relative RMSE measure in Eq. (2). The combined reduction scheme ($\alpha = 0.5$) and the RMSE-based scheme ($\alpha = 1$) lead to good results. As expected, the combined reduction tends to reduce the condition number faster, at the cost of a slightly increased RMSE. Comparing the full curves, these two reductions are alike: one curve mainly runs below the other in Figs. (a), (b), (c), (e), (f), and (l), while the opposite is the case in Figs. (d), (g), (h), (j), and (k). Also, when considering the kink of the L-shaped curves, either method can deliver the more regular shape, i.e. one delivers superior results in Figs. (c), (f), and (l), whereas the other is better in Figs. (j) and (k).
Fig. 1 shows all “Bruteforce” subsets of the over-complete EM set. Our combined and RMSE-based reduction schemes select EM sets whose condition number and RMSE are as good as possible among all possible EM subsets.
Direct EM set Extraction
Evaluating our method against direct EE algorithms that generate the reference number of EMs, the combined reduction scheme and the RMSE-based reduction commonly exhibit good results. This is due to the fluctuation of these methods in generating the initial over-complete EM set. Compared with direct EE, the reduction results in EM sets with better condition number and RMSE (Figs. (a), (b), (e), (f), (i), (j)), with better RMSE but worse condition number (Figs. (a), (b), (c), (d), (g)), or with better condition number but worse RMSE (Figs. (c), (h), (k), (l)). Most of the latter two cases are due to the non-deterministic nature of the underlying EE algorithms, i.e. N-FINDR and VCA. See also Sec. V-C, where we discuss the specific situation in Figs. (c) and (k).
V-A Optimal Size of Endmember-Sets
Our condition-residuum diagram provides visual guidance in selecting EM sets with low RMSE and low condition number (high unmixing stability), i.e. EM sets that are close to the optimal case (RMSE $= 0$ and $\kappa = 1$). In all of our test cases, the reduction curves of the combined and the RMSE-based scheme exhibit a quite pronounced shape that indicates EM sets close to the theoretical optimum. Compared to the reference EM set sizes provided in the literature, we see that in some cases these reference sizes are located close to the main bend of the curve (see Figs. (c), (b), (c), (d), (f)), while in other cases the reference EM set sizes are too conservative, i.e. smaller EM sets lead to more stable results at minimal loss in RMSE (see Figs. (a), (j), (k)), or too progressive, i.e. larger EM sets lead to a significantly lower RMSE at minimal loss in stability (see Figs. (a), (b), (l)). In general, our semi-automatic EM set selection approach supports the choice of the EM set size (and potentially the EM set itself) from an application perspective. It can easily be used in combination with automatic EM set size estimation algorithms.
V-B Algorithmic Complexity
Our reduction approach, starting with an over-complete EM set, requires $n$ unmixing steps at each level, where $n$ is the size of the current EM set (brute-force testing would be of exponential order). As fully constrained unmixing is computationally quite expensive, we experimented with unconstrained unmixing, which is computationally far less demanding. In approximately 50% of our tests, the results were qualitatively the same as with fully constrained unmixing, i.e. the shapes of the reduction curves and the relative locations of the EM sets on both curves are very close. In the remaining cases, the two reduction curves differ significantly, so selecting the set size based on unconstrained unmixing may lead to wrong interpretations in these cases. Making our reduction scheme more efficient is part of our future research.
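To make the difference in order of growth concrete, the following sketch counts the unmixing runs of a greedy reduction (one run per candidate removal at each level) against testing every proper subset. The function names and the set size of 12 are hypothetical and chosen only for illustration.

```python
from math import comb

def greedy_unmixings(n_start, n_end):
    """Unmixing runs when reducing from n_start to n_end EMs, one EM per level."""
    # A level with k EMs tests k candidate removals.
    return sum(range(n_end + 1, n_start + 1))

def brute_force_unmixings(n_start, n_end):
    """Unmixing runs when testing every proper subset with at least n_end EMs."""
    return sum(comb(n_start, k) for k in range(n_end, n_start))

print(greedy_unmixings(12, 1))       # 77
print(brute_force_unmixings(12, 1))  # 4094
```

For 12 initial EMs the greedy scheme needs 2 + 3 + ... + 12 = 77 runs, whereas exhaustive testing needs 2^12 - 2 = 4094.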
The result of our reduction approach strongly depends on the quality of the initial over-complete EM set. For non-deterministic EE algorithms, the spread of the initial EM sets may be quite significant (e.g. Fig. (b)). Two very specific cases are the N-FINDR result applied to the Kennedy Space Center dataset (Fig. (k)) and the VCA result applied to the Pavia University dataset (Fig. (c)). Here, the reduction schemes deliver significantly worse results than the directly extracted EM sets with the “optimal” set size. These results are quite counter-intuitive, as N-FINDR and VCA deliver worse results in terms of RMSE when extracting the over-complete number of EMs than when extracting the reference number. Thus, it may be advisable to run the EE algorithm again after having semi-automatically selected the EM set size using our reduction scheme.
V-D EE Algorithms Comparison
Even though it is not the goal of our paper to explicitly compare EE algorithms, our evaluation implies some tendencies. In some cases OSP delivers high-quality results (e.g. Fig. (a)), but in most cases its reconstruction quality falls behind that of N-FINDR and VCA. When directly generating EM sets, the spread of the N-FINDR results is smaller than that of VCA (see e.g. Figs. (h) and (h)). Factoring out the spread of both N-FINDR and VCA, these EE algorithms show a comparable performance. The evaluation shows that every EE algorithm clearly benefits from our reduction scheme.
We introduced and analyzed the concept of condition-residuum diagrams in combination with an EM set reduction scheme, based on combined condition and residuum optimization, that is applied to over-complete EM sets. We show that this approach can be used as visual guidance in selecting the EM set size and the EM set itself. We evaluated our approach for three common EE algorithms (OSP, N-FINDR and VCA) and with three different energy functionals for optimized EM reduction. Here, the RMSE-based and the combined RMSE-condition schemes show good results, with a slight advantage in favor of the combined scheme. In the future, we will investigate data-driven approaches to steer the mixture parameter $\alpha$, alternative optimization energy functionals and EM replacement approaches, as well as means of accelerating the costly fully constrained unmixing during reduction. Furthermore, combining our reduction approach with a spectral feature selection or spectral weighting approach might be beneficial.
This work was supported by the German Research Foundation (DFG) under research grant KO 2960/10-2.
[1] J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot, “Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 5, no. 2, pp. 354–379, April 2012.
[2] B. Somers, G. P. Asner, L. Tits, and P. Coppin, “Endmember variability in spectral mixture analysis: A review,” Remote Sensing of Environment, vol. 115, no. 7, pp. 1603–1616, 2011.
[3] D. E. Sabol, J. B. Adams, and M. O. Smith, “Quantitative subpixel spectral detection of targets in multispectral images,” Journal of Geophysical Research: Planets, vol. 97, no. E2, pp. 2659–2672, 1992.
[4] M. G. Asl and B. Mojaradi, “Virtual dimensionality estimation in hyperspectral imagery based on unsupervised feature selection,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 3, p. 17, 2016.
[5] A. Plaza, P. Martinez, R. Perez, and J. Plaza, “A quantitative and comparative analysis of endmember extraction algorithms from hyperspectral data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 3, pp. 650–663, March 2004.
[6] J. W. Boardman, “Automating spectral unmixing of AVIRIS data using convex geometry concepts,” in Summaries Annual JPL Airborne Geosciences Workshop, 1993, pp. 11–14.
[7] J. C. Harsanyi and C.-I. Chang, “Hyperspectral image classification and dimensionality reduction: An orthogonal subspace projection approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 32, no. 4, pp. 779–785, 1994.
[8] M. E. Winter, “N-FINDR: An algorithm for fast autonomous spectral end-member determination in hyperspectral data,” in Imaging Spectrometry V, vol. 3753. International Society for Optics and Photonics, 1999, pp. 266–276.
[9] J. M. P. Nascimento and J. M. Bioucas-Dias, “Vertex component analysis: A fast algorithm to unmix hyperspectral data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 4, pp. 898–910, 2005.
[10] R. Neville, K. Staenz, T. Szeredi, J. Lefebvre, and P. Hauff, “Automatic endmember extraction from hyperspectral data for mineral exploration,” in Proc. Int. Airborne Remote Sensing Conf. & Exhib., Canadian Symp. on Remote Sensing, 1999.
[11] F. van der Meer, “Iterative spectral unmixing (ISU),” International Journal of Remote Sensing, vol. 20, no. 17, pp. 3431–3436, 1999.
[12] A. Song, A. Chang, C. Jaewan, C. Seokkeun, and K. Yongil, “Automatic extraction of optimal endmembers from airborne hyperspectral imagery using iterative error analysis (IEA) and spectral discrimination measurements,” Sensors, vol. 15, no. 2, pp. 2593–2613, 2015.
[13] A. Plaza and C.-I. Chang, “Impact of initialization on design of endmember extraction algorithms,” IEEE Transactions on Geoscience and Remote Sensing, vol. 44, no. 11, pp. 3397–3407, 2006.
[14] M. Berman, H. Kiiveri, R. Lagerstrom, A. Ernst, R. Dunne, and J. F. Huntington, “ICE: A statistical approach to identifying endmembers in hyperspectral images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 10, pp. 2085–2095, 2004.
[15] A. Zare and P. Gader, “Sparsity promoting iterated constrained endmember detection in hyperspectral imagery,” IEEE Geoscience and Remote Sensing Letters, vol. 4, no. 3, pp. 446–450, 2007.
[16] F. van der Meer and X. Jia, “Collinearity and orthogonality of endmembers in linear spectral unmixing,” International Journal of Applied Earth Observation and Geoinformation, vol. 18, pp. 491–503, 2012.
[17] J. M. Bioucas-Dias and J. M. Nascimento, “Hyperspectral subspace identification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 46, no. 8, pp. 2435–2445, 2008.
[18] [Online]. Available: http://www.lx.it.pt/~bioucas/code/cuprite_ref.zip
[19] “Hyperspectral remote sensing scenes.” [Online]. Available: http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes
[20] I. Gerg and D. Kun, “Matlab hyper spectral toolbox.” [Online]. Available: https://github.com/davidkun/HyperSpectralToolbox
[21] A. Plaza, G. Martín, J. Plaza, M. Zortea, and S. Sánchez, “Recent developments in endmember extraction and spectral unmixing,” in Optical Remote Sensing. Springer, 2011, pp. 235–267.
[22] “Matlab toolbox for linear unmixing with the interior point least squares algorithm.” [Online]. Available: https://www.researchgate.net/publication/268743204_Matlab_toolbox_for_linear_unmixing_with_the_Interior_Point_Least_Squares_algorithm
[23] E. Chouzenoux, M. Legendre, S. Moussaoui, and J. Idier, “Fast constrained least squares spectral unmixing using primal-dual interior-point optimization,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 1, pp. 59–69, 2014.