Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images

by   Przemysław Głomb, et al.

In the small target detection problem a pattern to be located is on the order of magnitude less numerous than other patterns present in the dataset. This applies both to the case of supervised detection, where the known template is expected to match in just a few areas and unsupervised anomaly detection, as anomalies are rare by definition. This problem is frequently related to the imaging applications, i.e. detection within the scene acquired by a camera. To maximize available data about the scene, hyperspectral cameras are used; at each pixel, they record spectral data in hundreds of narrow bands. The typical feature of hyperspectral imaging is that characteristic properties of target materials are visible in the small number of bands, where light of certain wavelength interacts with characteristic molecules. A target-independent band selection method based on statistical principles is a versatile tool for solving this problem in different practical applications. Combination of a regular background and a rare standing out anomaly will produce a distortion in the joint distribution of hyperspectral pixels. Higher Order Cumulants Tensors are a natural `window' into this distribution, allowing to measure properties and suggest candidate bands for removal. While there have been attempts at producing band selection algorithms based on the 3 rd cumulant's tensor i.e. the joint skewness, the literature lacks a systematic analysis of how the order of the cumulant tensor used affects effectiveness of band selection in detection applications. In this paper we present an analysis of a general algorithm for band selection based on higher order cumulants. We discuss its usability related to the observed breaking points in performance, depending both on method order and the desired number of bands. Finally we perform experiments and evaluate these methods in a hyperspectral detection scenario.



page 12


Unsupervised Band Selection of Hyperspectral Images via Multi-dictionary Sparse Representation

Hyperspectral images have far more spectral bands than ordinary multispe...

Hyperspectral Band Selection for Multispectral Image Classification with Convolutional Networks

In recent years, Hyperspectral Imaging (HSI) has become a powerful sourc...

An automatic bad band preremoval algorithm for hyperspectral imagery

For most hyperspectral remote sensing applications, removing bad bands, ...

Spectral band selection for vegetation properties retrieval using Gaussian processes regression

With current and upcoming imaging spectrometers, automated band analysis...

Band selection in RKHS for fast nonlinear unmixing of hyperspectral images

The profusion of spectral bands generated by the acquisition process of ...

BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image

Hyperspectral image (HSI) consists of hundreds of continuous narrow band...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Hyperspectral imaging (HSI) systems simultaneously capture hundreds of narrow spectral channels, usually in the Visual-Near Infrared (VNIR, 400-1000nm) or Short Wave Infrared (SWIR, 1000-2500nm) regions of the electromagnetic spectrum. At each pixel the hyperspectral camera produces an approximation of a continuous spectrum. This data carries information about materials presents in the scene [14], as individual intensities result from interaction of light photons of given energy with different molecules. HSI has many potential practical uses including remote sensing of vegetation [42], aiding in art conservation [15], cultural heritage analysis [6], forgery detection [36] or gunpowder residue detection [16].

Among many hyperspectral applications [2], a promising one is the target detection. Classic approaches to hyperspectral target detection [32] include derivations of RX or SVDD detectors for unsupervised case and spectral matched filter (SMF) or subspace projections in the supervised case. In the latter case, the methods based on spectral angle (e.g. Spectral Angle Mapper) are also widely used and effective [34]

. These methods are often supported by algorithms from many domains, e.g., basic detectors are often used as a part of a more complicated algorithm that includes other Machine Learning approaches

[39], dedicated preprocessing and data-window schemes [25]. Another group of methods is based on transforming the data representation, either by modelling the pixel-neighbours relation [26]

or by a sparse vector referring to the background spectra dictionary

[47]. Hyperspectral band selection is also a common extension to existing methods, applied as preprocessing before classic detectors [40].

While hyperspectral images are rich in features, their processing is challenging because of the volume of each image (usually thousands times larger than a corresponding RGB image) and the difficulty of obtaining ground truth data, which requires expert knowledge and sometimes on-site verification[14]. Due to huge volume and high correlation among the neighbouring bands, dimensionality reduction is often applied as a preprocessing step to discard the redundant information in HSI data [19]

. A common approach is to apply feature extraction methods e.g. the Principal Components Analysis (PCA)

[19] in order to transform and reduce the dimensionality of the data. However, obtained features are not directly related to the original spectral channels, therefore an alternative approach is band selection [18], which selects a subset of spectral channels that well represent the image according to some criterion. The band selection has an advantage of preserving original information about the spectrum [4]. This is important e.g. in mineral analysis [3] where spectral signatures of minerals are expected to be associated with certain parts of the spectrum.

Band selection is an important component of many HSI processing methods, not limited to detection. Depending on whether the training data is available, band selection methods can be divided into supervised approaches that select bands based on training examples [27] and unsupervised methods [45]. While supervised methods can obtain more discriminative features, dependence on training examples may lead to instability of the solution, therefore unsupervised band selection may be more robust [18]. Unsupervised band selection approaches can be divided into clustering-based [29][45] and ranking based [4][18]. Clustering-based methods partition bands into disjoint groups (clusters) such that bands in the same cluster are similar to each other and dissimilar to the rest, then the most representative band from each cluster is selected [45]. One particular motivation for this approach may be to overcome the complexity in calculating the joint distribution in high-dimensional spaces [29]. Ranking-based methods assess the importance of each band according to a certain criterion, such as non-Gaussianity [4] or volume-gradient [13]. Recently, methods that try to combine ranking-based and clustering approaches were also proposed [18].

Modelling the data distribution with multivariate Gaussian distribution has many successful application in pattern recognition, e.g., face 

[43] and gesture [11]

classification or feature selection 

[38]. Hence a number of existing classification and target detection algorithms are based on the multivariate Gaussian model, however in many cases this model does not represent the statistical behaviour of hyperspectral data [1]. This motivates the use of non-Gaussian model for HSI data processing [20]. It is also well-known that data can have a non-Gaussian joint distribution despite Gaussian marginals [9]. However, while joint distribution is important for HSI analysis, it is difficult to estimate [29], therefore approaches based on copulas are employed instead [46].

In our work we use cumulants of multivariate data for band selection. Those cumulants, as discussed in the introduction of [10], are a ‘window’ to the joint distribution of multivariate data. The second cumulant of multivariate data is simply its covariance matrix. It can be used in a feature selection procedure e.g. by recurrently removing least informative features in such way that the volume of the information hyper-ellipsoid of the covariance matrix of remaining features is maximised. This leads to the Maximum Ellipsoid Volume (MEV) feature selection [38]

. However, since the MEV is based on the multivariate Gaussian distribution of data, parametrised by the covariance matrix, it is not sensitive on non-Gaussian features such as skewness or kurtosis 

[13]. Higher Order Multivariate Cumulants (HOMC) are represented in our work by tensors of order (-dimensional arrays) [23]. These tensors have an important property: when data has a multivariate Gaussian distribution, every element of a cumulant tensor of order equals zero [21, 28]. This property can be utilised when searching for non-Gaussian distributed bands, and leads to creation of the Joint Skewness Band Selection (JSBS) [12]

procedure, where a Higher Order Singular Value Decomposition (HOSVD) 

[7] of a -order cumulant’s tensor is used to create a measure of non-Gaussianity and select the most informative bands.

However, we have observed that the higher the cumulant’s tensor order is, the more it is sensitive to tails in multivariate distribution, where outliers may appear. This is due to the fact, that while estimating th order cumulant we have terms proportional to the difference of the sample from the mean raised to the th power. This problem was discussed in [31] where multivariate financial data analysis is performed and this type of outlier is associated with an anomalous situation (e.g. a financial crisis). The observation regarding the effect of HOMC motivates a proposal of a family of methods in which a -order cumulant’s tensor is used to extract information about the non-Gaussian joint distribution of features from hyperspectral data. In addition, we also propose a general method of normalisation that reproduces the module of asymmetry or kurtosis in an univariate case. Further as discussed in [12] for the particular rd cumulant case such normalisation reduces cross-correlation between chosen features. This effect have appeared as well in a general th cumulant case during our analysis.

In this paper we perform an experimental evaluation of band selection methods based on HOMC for small target detection in hyperspectral images. Specifically, we make the following contributions:

  1. We apply Joint Kurtosis Feature Selection (JKFS) [9] to the problem of band selection and show that it can be effectively applied for hyperspectral small target detection.

  2. We introduce a new method of band selection, called Joint Hyper Skewness Feature Selection (JHSFS) that is the extension of JKFS. We discuss its properties and show that in some hyperspectral detection scenarios, the proposed method can outperform both JSBS and JKFS.

  3. We propose a uniform derivation of -order cumulant-based band selection methods, that derives JSBS (order ), JKFS (order ) and JHSFS (order ) as special cases and can be extended to orders . We also discuss both advantages and disadvantages of HOMC-based methods.

  4. We present a comparison of performance evaluation for cumulant-based methods on real-life hyperspectral data. In addition, we make an important observation that the performance of band selection methods based on HOMC depends on both the method’s order and the desired dimensionality of the feature space (i.e., the number of selected bands) and can drop sharply for small number of bands. Hence every HOMC-based method of band selection has a minimum required number of selected bands. The existence of this phenomena, called ‘breaking points’, is shown in our experimental evaluation. We propose an explanation by providing a hypothesis about its relation to the number of off-diagonal elements in -order tensor.

2 Band selection using Higher Order Multivariate Cumulants

In this section we focus on the formal introduction of feature selection based on Higher Order Multivariate Cumulants (HOMC). As stated in the introduction, we focus on the information carried by non-Gaussian features.

2.1 Preliminaries

We use the bold uppercase notation for a matrix (e.g. ) and an uppercase for its column vector (e.g. the th vector of matrix is ). We use the lowercase notation for an element of a matrix or a tensor (e.g. ). We introduce tensors with the following definition.

Definition 2.1.

Following [23], a -mode tensor


is a multidimensional array with elements indexed by the multi-index , where .

Following the Chapter in [23], a tensor unfolding or matricisation is a transformation of a tensor into a matrix by integrating modes of the tensor into one mode of the new matrix, and is formally defined as follows:

Definition 2.2.

Following [23], the -mode unfolding of a tensor into a matrix


is defined element-wisely as:




2.2 HOMC for hyperspectral data

The hyperspectral image data can be represented in the form of a -mode tensor:


where first two modes correspond to spatial dimensions of pixels while last mode correspond to spectral channels. In this paper we are searching for spectral channels that carry relevant information about small targets i.e. they are present in small fraction of pixels. Following [12] we consider spectral channels as marginals of a multivariate variable and reflectance values recorded at each given pixel as a realisation of such variable. Since we are analysing multivariate statistics of data, we are ignoring spatial information in data (pixel positions). Therefore for clarity we unfold hyper-spectral data, in accordance with the Definition 2.2, into the matrix form, where rows indicate realisations (hyperspectral pixels) and columns indicate marginals (spectral channels), as:


where . Consistently with notation introduced in [10], a column vector of all realisations for th spectral channel is . A conceptually simple approach to search for a sub-set of non-Gaussian marginals would be to test each univariate , indexed by , for normality. However as discussed in [12], such approach is oversimplified since it does not take into account multi-variate non-Gaussianity. This can be explained using a reference to a copula approach [33], because data can have Gaussian marginal distributions and non-Gaussian copula (non-Gaussian cross correlation between marginals). This problem is described in detail in  [9].

A HOMC can be represented by super-symmetric tensors [37], defined as follows:

Definition 2.3.

The -mode tensor of size is super-symmetric, if values of its elements are invariant under any permutation inside its multi-index i.e.


Referring to [10] we denote the super-symmetric tensor as .

Remark 2.1.

For the super-symmetric tensor all unfoldings (see the Definition 2.2) are equivalent, i.e.


since a multi-index of a super-symmetric tensor can be permutated without changing a value of indexed element.

Definition 2.4.

Given data matrix with elements where indexes realisations (hyperspectral pixels) and marginals (spectral channels), we define the following estimators of expected values [10]:


Analogically centred expected values [10] are defined as:


We assume that the number of realisations is relatively large, therefore the bias of the estimator is negligible. In addition, we note that expected values are invariant to any permutation of realisations as long as it is performed simultaneously for all marginals.

Referring to [10], a th order cumulant of -variate data is a super-symmetric tensor and in particular:

  1. with elements ,

  2. with elements ,

  3. with elements ,

  4. with elements

  5. with elements


Formulas for cumulants of order are out of scope of this paper, for their definition please refer to to [10].

Remark 2.2.

Let be the cumulant’s tensor of order with elements . Such tensor can be analysed by dividing it into the following disjoint areas:

  1. diagonal elements , where all elements of the multi index are the same, i.e., , such area corresponds to univariate cumulants of marginals (wavelength in our case) numerate by ;

  2. off diagonal elements , where all multi index elements are distinct i.e., ;

  3. partially diagonal elements that are neither diagonal nor off-diagonal i.e., .

We make an important observation that off-diagonal elements of -order cumulant’s tensor are tied to a mutual cross-correlation of marginals (spectral channels), therefore they give a new type of information for hyperspectral data analysis. As an example, consider the th cumulant. If its off diagonal elements fulfil then according to the Equation (11), the th

central moment

can be expressed in terms of covariances (nd central moments) and carries no additional information. Hence considering mutual cross-correlation of marginals would give no additional information. This is not the case if , since we obtain additional information from mutual cross-correlation of marginals.

In a case of the rd cumulant, off diagonal elements of measure a specific type of mutual asymmetry of distinct marginals. Analogically in the case of the th cumulant (see Equation (12)) off diagonal elements are much more complicated. Nevertheless one can argue that they measure information that is not measured by nd and rd cumulants.

2.3 Applications of HOMC to band selection

Recall the standard Maximum Ellipsoid Volume (MEV) method of feature selection [38]

where at each iteration step, one marginal variable is removed in such a way that the determinant of the covariance matrix (second cumulant) is maximised. Such determinant is a product of eigenvalues, hence proportional to the volume of the information ellipsoid. Inspired by this approach, authors of

[12] introduced a method called JSBS (Joint Skewness Band Selection). They argued that by analogy, the determinant of the following matrix:


measures the information extracted by the rd cumulant tensor – . Note that the Eq. (13) uses the first mode unfolding, because by the Remark 2.1, the super-symmetric tensor unfolding is mode invariant. Based on this assumption, they introduced the target function:


Band selection is then performed, analogically to MEV algorithm, by iteratively removing marginal variables in such way that in every iteration a target function is maximised. The denominator of the Eq. (14) is a normalisation that, according to [12], reduces the risk of selecting highly correlated marginals.

However, as mentioned in the introduction, we observed that the higher the cumulant’s tensor order is, the more it is sensitive to tails in multivariate distribution, where outliers may appear. In addition, there exist datasets for which th cumulant tensor generalisation, called Joint Kurtosis Feature Selection (JKFS)  [9] is more effective that JSBS. This motivates an introduction of a general, -cumulant based method.

We define a -order dependency matrix as


We use this matrix to define the following target function:


In the case of our methods simplifies to JSBS, while in the case of it simplifies to JKFS. As a th cumulant is called hyper-skewness, in the case of we call our method the Joint Hyper Skewness Feature Selection (JHSFS).

The power term in the denominator provides the scale-invariant normalisation. Suppose we have data and rescale all realisations of all marginals by a factor  i.e.,  Referring to the general formula for the th order multivariate cumulant in [10], and Equation (15), we have:


Another argument for such normalisation is that if is a single column of realisations of one marginal, it is easy to show that would be an absolute value of its asymmetry while would be an absolute value of its kurtosis. Finally, as argued in [12]

, such normalisation can decrease the probability of choosing two highly cross correlated features.

While the use of HOMC is sensitive to outliers located in tails of multi-variate distribution, it comes at a cost of higher estimation error. The theoretical discussion regarding this estimation errors is provided in the Appendix  in [10].

2.4 Generalised band selection with HOMC

To define a generalised band selection with HOMC we first introduce a fiber cut of the super-symmetric tensor in the Definition 2.5. The proposed band selection method is then presented in the form of the Algorithm 1.

Definition 2.5.

Let be the super-symmetric tensor. Following [12], we define its th fibres cut, by the following tensor , where:


and . Using notation from [23] we remove all th fibres of . We note that such transformation preserves super-symmetry.

1:Input: covariance , cumulant tensor , target function , retained number of bands
2:Output: a subset (index) of bands that carry meaningful information.
3:function features select()
4:     for  to  do
5:         for  do
7:         end for
8:         set           remove band
11:     end for
12:     return remaining bands
13:end function
Algorithm 1 Generalised, HOMC-based band selection algorithm

2.5 Theoretical limits for band selection with HOMC.

Band selection procedure presented in the Algorithm 1 is parametrised by a stop condition indicating the number of retained features. It can be argued that this stop condition parameter coincides with the used cumulants’ order parameter . For the typical experimental setting we start with features.

The features elimination method we introduced first discards Gaussian distributed spectral channels that contain either a pure noise or a background Gaussian distributed marginals. For the subset of such marginals all HOMC are zero. At this stage our elimination method removes the same bands regardless of cumulant’s order.

Figure 1: The ratio of off diagonal to all elements in a -mode cumulant’s tensor as a function of feature vector size . The dashed line marks the ratio, below which the impact of off diagonal elements is small (see Section 2.5).

Recalling Remark 2.2 there are three areas of the cumulant’s tensor: diagonal, partially diagonal and off diagonal one. Following discussion in Section 2.2 we are particularly interested in an off diagonal one since there an information about mutual cross-correlation of marginals may appear. This makes cumulants of higher order especially interesting. The fraction of off diagonal elements in is presented in Figure 1. This fraction drops sharply both with and . As a rule of thumb, we approximate that the impact of the off diagonal area in an algorithm is small if its size (off diagonal elements numbers) is smaller that of the tensor size. Referring to the Figure 1 it appears if for , for , for and for . Hence we expect those to be lower limits of features selection usability.

Consider at this point, that for large we may have large estimation error of HOMC especially dealing with the moderate sample size i.e., . Recall that an estimation error of high order statistics grows rapidly with their order , see Appendix in [10]. For this reason despite good detectability of the JSBS [12] and the JKFS [9] for such datasets we may expect reduced performance for orders .

2.6 Hyperspectral small target detection with HOMC

The detection proceeds in two stages. First the band selection is performed using the Algorithm 1 or the MEV, hence reducing the image to  image, then a detector is applied to the selected bands to mark discovered targets.

For the detector we use the Spectral Angle Mapper (SAM) [24]. The SAM similarity score is computed between the elements of data matrix and the target signature matrix with the elements by applying the equation


This detector has a number of advantages: it is simple to compute and has a straightforward explanation (it is sensitive to the angle between -dimensional spectral vector, robust to e.g. illumination changes); it does not require to estimate additional parameters (besides a detection threshold); it does not require assumptions on data distributions (e.g. a normality assumption, present in some methods, which is rarely satisfied for real-life hyperspectral dataset); it has been well examined in many case studies, e.g. [30, 44, 17, 35].

The detection threshold, that is the only parameter of the detector, is required for evaluation of the boolean expression which decides whether a given spectral vector

is classified as a target or not. Given a detector described by the Receiver Operating Characteristic (ROC) curve, the parameter

controls the balance between expected true and false positive rates. This value can be optimized e.g. by cross-validation or by analysis of distribution. We note that for comparison of the effect of different orders this parameter is not required, as whole ROC curves can be used for this task.

3 Experimental results and discussion

This section presents an experimental evaluation of band selection methods based on Higher Order Multivariate Cumulants (HOMC) discussed in previous sections. The experiments are performed for three methods of band selection based on HOMC: JSBS (order ), JKFS (order ), and JHSFS (order ). As a reference, detection results for band selection with Maximum Ellipsoid Volume (MEV), and without band selection are also provided. Results of experiments are presented in the form of Receiver Operating Characteristic (ROC) curves and the performance of the detector is measured using the Area Under Curve (AUC) measure. Each ROC curve represents the whole range of the detector parameter , for a set number of retained bands , selected with the Algorithm 1.

3.1 Hyperspectral dataset

(a) The Cuprite image (visualized with bands , and as R, G, B to enhance the surface mineral composition)
(b) Buddigntonite ground truth mask based on the maps from [41]
Figure 2: Cuprite hyperspectral dataset and the ground truth used in detection experiments.

In order to present the detailed examination of the performance of the proposed method, we use the Cuprite111Available online at hyperspectral dataset. The image presents a mining area around Cuprite in Las Vegas, NV, U.S. The site was imaged with AVIRIS sensor and has 224 spectral channels in range 370 nm to 2480 nm. In conformance with a standard procedure, noisy and water absorption bands 1–3, 104–113, 148–167, and 221–224 were removed from the image. To reduce computational complexity, we consider last 50 bands that contain the spectral range of interest [8, 5]. The site has been a subject of many experiments, and its geology was mapped in detail [41].

In order to compare our results with these provided in [12], we focus on Buddigntonite deposit detection, which in Cuprite image has known local surface presence around the area nicknamed the ‘Buddigntonite bump’. Based on the [41], a ground truth map was prepared (see Figure 2). To achieve independence of the target spectrum from the image, we use the corresponding reference spectrum222s07_AV97_Buddingtnt+Na-Mont_CU93-260B_NIC4b_RREF from the USGS Spectral Library [22].

Figure 3: Results of Buddigntonite deposit detection in the Curprite dataset compared to the scenario when no band selection is used (black line). Panel (a) presents an Area Under Curve (AUC) as a function of the number of selected bands for three methods of band selection based on HOMC (JSBS, JKFS, JHSFS) and for band selection with the MEV algorithm. Beginning with values of bands left, detection with HOMC-based methods outperforms the detection without band selection. The magnification of the vertical axis for this area is presented in the Panel (b). Results illustrate the presence of ‘breaking points’, discussed in the Section 2.5.
Figure 4: An illustration of breakdowns in accuracy for band selection methods based on HOMC when the desired number of bands becomes too low. Panel (a) presents the scenario when the number of selected band is sufficient for all methods. Panel (b) presents the breakdown in accuracy for JHSFS method (order ). Panel (c) presents the breakdown in accuracy for JKFS method (order ). Panel (d) presents the breakdown in accuracy for JSBS method (order ). Observe that each ROC curve loses its convexity at the breakdown point.
Figure 5: An illustration of superior performance of the JSHFS if very low false positive probability is required.

3.2 Results and discussion

The performance of target detector using evaluated methods of band selection is presented in the Figure 3. The detector using HOMC-based methods outperforms the detector based on MEV and for the number of bands greater then , it also usually outperforms the detector that does not use band selection. Note that if a band selection method achieves the same score as detector applied on all bands it can already be viewed as a success, as we reduce the amount of working data. The improved result is thus an additional advantage.

From an application point of view, the best performance is achieved by the (JKFS) method proposed in this work with ; it achieves highest score at with the lowest number of retained bands, thus maximizing the detection performance while minimizing the data volume. As discussed in the Section 2.5, for large values of the parameter all methods behave in a similar fashion. However, for small values of the parameter we can observe sharp breakdowns in the performance of each method: at bands for the order (JSBS), at bands for the order (JKFS) and at bands for the order (JHSFS). A more detailed illustration of these breakdowns is provided in Fig. 4. Note that experimental results are consistent with the theoretical lower limit of the parameter, discussed in Section 2.5. As presented in the Fig. 5, when the desired level of false positive rate (FPR) is low, the JHSFS algorithm significantly outperforms other methods, which corresponds to the assumption that it is more sensitive to outliers.

Methods of band selection based on HOMC are a promising approach to hyperspectral small target detection. Our results show that they allow to select a small subset of relevant bands that maintains or improves the performance of the detector while reducing the dimensionality of the data by up to . They also significantly outperform standard methods of band selection such as MEV.

An interesting observation regards the performance drop of evaluated methods, when the number of selected bands is too low (the existence of ‘breaking points’). This means that HOMC-based methods have limits to achievable dimensionality reduction. These breakdowns in performance correspond to a phase transition-like behaviour, as presented in the Figure 

4, which corresponds to the changes in shape of the ROC curve from convex to non-convex.

Results presented in the Figure 3, Panel (b) indicate that while HOMC-based methods can achieve superior performance for small number of bands, some variability of performance can be observed. This may be caused by estimation error of higher order statistics. However, we observe that the performance of a reference approach using MEV is unstable in a significantly larger range.

As presented in the Fig. 5, HOMC are especially effective when the desired level of false positive rate (FPR) is low. While this sensitivity seems to be correlated with the cumulant order, we decided to evaluate only cumulants up to the th order. Since the theoretical lower limit of the number of selected bands for the methods based on th order cumulant is approximately , and the estimation error is higher than for JSHFS. We expect that improvement for HOMC can be significant for larger hyperspectral images with very high number of pixels. This requires only a large amount of collected data (which can be performed automatically, e.g. with satellite passes) and does not require any manual annotation, which can be an expensive process. What is also important, computation of HOMC for large datasets can be effectively implemented [10], which gives the potential to implement HOMC-based band selection methods for very large images.

4 Conclusions

Band selection based on Higher Order Multivariate Cumulants (HOMC) seems to be an effective and promising approach for small target detection in hyperspectral images. We have observed that HOMC are more sensitive to tails in multivariate distribution. In our opinion, this property corresponds has a major impact on their superior performance. On the other hand, HOMC-based methods may have high estimation error, which may cause some instability when the number of desired bands is low.

In this work we proposed a uniform derivation of -order cumulant-based methods. Methods such as JSBS and JKFS are special cases of the proposed derivation. We also proposed a new method of band selection, called Joint Hyper Skewness Feature Selection (JHSFS). We have shown that in scenarios when the desired False Positive Rate (FPR) is low, the JHSFS can outperform both the JSBS and JKFS.

During the experimental evaluation we have observed that the performance of hyperspectral target detection using HOMC-based band selection drops sharply when the desired number of bands is too low. We noticed that these ‘breaking points’ are correlated with the percentage of off diagonal elements in a -mode cumulant tensor. Hence each of HOMC-based methods has a minimum required number of selected bands. We provide a theoretical justification for this fact, by observing that off diagonal elements of -order cumulant’s tensor measure a mutual correlation of th feature.


This work was partially supported by the National Science Centre, Poland— project number 2014/15/B/ST6/05204. Authors would like to thank Adam Glos for his assistance in implementation of the described method.


  • [1] Nicola Acito, Giovanni Corsini, and Marco Diani. Statistical analysis of hyper-spectral data: A non-gaussian approach. EURASIP Journal on Applied Signal Processing, 2007(1):13–13, 2007.
  • [2] José M Bioucas-Dias, Antonio Plaza, Gustavo Camps-Valls, Paul Scheunders, Nasser Nasrabadi, and Jocelyn Chanussot. Hyperspectral remote sensing data analysis and future challenges. IEEE Geoscience and remote sensing magazine, 1(2):6–36, 2013.
  • [3] Adrian J Brown, Brad Sutter, and Stephen Dunagan. The marte vnir imaging spectrometer experiment: Design and analysis. Astrobiology, 8(5):1001–1011, 2008.
  • [4] Chein-I Chang and Su Wang. Constrained band selection for hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 44(6):1575–1585, 2006.
  • [5] Roger N. Clark, Gregg A. Swayze, K. Eric Livo, Raymond F. Kokaly, Steve J. Sutley, J. Brad Dalton, Robert R. McDougal, and Carol A. Gent. Imaging spectroscopy: Earth and planetary remote sensing with the usgs tetracorder and expert systems. Journal of Geophysical Research: Planets, 108(E12), 2003.
  • [6] Costanza Cucci, John K Delaney, and Marcello Picollo. Reflectance hyperspectral imaging for investigation of works of art: old master paintings and illuminated manuscripts. Accounts of chemical research, 49(10):2070–2079, 2016.
  • [7] Lieven De Lathauwer, Bart De Moor, and Joos Vandewalle. A multilinear singular value decomposition. SIAM journal on Matrix Analysis and Applications, 21(4):1253–1278, 2000.
  • [8] N. Dobigeon, S. Moussaoui, M. Coulon, J. Tourneret, and A. O. Hero. Joint bayesian endmember extraction and linear unmixing for hyperspectral imagery. IEEE Transactions on Signal Processing, 57(11):4355–4368, 2009.
  • [9] Krzysztof Domino. The use of fourth order cumulant tensors to detect outlier features modelled by a t-Student copula. arXiv preprint arXiv:1804.00541, 2018.
  • [10] Krzysztof Domino, Łukasz Pawela, and Piotr Gawron. Efficient computation of higer-order cumulant tensors. SIAM J. SCI. COMPUT., 40(3):A1590–A1610, 2018.
  • [11] Piotr Gawron, Przemysław Głomb, Jarosław A. Miszczak, and Zbigniew Puchała. Eigengestures for natural human computer interface. In Tadeusz Czachórski, Stanisław Kozielski, and Urszula Stańczyk, editors, Man-Machine Interactions 2, pages 49–56. Springer Berlin Heidelberg, 2011.
  • [12] Xiurui Geng, Kang Sun, Luyan Ji, Hairong Tang, and Yongchao Zhao. Joint skewness and its application in unsupervised band selection for small target detection. Scientific reports, 5:9915, 2015.
  • [13] Xiurui Geng, Kang Sun, Luyan Ji, and Yongchao Zhao. A fast volume-gradient-based band selection method for hyperspectral image. IEEE Transactions on Geoscience and Remote Sensing, 52(11):7111–7119, 2014.
  • [14] Pedram Ghamisi, Naoto Yokoya, Jun Li, Wenzhi Liao, Sicong Liu, Javier Plaza, Behnood Rasti, and Antonio Plaza. Advances in hyperspectral image and signal processing: A comprehensive overview of the state of the art. IEEE Geoscience and Remote Sensing Magazine, 5(4):37–78, 2017.
  • [15] Bartosz Grabowski, Wojciech Masarczyk, Przemysław Głomb, and Agata Mendys. Automatic pigment identification from hyperspectral data. Journal of Cultural Heritage, 31:1–12, 2018.
  • [16] Przemysław Głomb, Michał Romaszewski, Michał Cholewa, and Krzysztof Domino. Application of hyperspectral imaging and machine learning methods for thedetection of gunshot residue patterns. Forensic Science International, 2018.
  • [17] E. L. Hunter and C. H. Power. An assessment of two classification methods for mapping thames estuary intertidal habitats using casi data. International Journal of Remote Sensing, 23(15):2989–3008, 2002.
  • [18] Sen Jia, Guihua Tang, Jiasong Zhu, and Qingquan Li. A novel ranking-based clustering approach for hyperspectral band selection. IEEE Transactions on Geoscience and Remote Sensing, 54(1):88–102, 2016.
  • [19] L. O. Jimenez-Rodriguez, E. Arzuaga-Cruz, and M. Velez-Reyes.

    Unsupervised linear feature-extraction methods and their effects in the classification of high-dimensional data.

    IEEE Transactions on Geoscience and Remote Sensing, 45(2):469–483, Feb 2007.
  • [20] Vedran Kajić, Boris Považay, Boris Hermann, Bernd Hofer, David Marshall, Paul L Rosin, and Wolfgang Drexler. Robust segmentation of intraretinal layers in the normal human fovea using a novel statistical model based on texture and shape analysis. Optics express, 18(14):14730–14744, 2010.
  • [21] Maurice George Kendall et al. The advanced theory of statistics. The advanced theory of statistics., 1946.
  • [22] R.F. Kokaly, R.N. Clark, G.A. Swayze, K.E. Livo, T.M. Hoefen, N.C. Pearson, R.A. Wise, W.M. Benzel, H.A. Lowers, R.L. Driscoll, and A.J. Klein. USGS spectral library version 7. Technical report, U.S. Geological Survey Data Series 1035, 2017.
  • [23] Tamara G Kolda and Brett W Bader. Tensor decompositions and applications. SIAM review, 51(3):455–500, 2009.
  • [24] F.A. Kruse, A.B. Lefkoff, J.W. Boardman, K.B. Heidebrecht, A.T. Shapiro, P.J. Barloon, and A.F.H. Goetz. The spectral image processing system (sips)—interactive visualization and analysis of imaging spectrometer data. Remote Sensing of Environment, 44(2):145 – 163, 1993. Airbone Imaging Spectrometry.
  • [25] Heesung Kwon, Sandor Z Der, and Nasser M Nasrabadi. Adaptive anomaly detection using subspace separation for hyperspectral imagery. Optical Engineering, 42(11):3342–3352, 2003.
  • [26] Wei Li and Qian Du. Collaborative representation for hyperspectral anomaly detection. IEEE Transactions on geoscience and remote sensing, 53(3):1463–1474, 2015.
  • [27] Wei Li, Saurabh Prasad, James E Fowler, and Lori Mann Bruce. Locality-preserving dimensionality reduction and classification for hyperspectral image analysis. IEEE Transactions on Geoscience and Remote Sensing, 50(4):1185–1198, 2012.
  • [28] Eugene Lukacs. Characteristic functions. Charles Griffin and Co., Ltd., London, 1970.
  • [29] Adolfo Martínez-Usó, Filiberto Pla, José Martínez Sotoca, and Pedro García-Sevilla. Clustering-based hyperspectral band selection using information measures. IEEE Transactions on Geoscience and Remote Sensing, 45(12):4158–4171, 2007.
  • [30] F. Van Der Meer, M. Vazquez-Torres, and P. M. Van Dijk. Spectral characterization of ophiolite lithologies in the troodos ophiolite complex of cyprus and its potential in prospecting for massive sulphide deposits. International Journal of Remote Sensing, 18(6):1245–1257, 1997.
  • [31] J-F. Muzy, D. Sornette, J. Delour, and A. Arneodo. Multifractal returns and hierarchical portfolio theory. Quantitative Finance, 1(1):131–148, 2001.
  • [32] Nasser M Nasrabadi. Hyperspectral target detection: An overview of current and future challenges. IEEE Signal Processing Magazine, 31(1):34–44, 2014.
  • [33] Roger B Nelsen. An introduction to copulas. Springer Science & Business Media, 2007.
  • [34] Bruce W Pengra, Carol A Johnston, and Thomas R Loveland. Mapping an invasive plant, phragmites australis, in coastal wetlands using the eo-1 hyperion hyperspectral sensor. Remote Sensing of Environment, 108(1):74–81, 2007.
  • [35] George P. Petropoulos, Krishna Prasad Vadrevu, Gavriil Xanthopoulos, George Karantounias, and Marko Scholze.

    A comparison of spectral angle mapper and artificial neural network classifiers combined with landsat tm imagery analysis for obtaining burnt area mapping.

    Sensors, 10(3):1967–1985, 2010.
  • [36] Oxana Ye Rodionova, Lars P Houmøller, Alexey L Pomerantsev, Paul Geladi, James Burger, Vladimir L Dorofeyev, and Alexander P Arzamastsev. NIR spectrometry for counterfeit drug detection: a feasibility study. Analytica Chimica Acta, 549(1):151–158, 2005.
  • [37] Martin D Schatz, Tze Meng Low, Robert A van de Geijn, and Tamara G Kolda. Exploiting symmetry in tensors for high performance: Multiplication with symmetric tensors. SIAM Journal on Scientific Computing, 36(5):C453–C479, 2014.
  • [38] Charles Sheffield. Selecting band combinations from multispectral data. Photogrammetric Engineering and Remote Sensing, 51:681–687, 1985.
  • [39] Zhenwei Shi, Xinran Yu, Zhiguo Jiang, and Bo Li. Ship detection in high-resolution optical imagery based on anomaly detector and local shape feature. IEEE Transactions on Geoscience and Remote Sensing, 52(8):4511–4523, 2014.
  • [40] Kang Sun, Xiurui Geng, and Luyan Ji. A new sparsity-based band selection method for target detection of hyperspectral image. IEEE Geoscience and Remote Sensing Letters, 12(2):329–333, 2015.
  • [41] G.A. Swayze, R.N. Clark, A.F.H. Goetz, K.E. Livo, G.N. Breit, F.A. Kruse, S.J. Stutley, L.W. Snee, H.A. Lowers, J.L. Post, R.E. Stoffregen, and R.P. Ashley. Mapping advanced argillic alteration at Cuprite, Nevada using imaging spectroscopy. Economic Geology, 109(5):1179–1221, 2014.
  • [42] Prasad S Thenkabail and John G Lyon. Hyperspectral remote sensing of vegetation. CRC Press, 2016.
  • [43] Matthew Turk and Alex Pentland. Eigenfaces for recognition. Journal of cognitive neuroscience, 3(1):71–86, 1991.
  • [44] H. Yao, Z. Hruska, R. Kincaid, A. Ononye, R. L. Brown, and T. E. Cleveland. Spectral angle mapper classification of fluorescence hyperspectral image for aflatoxin contaminated corn. In 2010 2nd Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, pages 1–4, 2010.
  • [45] Yuan Yuan, Jianzhe Lin, and Qi Wang. Dual-clustering-based hyperspectral band selection by contextual analysis. IEEE Transactions on Geoscience and Remote Sensing, 54(3):1431–1445, 2016.
  • [46] Xuexing Zeng and Tariq S Durrani. Band selection for hyperspectral images using copulas-based mutual information. In Statistical Signal Processing, 2009. SSP’09. IEEE/SP 15th Workshop on, pages 341–344. IEEE, 2009.
  • [47] Yuxiang Zhang, Bo Du, and Liangpei Zhang. A sparse representation-based binary hypothesis model for target detection in hyperspectral images. IEEE Transactions on Geoscience and Remote Sensing, 53(3):1346–1354, 2015.