I Introduction
SINCE hyperspectral imagery consists of tens or hundreds of bands with very high spectral resolution, it has drawn increasing attention in various applications over the past decades, such as spectral unmixing[1, 2], classification[3, 4] and target detection[5, 6]. However, the high spectral dimensionality with strong interband correlations also results in information redundancy and a heavy computational burden in data processing[7].
Therefore, dimensionality reduction (DR) has become one of the most important techniques for addressing these problems. DR methods can be categorized into two classes: feature extraction and feature selection. In this paper, we mainly focus on the former.
The most commonly used feature extraction algorithm is principal component analysis (PCA)[8]
, which seeks the projection direction that maximizes the variance. Its solution corresponds to the eigenvectors of the image's covariance matrix. Several techniques originating from PCA have been developed, such as kernel PCA (KPCA)
[9] and maximum noise fraction (MNF)[10]. KPCA is a nonlinear extension of PCA, which maps the data into a higher dimensional space via a mapping function and then performs PCA there. MNF is another popular feature extraction method, which takes image quality into account and selects the signal-to-noise ratio (SNR) as the measure index. The methods mentioned above mainly focus on the second-order statistical characteristics of the data. However, the distribution of many real data sets is usually non-Gaussian. Therefore, these methods may perform poorly and cannot reveal the intrinsic structure of the data. In this case, many methods based on higher-order statistics have received more attention in recent years and have been applied in many remote sensing fields, including anomaly detection
[11, 12], endmember extraction[13] and target detection[14, 15]. Independent component analysis (ICA) is one of the most successful feature extraction techniques. It originated from the blind source separation (BSS) problem and attempts to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible
[16]. Widely used algorithms include joint approximate diagonalization of eigenmatrices (JADE)[17] and fast independent component analysis (FastICA)[18]. JADE utilizes the fourth-order cumulant tensor of the data, and several algorithms originating from JADE have since been developed; a third-order analogue, skewness-based ICA via eigenvectors of cumulant operator (EcoICA), was proposed in [19]. Other joint diagonalization methods include the subspace fitting method (SSF)[20], distinctively normalized joint diagonalization (DNJD)[21] and alternating columns-diagonal centers (ACDC)[22]. FastICA can select skewness, negentropy or other indices as the non-Gaussianity measure. It achieves a cubic convergence rate and outperforms most of the other commonly used ICA algorithms[23]
. However, it requires all the pixels to be involved in each iteration when searching for the optimal projection direction, which is quite time-consuming, especially for high-dimensional data. To solve this problem, Geng et al. proposed an efficient method called principal skewness analysis (PSA)
[24]. PSA can be viewed as a third-order generalization of PCA; meanwhile, it is equivalent to FastICA when skewness is selected as the non-Gaussianity index. Following this work, a momentum version (MPSA) that alleviates the oscillation phenomenon of PSA [25] and a natural fourth-order extension, i.e., principal kurtosis analysis (PKA)
[26], have also been analyzed. The solution of these PSA-derived methods can be transformed into the problem of calculating the eigenvalues and corresponding eigenvectors of a tensor, which is similar to PCA. By adopting a fixed-point scheme, the solution can be obtained iteratively. To prevent the solution from converging to one that has been determined previously, an orthogonal complement operator is also introduced in these methods. Thus, all of these methods eventually obtain an orthogonal transformation matrix, since the search space of the eigenvector to be determined is always restricted to the orthogonal complement space of the previous ones. However, theoretical analysis based on multilinear algebra has shown that the eigenvectors of a supersymmetric tensor are not inherently orthogonal in general [27, 28], in contrast to the situation for real symmetric matrices. The orthogonal constraint in PSA and the inherent non-orthogonality of the eigenvectors of a supersymmetric tensor are therefore a pair of irreconcilable contradictions. In this paper, we propose a more relaxed constraint and, based on it, present a new algorithm named non-orthogonal principal skewness analysis (NPSA) to address this problem. NPSA is expected to have the following two attributes: 1) similar to PSA, it prevents the eigenvector to be determined from converging to the eigenvectors determined previously; 2) meanwhile, it obtains a more accurate solution than that of PSA.
The rest of this paper is organized as follows. In Section II, we briefly review the original PSA algorithm and analyze its deficiencies with a simple example. In Section III, we first present the new search strategy in NPSA and then derive an improved version that takes the complexity into consideration; the two strategies are compared at the end of that section. In Section IV, we give a theoretical analysis to justify the validity of the algorithm. Experimental results are given in Section V, and conclusions are drawn in Section VI.
II Background
In this section, we first introduce some notations, basic definitions and important properties used throughout this paper, and then give a brief review of the formulation and deficiencies of PSA.
II-A Preliminaries
Following [29], in this paper, higher-order tensors are denoted by boldface Euler script letters. The order of a tensor is the number of its dimensions, also called its way or mode: a first-order tensor is a vector, and a second-order tensor is a matrix. Fibers, the higher-order analogue of matrix rows and columns, are defined by fixing every index except one. Slices are two-dimensional sections of a tensor, defined by fixing all except two indices. For a third-order tensor, as shown in Fig. 1, the three different kinds of slices are called horizontal, lateral and frontal slices. A tensor is called supersymmetric if its elements remain invariant under any permutation of the indices[29]. Some important operations used below are the outer product of two vectors, the vectorization operator that reorders the elements of a matrix or a higher-order tensor into a vector (and its inverse), and the mode product of a tensor with a matrix. The range of a matrix and its dimensionality are also used, as is the set of all positive integers.
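As a concrete illustration of these definitions, the following NumPy sketch (the tensor values and the helper name are ours, not from the paper) extracts a fiber and a slice and implements the mode product:

```python
import numpy as np

# A toy third-order tensor of size 3 x 3 x 3 (values are illustrative only).
S = np.arange(27, dtype=float).reshape(3, 3, 3)

frontal_slice = S[:, :, 0]   # a frontal slice: fix all indices except two
mode1_fiber = S[:, 0, 0]     # a fiber: fix every index except one

def mode_n_product(T, M, n):
    """Mode-n product of tensor T with matrix M: multiply every mode-n fiber of T by M."""
    Tn = np.moveaxis(T, n, 0)                      # bring mode n to the front
    out = M @ Tn.reshape(Tn.shape[0], -1)          # multiply along that mode
    return np.moveaxis(out.reshape((M.shape[0],) + Tn.shape[1:]), 0, n)
```

A quick sanity check is that the mode product with an identity matrix leaves the tensor unchanged.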
The Kronecker product of two matrices and is a matrix denoted as , which is defined as
(1) 
For simplicity, we use and to denote the times Kronecker product of the matrix and the vector , respectively.
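As a small numerical illustration (the matrices are chosen by us), the Kronecker product and the repeated Kronecker product of a vector can be computed as:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

K = np.kron(A, B)     # each entry a_ij of A is replaced by the block a_ij * B

# Repeated Kronecker product of a vector with itself (the "3-times" product used later):
u = np.array([1.0, 2.0])
u3 = np.kron(np.kron(u, u), u)   # entries are all products u_i * u_j * u_k
```

A useful fact used later in the paper: if a vector has unit length, so does its repeated Kronecker product with itself, since the norm of a Kronecker product is the product of the norms.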
II-B PSA Algorithm
In PSA, the coskewness tensor, the analogue of the covariance matrix in PCA, is constructed to calculate the skewness of the image in the direction .
Assume that the image data set is , where is a pixel vector and is the number of pixels. The image is first centralized and whitened by
(9) 
where is the mean vector and is the whitened image. is called the whitening operator, where represents the eigenvector matrix of the covariance matrix and is the corresponding eigenvalue diagonal matrix.
Then, the coskewness tensor is calculated by
(10) 
Fig. 2 shows a sketch map of the calculation of . Obviously, is a supersymmetric tensor with a size of .
Then the skewness of an image in any direction can be calculated by
(11) 
where is a unit vector, i.e., .
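To make the construction concrete, the centering, whitening and coskewness-tensor steps can be sketched as follows (a toy sketch on simulated skewed data; all variable names are our own):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.gamma(2.0, size=(5, 1000))   # toy "image": 5 bands x 1000 pixels of skewed data

# Centering and whitening via the eigendecomposition of the covariance matrix.
X = X - X.mean(axis=1, keepdims=True)
d, V = np.linalg.eigh(np.cov(X))
F = np.diag(d ** -0.5) @ V.T         # whitening operator
Z = F @ X                            # whitened image: identity covariance

# Coskewness tensor: average of the third-order outer products of the whitened pixels.
N = Z.shape[1]
S = np.einsum('ai,bi,ci->abc', Z, Z, Z) / N

# Skewness of the image in a direction u (u must be a unit vector):
u = np.ones(5) / np.sqrt(5)
skew_u = np.einsum('abc,a,b,c->', S, u, u, u)
```

The resulting tensor is supersymmetric by construction, and the whitened data has identity covariance, which the assertions below verify.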
So the optimization model can be formulated as
(12) 
Using the Lagrangian method, the problem is equivalent to solving the equation
(13) 
A fixed-point method is performed to calculate for each unit, which can be expressed as follows:
(14) 
If a fixed point exists, the solution is called the first principal skewness direction, and is the skewness of the image in that direction. Equivalently, is also called an eigenvalue/eigenvector pair of the tensor, as introduced by Lim [32] and Qi [33].
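A minimal sketch of this fixed-point scheme (our own implementation of the iteration in (14); the convergence settings are illustrative):

```python
import numpy as np

def first_skewness_direction(S, iters=50, tol=1e-4, seed=0):
    """Fixed-point iteration for the first principal skewness direction of a
    supersymmetric tensor S: u <- normalize(S contracted twice with u)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(S.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(iters):
        v = np.einsum('abc,b,c->a', S, u, u)   # contract S with u along two modes
        v /= np.linalg.norm(v)
        if np.linalg.norm(v - u) < tol:
            u = v
            break
        u = v
    lam = np.einsum('abc,a,b,c->', S, u, u, u)  # skewness in the found direction
    return u, lam

# Toy check: for S = 2 * (e1 o e1 o e1), the iteration recovers e1 with skewness 2.
S = np.zeros((3, 3, 3)); S[0, 0, 0] = 2.0
u, lam = first_skewness_direction(S)
```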
To prevent the second eigenvector from converging to the same solution as the first, the algorithm projects the data into the orthogonal complement space of , which is equivalent to generating a new tensor by calculating the mode product
(15) 
where is the orthogonal complement projection operator of and is the identity matrix.
Then, the same iteration method, i.e., (14), can be applied to the new tensor to obtain the second eigenvector, and the subsequent eigenvectors are obtained in the same manner.
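Putting the iteration and the deflation of (15) together, a compact sketch of the whole PSA loop (our own code, not the authors' implementation) is:

```python
import numpy as np

def psa(S, k, iters=50, tol=1e-4, seed=0):
    """PSA sketch: find k skewness directions; after each one, project the tensor
    into the orthogonal complement of the found direction along every mode."""
    rng = np.random.default_rng(seed)
    L = S.shape[0]
    U = np.zeros((L, k))
    T = S.copy()
    for j in range(k):
        u = rng.standard_normal(L)
        u /= np.linalg.norm(u)
        for _ in range(iters):
            v = np.einsum('abc,b,c->a', T, u, u)
            n = np.linalg.norm(v)
            if n == 0:
                break
            v /= n
            if np.linalg.norm(v - u) < tol:
                u = v
                break
            u = v
        U[:, j] = u
        P = np.eye(L) - np.outer(u, u)                    # orthogonal complement projector
        T = np.einsum('ia,jb,kc,abc->ijk', P, P, P, T)    # mode product with P along every mode
    return U

# Toy tensor with two orthogonal rank-one skewness components:
S = np.zeros((3, 3, 3)); S[0, 0, 0] = 3.0; S[1, 1, 1] = 1.0
U = psa(S, 2)
```

By construction the columns of U come out mutually orthogonal, which is exactly the constraint that NPSA later relaxes.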
II-C Deficiencies of PSA
As mentioned before, an orthogonal complement operator is introduced in PSA to prevent the next eigenvector from converging to the eigenvectors determined previously. As is well known, the eigenvectors of a real symmetric matrix are naturally orthogonal to each other. However, this property may not hold in the higher-order case.
We here present a simple example to illustrate this phenomenon. Consider a supersymmetric tensor , whose two frontal slices are
It can be easily verified that its two eigenvectors are
and their inner product is , which means that they are non-orthogonal. However, the results obtained by the PSA algorithm are
which are orthogonal. It is apparent that deviates from , and there is a distinct angle between them. This error is caused by the orthogonal constraint used in PSA. Therefore, it is significant to obtain more accurate eigenpairs.
III NPSA
III-A New Search Strategy
Here, we first give the new search strategy in NPSA; the next section theoretically illustrates why this strategy can obtain more accurate eigenpairs.
Similar to PSA, the first eigenvector can be obtained according to (14). The subsequent steps are presented as follows:
(1): vectorize the tensor into a vector. Usually, the vectorization of a third-order tensor is defined as the vertical arrangement of the column vectorizations of its frontal slices [31], i.e., , where is the vector generated by the th frontal slice , i.e., .
(2): compute a new vector via the times Kronecker product of the vector , denoted by .
(3): compute the orthogonal complement projection matrix of , which can be expressed as
(16) 
where is the 3-times Kronecker product of the matrix in (15) and is an identity matrix.
(4): multiply with the vectorized tensor from step (1) and then perform the inverse operation to obtain a new tensor. For conciseness and without causing ambiguity, we still denote the updated tensor by . Thus we have
(17) 
Then we can obtain the second eigenvector by applying the fixed-point scheme, i.e., (14), to the updated tensor, and we repeat the above process until all or a preset number of eigenpairs are obtained.
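The four steps above can be sketched directly (our own NumPy code; a C-order flatten plays the role of the vec operation here, which is index-consistent with np.kron):

```python
import numpy as np

def npsa_update_original(S, u):
    """NPSA update, original strategy: vectorize S, project onto the orthogonal
    complement of the 3-times Kronecker product of u, and reshape back.
    Requires an L^3 x L^3 projection matrix."""
    L = S.shape[0]
    v = np.kron(np.kron(u, u), u)            # 3-times Kronecker product of the unit vector u
    P = np.eye(L ** 3) - np.outer(v, v)      # a projector, since ||v|| = ||u||^3 = 1
    return (P @ S.reshape(-1)).reshape(L, L, L)   # C-order flatten acts as vec here

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4, 4))
u = rng.standard_normal(4)
u /= np.linalg.norm(u)
S_new = npsa_update_original(S, u)
```

After this update, the skewness of the tensor in the direction u is exactly zero, so the fixed-point iteration cannot converge to u again; the projection is also equivalent to subtracting a single rank-one term, which the second assertion below checks.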
For the example mentioned in the previous section, the two eigenvectors obtained by NPSA are
The angle between and is , which is smaller than that between and , as shown in Fig. 3. This means that the new search strategy presented in NPSA is indeed effective.
III-B Complexity Reduction
In this subsection, we take the complexity into consideration. Although the strategy shown in steps (1)-(4) is effective, there are still some problems in its implementation: 1) it needs to perform the vectorization and its inverse repeatedly; 2) computing the orthogonal projection matrix takes up memory. When becomes larger, especially for hyperspectral images with tens or hundreds of bands, the computational burden becomes huge and unbearable. In the following, we therefore try to reduce the computational complexity and save storage memory simultaneously.
This means that the vector generated by the times Kronecker product of a unit vector still has unit length. In this way, (16) can be simplified as
(19) 
According to (17), the new tensor can be updated by
(20) 
where we introduce an auxiliary tensor, denoted by
(21) 
For simplicity, let
(22) 
then we can have
(23) 
Since
(24) 
we denote
(25) 
where is the vector generated by the th frontal slice , i.e., .
Recalling the definition of the mode product, (27) is equivalent to
(28) 
so the new updated tensor can be expressed as
(29) 
Thus, we obtain a more compact representation of the tensor update. We call it the improved strategy, as opposed to the original one described in steps (1)-(4). It should be noted that the subtraction operation in (29) corresponds to the orthogonal complement projection in (16).
Interestingly, we can compare (29) with the update formula of PSA defined in (15), which we can restate here
(30) 
since .
In a sense, the two strategies shown in (29) and (30) differ in the order in which they perform the orthogonal complement projection and the mode product operation. PSA generates the orthogonal complement projection matrix first and then calculates the mode product to update the tensor. In contrast, NPSA first obtains an auxiliary tensor via the mode product, followed by the orthogonal complement projection operation.
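The two updates can be compared numerically. Below is our own sketch of both: the improved NPSA update of (29) subtracts a rank-one term, while the PSA update of (30) applies mode products with the projector along every mode. Both remove the skewness along u, but they produce different tensors, which is precisely the relaxed constraint:

```python
import numpy as np

def npsa_update_improved(S, u):
    """Improved NPSA update (29): subtract lambda * (u o u o u), where lambda is
    the skewness of S in direction u.  Only O(L^3) memory is needed."""
    lam = np.einsum('abc,a,b,c->', S, u, u, u)
    return S - lam * np.einsum('a,b,c->abc', u, u, u)

def psa_update(S, u):
    """PSA update (30): mode products with P = I - u u^T along every mode."""
    P = np.eye(S.shape[0]) - np.outer(u, u)
    return np.einsum('ia,jb,kc,abc->ijk', P, P, P, S)

rng = np.random.default_rng(1)
S = rng.standard_normal((4, 4, 4))
u = rng.standard_normal(4)
u /= np.linalg.norm(u)

A = npsa_update_improved(S, u)
B = psa_update(S, u)
```

Both updated tensors have zero skewness along u, yet they generally differ, since PSA also removes components that NPSA deliberately keeps.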
III-C Complexity Comparison
Here, we give a detailed comparison of the two strategies from two aspects: the required maximum storage memory and the computational complexity.
On the one hand, the original strategy needs to calculate a large-scale orthogonal projection matrix of size and to rearrange the elements repeatedly, while the improved version only takes up memory to store the auxiliary tensor, which greatly saves memory.
On the other hand, the computational complexity of both step (3) and step (4) is , which is very time-consuming, especially when is large. In contrast, the improved version has a lower computational complexity: it can be checked that the complexity of updating the auxiliary tensor in (29) is . Table I summarizes the complexity comparison of the two strategies.
original  improved  
maximum storage memory  
computational complexity 
Comparison of the two strategies with respect to maximum storage memory and computational complexity.
Finally, the pseudocode of NPSA is summarized in Algorithm 1.
Some comments follow. Generally, the stopping conditions in step (6) include an error tolerance and a maximum number of iterations . In this paper, is set to 0.0001, and is set to 50. is the final non-orthogonal principal skewness transformation matrix.
IV Theoretical Analysis
In the above section, we demonstrated that NPSA outperforms PSA with a simple example. Now, to theoretically illustrate why the former obtains more accurate solutions, we present the following lemma.
Lemma 1
Consider a matrix with full column rank. It holds that the orthogonal complement of the space spanned by the Kronecker product of always contains the space spanned by the Kronecker product of its orthogonal complement operator, which can be expressed as follows
(31) 
for .
Proof IV.1
We start by defining
(32) 
Denote , and we have
(33) 
Similarly,
(34) 
Then
(36) 
It can be easily verified that since is a projection matrix, and thus
(37) 
Furthermore, we can obtain the dimensionality of the spaces spanned by and . Theoretically, the rank of a matrix can be defined as the dimensionality of its range, which follows
(38) 
According to (5), we can have
(39) 
A vector space is the direct sum of a subspace and its orthogonal complement, and their dimensionalities satisfy the following relationship
(40) 
Assume that the subspace is spanned by the columns of the matrix , and therefore
(41) 
In a similar way, we can have
(43) 
Since and , according to the binomial theorem, the following inequality can be deduced
(44) 
which is consistent with the conclusion in Lemma 1.
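Lemma 1 can also be checked numerically (our own verification code; the sizes n = 4 and r = 2 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2
A = rng.standard_normal((n, r))               # generically of full column rank

P = np.eye(n) - A @ np.linalg.pinv(A)         # orthogonal complement projector of A
P3 = np.kron(np.kron(P, P), P)                # 3-times Kronecker products
A3 = np.kron(np.kron(A, A), A)

# Orthogonality: P3^T A3 = (P A) kron (P A) kron (P A) = 0, so range(P3)
# lies inside the orthogonal complement of range(A3).
residual = np.linalg.norm(P3.T @ A3)

# Dimension count: rank(P3) = (n-r)^3, while the complement has dimension n^3 - r^3.
rank_P3 = np.linalg.matrix_rank(P3)
dim_complement = n ** 3 - np.linalg.matrix_rank(A3)
```

The strict inequality (n-r)^3 < n^3 - r^3 for 0 < r < n confirms that the containment in the lemma is proper: NPSA's search space is genuinely larger.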
Now, we reconsider (29) and (30) in the dimensional space, and we can have
(45) 
and
(46) 
where we utilize the property (8).
Based on Lemma 1, it always holds that . This implies that the strategy of NPSA enlarges the search space of the eigenvector to be determined in each unit, instead of restricting it to the orthogonal complement space of the previous ones as PSA does. Meanwhile, similar to PSA, NPSA still prevents the solution from converging to one that has been determined before, because of the use of the orthogonal complement operator in the dimensional space given by (16).
V Experiments
In this section, a series of experiments on both simulated and real multi/hyperspectral data are conducted to evaluate the performance of NPSA, and several widely used algorithms are compared as well. All the algorithms are implemented in MATLAB Rb on a laptop with GB RAM and an Intel(R) Core(TM) iU CPU @ GHz.
V-A Experiments on Blind Image Separation
To evaluate the separation performance of NPSA, we apply it to the blind image separation (BIS) problem. BIS is an important application for ICA, and four algorithms designed for this problem are compared with our method, including the original PSA [24], EcoICA (a thirdorder version of JADE)[19], subspace fitting method (SSF)[20] and Distinctively Normalized Joint Diagonalization (DNJD)[21] .
The aim of BIS is to estimate the mixing matrix, denoted by
(or its inverse matrix, i.e., the demixing matrix, denoted by ) when only the mixed data is known. Here, three grayscale images with a size of are selected as the source images, as shown in the first column of Fig. 4. The mixing matrix is generated by the function in MATLAB, and the mixed images are shown in the second column of Fig. 4. Then we apply the different algorithms to estimate the mixing (or demixing) matrix and obtain the separated images. The results are shown in Fig. 4. To ensure the reliability of the conclusions, we also conduct four other combinations, in each of which we randomly select three mutually different source images. Finally, several indices are computed to evaluate the accuracy, as follows.
V-A1 Inter-Symbol Interference (ISI)
This index [34] measures the separation performance. After estimating the demixing matrix (the mixing matrix being known), let ; the ISI index is then defined as
(47) 
It is obvious that if , then is an identity matrix and the ISI equals zero. The smaller the ISI, the better the algorithm performs. We take the average value of 10 runs as the result. The comparison between these methods is listed in Table II.
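Since the exact normalization of (47) is not reproduced here, the sketch below uses one common normalized form of the ISI index (our own code); it is zero exactly when the estimated demixing matrix inverts the mixing up to permutation and scaling:

```python
import numpy as np

def isi(W, A):
    """A common normalized inter-symbol-interference index of G = W A.
    Zero iff G is a permutation matrix up to scaling; larger means worse separation."""
    G = np.abs(W @ A)
    m = G.shape[0]
    rows = (G / G.max(axis=1, keepdims=True)).sum(axis=1) - 1.0
    cols = (G / G.max(axis=0, keepdims=True)).sum(axis=0) - 1.0
    return (rows.sum() + cols.sum()) / (2.0 * m * (m - 1))

# Perfect demixing (W = A^{-1}) drives the index to zero:
A_mix = np.array([[1.0, 0.2], [0.1, 1.0]])
perfect = isi(np.linalg.inv(A_mix), A_mix)
```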
As can be seen, NPSA performs better than the others in combinations 1, 2, 4 and 5, especially in combination 5. EcoICA is slightly superior to NPSA in combination 3.
Combination  EcoICA  DNJD  SSF  PSA  NPSA 
1  0.0043  0.2375  0.1123  0.0061  
2  0.0309  0.3861  0.0656  0.0363  
3  0.1353  0.0179  0.0186  0.0151  
4  0.0309  0.1119  0.1238  0.0335  
5  0.0027  0.1923  0.0458  0.0035  0.0004 
Comparison of EcoICA, DNJD, SSF, PSA and NPSA for the ISI index of five different combinations. An average result of ten runs is computed.
Combination  EcoICA  DNJD  SSF  PSA  NPSA 
1  
2  
3  
4  
5 
Comparison of EcoICA, DNJD, SSF, PSA and NPSA for the TMSE index of five different combinations. An average result of ten runs is computed.
V-A2 Total Mean Square Error (TMSE)
Combination  EcoICA  DNJD  SSF  PSA  NPSA  

0.9972  0.9230  0.9978  0.9969  0.9997  
0.9992  0.8596  0.9944  0.9992  0.9992  
1.0000  0.9637  0.9969  0.9998  1.0000  

1.0000  0.9611  0.9963  0.9990  0.9985  
0.9553  0.8848  0.9872  0.9580  0.9988  
0.9988  0.9517  0.9989  0.9986  0.9989  

0.9916  0.9451  0.9909  0.9902  0.9920  
1.0000  0.9760  0.9997  0.9999  1.0000  
0.9977  0.9252  0.9871  0.9972  0.9995  

0.9988  0.9623  0.9972  0.9987  0.9989  
0.9553  0.8300  0.9632  0.9611  0.9989  
1.0000  0.9466  0.9935  0.9992  0.9983  

1.0000  0.6621  0.9616  0.9998  1.0000  
1.0000  0.9906  0.9922  0.9998  1.0000  
0.9972  0.9156  0.9968  0.9966  0.9997 
Comparison of EcoICA, DNJD, SSF, PSA and NPSA for the correlation coefficient index of five different combinations. An average result of ten runs is computed.
To compute the error between the source image and the separated image , we use the mean square error (MSE) index. The images are first normalized to unit length to eliminate the influence of their magnitudes, which can be implemented by . Then, assuming that the number of pixels is , the MSE index can be calculated by
(48) 
After computing the respective errors of the three images, we then compute the total mean square error (TMSE)
(49) 
Similar to the ISI index, the smaller the TMSE, the better the algorithm performs. We take the average value of 10 runs as the result. The comparison between these methods is listed in Table III.
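A sketch of the TMSE computation (our own code; taking the minimum over the sign resolves ICA's sign ambiguity, an implementation detail we have assumed rather than taken from the text):

```python
import numpy as np

def tmse(sources, separated):
    """Total mean square error: normalize each image to unit length, then sum the
    per-image MSEs.  The min over +/- handles ICA's sign ambiguity (our choice)."""
    total = 0.0
    for s, y in zip(sources, separated):
        s = s.ravel() / np.linalg.norm(s)
        y = y.ravel() / np.linalg.norm(y)
        total += min(np.mean((s - y) ** 2), np.mean((s + y) ** 2))
    return total

img = np.random.default_rng(0).random((8, 8))
```

Because of the normalization, a separated image that differs from the source only by scale (or sign) yields a TMSE of zero.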
The NPSA algorithm has the smallest TMSE in combinations 1, 2, 4 and 5, while SSF is slightly better than NPSA in combination 3.
V-A3 Correlation Coefficient
Both ISI and TMSE mainly focus on the overall performance. Here, we use the correlation coefficient to measure the similarity between and in more detail. The two images are first vectorized into the vectors and (note that the normalization used in the TMSE index has no impact on the final result, since the correlation coefficient essentially measures the angle between the two vectors). Then the correlation coefficient can be calculated by
(50) 
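In code, this reduces to a single call (our own toy example); note the invariance to scale and offset:

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.random((16, 16)).ravel()    # vectorized source image
y = 3.0 * s + 0.1                   # separated image, up to scale and offset

r = np.corrcoef(s, y)[0, 1]         # correlation coefficient of the two vectors
```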
The results for the five combinations are listed in Table IV. It can be seen that the correlation coefficient of NPSA is higher than that of the others in most cases, and NPSA also shows a more stable performance.
Finally, combining the comparison results of these indices, we conclude that the proposed algorithm achieves a more accurate and robust performance in the BIS application.
V-B Experiments on Multispectral Data
In this experiment, a m-resolution Landsat image included in the Environment for Visualizing Images (ENVI) software is selected to evaluate the performance of NPSA. Three algorithms, the classical FastICA, PSA and EcoICA, are compared in the following experiments. The other two algorithms compared in the BIS experiments are not included, since they have either unstable performance or an unbearable computational complexity for real multi/hyperspectral data.
The dataset contains six bands at m resolution, with band numbers  and (ranging from to um). Band is a thermal band ( um) with a much lower spatial resolution ( m) and is therefore usually not included. A subscene of pixels is selected as the test data, and the true-color image is shown in Fig. 5. In this area, the main land cover types include vegetation, water and bare land.
In our experiment, we set the number of independent components to , i.e., we select the full-band image for the test. Eventually, we obtain the principal skewness components of the image, and the results are shown in Fig. 6. IC are almost the same across the algorithms, corresponding to three main land cover types: vegetation, bare land and water, respectively. By inspection, the results of NPSA in Fig. 6 (IC) and Fig. 6 (IC) are also superior to those of the other algorithms.
It is worth taking IC and IC as an example to compare the differences between these algorithms once more. IC mainly corresponds to the cultivated farmland (framed in the blue rectangle), while IC corresponds to the uncultivated bare land (framed in the red rectangle). Their spectral curves are shown in Fig. 7. Because of the orthogonal constraint in PSA, EcoICA and FastICA, for objects with similar spectra, often only one of them is highlighted. When the farmland is extracted as the independent component in IC, the uncultivated bare land is suppressed in the later components, as shown in Fig. 6. In NPSA, because the orthogonal constraint is relaxed, an object that is spectrally similar to the previous components can still be detected. As shown in Fig. 6, the uncultivated bare land is still efficiently extracted by NPSA.
To further demonstrate the advantage of NPSA, we conduct a quantitative comparison by calculating the skewness value of each independent component, and the results are plotted in Fig. 8. As can be seen, the skewness value of NPSA in each independent component is equal to or larger than that of the others, implying that NPSA obtains more reasonable maxima in each unit and has a better overall performance.
V-C Experiments on Hyperspectral Data
Here, the hyperspectral image used to test the method is from the Operational Modular Imaging Spectrometer-II, acquired by the Aerial Photogrammetry and Remote Sensing Bureau in Beijing, China, in . It includes bands from visible to thermal infrared, with about m spatial resolution and nm spectral resolution, and has pixels in each band. Since the signal-to-noise ratio is low in bands , we select the remaining bands as our test data. The true-color image is shown in Fig. 9. The red rectangle in Fig. 9 marks an area of blue-painted roofs. A field investigation found that the roofs were made from three different materials, although there is no obvious difference in the visible bands.
We display the feature extraction results in Fig. 10 and plot the skewness curves in Fig. 11. All algorithms can automatically extract the three different materials as independent components. The skewness comparisons, attached in Fig. 10 for the three ICs and in Fig. 11 for all the ICs, also demonstrate that NPSA outperforms the other three algorithms. It can be observed in Fig. 11 that the skewness of IC and IC extracted by EcoICA is slightly larger than that of NPSA. Nevertheless, NPSA has the better overall performance.
Besides, a time efficiency comparison is also conducted in this experiment. The number of ICs, i.e., , ranges from to , and the running time as a function of is plotted in Fig. 12. As increases, the time consumption of NPSA grows beyond that of PSA. This is because NPSA must update the tensor in each unit, whereas this repeated computation can be simplified in PSA [24]. However, the time of NPSA for a full-band data set is about seconds, which is still efficient and acceptable for a real hyperspectral image. Note that since the time of EcoICA for a full-band data set is about seconds, much larger than that of the other three algorithms, we do not include it in this comparison. To sum up, our algorithm obtains higher accuracy at the cost of some time efficiency, at an acceptable level.
VI Conclusion
The orthogonal complement constraint is a widely used strategy to prevent the solution to be determined from converging to previous ones[24, 25, 18]. However, such a constraint irreconcilably contradicts the inherent non-orthogonality of the eigenvectors of a supersymmetric tensor. In this paper, building on PSA, we have proposed a new algorithm named non-orthogonal principal skewness analysis (NPSA). In NPSA, a constraint more relaxed than the orthogonal one is used to search for the locally maximal skewness direction in a larger space, so that more accurate results can be obtained. A detailed theoretical analysis is presented to justify its validity. Furthermore, it is interesting to find that the difference between PSA and NPSA lies in the order in which they perform the orthogonal complement projection and the mode product operation. We first apply the algorithm to the BIS problem, where several accuracy metrics show that NPSA obtains more accurate and robust results. Experiments on real multi/hyperspectral data also demonstrate that NPSA outperforms the other algorithms in extracting the ICs of the image.
On the one hand, our method can be extended naturally to the fourth- or higher-order case. Both PSA and NPSA focus on the skewness index, which may not always be the best choice to describe the statistical structure of the data; kurtosis and other indices can be considered as alternatives. On the other hand, NPSA needs to update the coskewness tensor in each unit, which makes it slightly more time-consuming than PSA. In the future, more efficient optimization methods will be worth studying.
References
 [1] X. Geng, K. Sun, L. Ji, Y. Zhao, and H. Tang. Optimizing the endmembers using volume invariant constrained model. IEEE Transactions on Image Processing, 24(11):3441–3449, 2015.
 [2] Yuntao Qian, Fengchao Xiong, Shan Zeng, Jun Zhou, and Yuan Yan Tang. Matrixvector nonnegative tensor factorization for blind unmixing of hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 55(3):1776–1792, 2017.
 [3] Chein I Chang, Xiao Li Zhao, M. L. G Althouse, and Jeng Jong Pan. Least squares subspace projection approach to mixed pixel classification for hyperspectral images. IEEE Transactions on Geoscience and Remote Sensing, 36(3):898–912, 1998.
 [4] Baofeng Guo, S. R. Gunn, R. I. Damper, and J. D. B. Nelson. Customizing kernel functions for svmbased hyperspectral image classification. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, 17(4):622–629, 2008.
 [5] Z. Zou and Z. Shi. Random access memories: A new paradigm for target detection in high resolution aerial remote sensing images. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, PP(99):1–1, 2018.
 [6] Ting Wang, Bo Du, and Liangpei Zhang. A kernelbased targetconstrained interferenceminimized filter for hyperspectral subpixel target detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 6(2):626–637, 2013.
 [7] Weiwei Sun and Qian Du. Graphregularized fast and robust principal component analysis for hyperspectral band selection. IEEE Transactions on Geoscience and Remote Sensing, PP(99):1–11, 2018.
 [8] H Hotelling. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6):417–520, 1933.

 [9] Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Kernel principal component analysis. In International Conference on Artificial Neural Networks, pages 583–588. Springer, 1997.
 [10] Andrew A Green, Mark Berman, Paul Switzer, and Maurice D Craig. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing, 26(1):65–74, 1988.
 [11] Xiurui Geng, Kang Sun, Luyan Ji, and Yongchao Zhao. A highorder statistical tensor based algorithm for anomaly detection in hyperspectral imagery. Scientific reports, 4:6869, 2014.
 [12] Yanfeng Gu, Ying Liu, and Ye Zhang. A selective kpca algorithm based on highorder statistics for anomaly detection in hyperspectral imagery. IEEE Geoscience and Remote Sensing Letters, 5(1):43–47, 2008.
 [13] Shih Yu Chu, Hsuan Ren, and Cheini Chang. Highorder statisticsbased approaches to endmember extraction for hyperspectral imagery. Proceedings of SPIE  The International Society for Optical Engineering, 6966, 2008.
 [14] Hsuan Ren, Qian Du, Jing Wang, and CheinI Chang. Automatic target recognition for hyperspectral imagery using highorder statistics. Aerospace and Electronic Systems IEEE Transactions on, 42(4):1372–1385, 2006.

 [15] Xiurui Geng, Luyan Ji, Yongchao Zhao, and Fuxiang Wang. A small target detection method for the hyperspectral image based on higher order singular value decomposition (HOSVD). IEEE Geoscience and Remote Sensing Letters, 10(6):1305–1308, 2013.
 [16] A. Hyvarinen and E. Oja. Independent component analysis: algorithms and applications. Neural Networks, 13(4):411–430, 2000.
 [17] J. F Cardoso and A Souloumiac. Blind beamforming for non gaussian signals. IEE Proc.F, 140(6):362–370, 1993.
 [18] Aapo Hyvarinen. Fast and robust fixedpoint algorithms for independent component analysis. IEEE transactions on Neural Networks, 10(3):626–634, 1999.

 [19] Liyan Song and Haiping Lu. EcoICA: Skewness-based ICA via eigenvectors of cumulant operator. In Asian Conference on Machine Learning, pages 445–460, 2016.
 [20] A. J. Van Der Veen. Joint diagonalization via subspace fitting techniques. In Acoustics, Speech, and Signal Processing, 2001 (ICASSP '01), IEEE International Conference on, volume 5, pages 2773–2776. IEEE, 2001.
 [21] Fuxiang Wang, Zhongkan Liu, and Jun Zhang. Nonorthogonal joint diagonalization algorithm based on trigonometric parameterization. IEEE Transactions on Signal Processing, 55(11):5299–5308, Nov 2007.
 [22] A. Yeredor. Nonorthogonal joint diagonalization in the leastsquares sense with application in blind source separation. IEEE Transactions on Signal Processing, 50(7):1545–1553, Jul 2002.
 [23] E Oja and Zhijian Yuan. The fastica algorithm revisited: Convergence analysis. IEEE Transactions on Neural Networks, 17(6):1370–1381, 2006.
 [24] Xiurui Geng, Luyan Ji, and Kang Sun. Principal skewness analysis: algorithm and its application for multispectral/hyperspectral images indexing. IEEE Geoscience and Remote Sensing Letters, 11(10):1821–1825, 2014.
 [25] Xiurui Geng, Lingbo Meng, Lin Li, Luyan Ji, and Kang Sun. Momentum principal skewness analysis. IEEE Geoscience and Remote Sensing Letters, 12(11):2262–2266, 2015.
 [26] Lingbo Meng, Xiurui Geng, and Luyan Ji. Principal kurtosis analysis and its application for remotesensing imagery. International Journal of Remote Sensing, 37(10):2280–2293, 2016.
 [27] Ariel Jaffe, Roi Weiss, and Boaz Nadler. Newton correction methods for computing real eigenpairs of symmetric tensors. 2017.
 [28] Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M Kakade, and Matus Telgarsky. Tensor decompositions for learning latent variable models. The Journal of Machine Learning Research, 15(1):2773–2832, 2014.
 [29] Tamara G Kolda and Brett W Bader. Tensor decompositions and applications. SIAM review, 51(3):455–500, 2009.
 [30] Charles F. Van Loan. The ubiquitous kronecker product. Journal of Computational and Applied Mathematics, 123(1):85–100, 2000.
 [31] XianDa Zhang. Matrix analysis and applications. Cambridge University Press, 2017.
 [32] LekHeng Lim. Singular values and eigenvalues of tensors: a variational approach. In Computational Advances in MultiSensor Adaptive Processing, 2005 1st IEEE International Workshop on, pages 129–132. IEEE, 2005.
 [33] Liqun Qi. Eigenvalues of a real supersymmetric tensor. Journal of Symbolic Computation, 40(6):1302–1324, 2005.
 [34] Eric Moreau and Odile Macchi. A one stage selfadaptive algorithm for source separation. In Acoustics, Speech, and Signal Processing, 1994. ICASSP94., 1994 IEEE International Conference on, volume 3, pages III–49. IEEE, 1994.