I Introduction
Hyperspectral images (HSIs) acquired by spaceborne or airborne sensors, such as AVIRIS, HyMap, HYDICE, and Hyperion, typically record hundreds of spectral wavelengths for each pixel in the image, which has opened new perspectives in many remote sensing applications [1, 2, 3]. Since subtle differences in ground covers can be captured by different spectral signatures, hyperspectral imagery is a well-suited technology for discriminating materials of interest. Although the rich spectral signatures provide useful information for data analysis, the high dimensionality of HSI data presents new challenges: (i) it increases the burden of data transmission and storage; and (ii) it leads to the curse of dimensionality [4], which reduces the generalization capability of classifiers and deteriorates the classification performance, especially when the available labeled samples are limited. Because (i) the spectral wavelengths are densely sampled and (ii) the spectral reflectance of most materials changes only gradually over certain spectral bands, many contiguous bands are highly correlated, and not all features (or spectral bands) are expected to contribute useful information to the classification or analysis task at hand. As one of the typical methods to alleviate this problem, dimensionality reduction is widely used as a preprocessing step to remove the highly correlated and redundant measurements in the original high-dimensional HSI spectral space and to preserve the essential information in a low-dimensional subspace. It has attracted increasing attention in recent years.
Generally speaking, dimensionality reduction of HSI data can be divided into two categories: feature selection [5, 6, 7, 8] and feature extraction [9, 10, 11]. The former selects a small subset of the most representative bands from the original bands, whereas the latter aims to find an optimal transformation matrix that projects the original high-dimensional spectral features into a low-dimensional subspace. Feature selection can only select existing bands from HSIs, whereas feature extraction can use all bands to generate more discriminative features. In [12], a joint feature selection and feature extraction method for HSI representation and classification has been developed. In this paper we mainly focus on employing feature extraction to reduce the feature dimensions of HSIs. Depending on whether label information is used, feature extraction can be classified into unsupervised and supervised approaches.

One of the most widely applied unsupervised dimensionality reduction techniques in HSI analysis is principal component analysis (PCA) [13] and its variants [14, 15, 16, 17, 10]. Without any label information, PCA finds orthogonal transformations that maximize the total variance of the projected data. Instead of preserving the largest data variance as in PCA, independent component analysis (ICA) [18] tries to find independent components by maximizing the statistical independence of the estimated components. Recently, some nonlinear methods based on manifold learning [19, 20] have been used to compute the essential embedded low-dimensional space of observed high-dimensional data [21], e.g., locally linear embedding (LLE) [22, 23], neighborhood preserving embedding (NPE) [24], locality preserving projection (LPP) [25], and the recently proposed local pixel neighborhood preserving embedding (LPNPE) [26]. Other efficient unsupervised feature extraction and learning methods include intrinsic representation [27], subfeature learning [28], and latent subclass learning [29]. Supervised dimensionality reduction algorithms leverage the supervised information, i.e., the labels, to learn the dimensionality-reduced feature space. The most representative works include Fisher’s linear discriminant analysis (LDA) [14] and local Fisher discriminant analysis (LFDA) [30].

Most of the above feature extraction methods use only the spectral signature of each pixel, and their dimensionality reduction models cannot directly use the spatial information of HSIs, which has been proven to be very effective in improving HSI representation and classification accuracy [1, 31, 32, 33]. In [26], the spatial information is applied to spatial filtering (as a preprocessing step) as well as to modeling the correlations between spatially neighboring pixels. Wen et al. proposed to incorporate the spatial information, e.g., texture or morphological features, into the framework of orthogonal nonnegative matrix factorization [34]. The approach of [35] presents a novel spectral-spatial feature based similarity measurement which can be incorporated into existing linear or nonlinear dimensionality reduction methods. In [36, 37], spatial information is used to regularize the spectral representation.
I-A Motivation and Contributions
Conventional methods usually learn a unified projection for HSI feature extraction [30, 38, 39, 40]. However, different regions in an HSI may correspond to different objects, whose spectral features are diverse. Therefore, a reasonable way is to learn different projection matrices for different regions. Image segmentation can be seen as an exhaustive partitioning of the observed image into many different regions, each of which is considered to be homogeneous [41]. These regions form a segmentation map that can be used as spatial structures for spectral-spatial classification.
In this paper, we advocate a simple yet very effective unsupervised feature extraction method based on superpixelwise PCA, which is denoted as SuperPCA. It learns the intrinsic low-dimensional features of different regions of the HSI data by performing PCA on each homogeneous region obtained by superpixel segmentation, as shown in Fig. 1. An HSI is first divided into many homogeneous regions via superpixel segmentation; each region is denoted by a matrix whose columns are the spectral vectors of its pixels. PCA is applied to these high-dimensional matrices to obtain dimensionality-reduced ones. Finally, we rearrange and combine all these low-dimensional matrices to form the dimensionality-reduced HSI.

In an attempt to make full use of the spatial information contained in the HSI cube, we further develop a multiscale segmentation based SuperPCA model, namely MSuperPCA, which integrates multiscale spatial information and obtains the final classification result by decision fusion. Fig. 2 shows the schematic of the proposed multiscale SuperPCA method. We first apply entropy rate superpixel (ERS) segmentation to obtain multiscale superpixel segmentations (by setting different superpixel numbers) based on the first principal component of the input HSI. Then, for each scale, the proposed SuperPCA based unsupervised dimensionality reduction method is used to obtain the dimensionality-reduced HSI. Based on the predictions at the different scales through a support vector machine (SVM) classifier, we generate the final classification result via the majority voting decision fusion strategy.
To the best of our knowledge, this is the first time that a superpixelwise model has been adopted for unsupervised dimensionality reduction and classification in hyperspectral imagery. Extensive experimental results demonstrate that our method is not only simple and intuitive, but also achieves very competitive HSI classification results compared with state-of-the-art dimensionality reduction based methods, including some recently proposed supervised feature extraction techniques. When the label information is limited (a small number of labeled training samples, e.g., 5 samples per class), the proposed SuperPCA and MSuperPCA methods obtain even better classification accuracies than the state-of-the-art supervised feature extraction techniques.
I-B Organization of This Paper
The remainder of the paper is organized as follows. Section II reviews the ERS superpixel segmentation algorithm. Section III introduces notations, explains the details of the proposed HSI classification approach based on SuperPCA and its multiscale extension, and gives some analysis of the proposed SuperPCA algorithm. Section IV presents the experimental results and analysis. Finally, concluding remarks are given in Section V.
II Entropy Rate Superpixel Segmentation (ERS)
A superpixel segmentation algorithm should have the following characteristics. First, superpixels should adhere well to object boundaries. Second, as a preprocessing step, superpixel segmentation should itself be of low computational complexity.
Recently, graph-based segmentation approaches have been widely used in superpixel segmentation [42] and its applications [43]. A typical superpixel segmentation technique is the eigen-based solution to the normalized cuts (NCuts) [44]. However, it needs to construct a very large graph $G = (V, E)$ whose vertices $V$ are the pixels of the image to be segmented and whose edge set $E$ encodes the pairwise similarities given by a weight function $w$. Performing eigenvalue decomposition on such a large similarity matrix is very time consuming and takes several minutes to segment an image of moderate size, e.g., around 500 × 300 pixels. TurboPixel [45] is an efficient alternative that achieves a similar regularity. However, it sacrifices fine image details and results in a low boundary recall. In [46], an ERS segmentation approach is proposed, in which the graph is partitioned by choosing a subset of edges $A \subseteq E$ such that the resulting graph consists of smaller connected components (subgraphs). The objective function of ERS incorporates an entropy rate term and a balancing term to optimize the superpixel segmentation:

$$A^{*} = \arg\max_{A} \; H(A) + \lambda B(A), \quad \text{s.t. } A \subseteq E. \tag{1}$$

Here, the first term $H(A)$ favors the formation of homogeneous and compact clusters, while the second term $B(A)$ encourages clusters of similar sizes. The weight $\lambda \geq 0$ balances the contributions of the entropy rate term $H(A)$ and the balancing term $B(A)$. As described in [47], a greedy algorithm effectively solves the optimization problem in (1). This method is highly efficient and takes only about 2.5 seconds to segment an image of size 500 × 300 pixels.
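To make the role of the balancing term concrete, the following Python sketch evaluates it for a given partition. It assumes the definition commonly used in the ERS literature, namely the entropy of the cluster-membership distribution minus the number of clusters; the function name `balancing_term` and the per-pixel label list are illustrative choices and not part of the method description above.

```python
import math
from collections import Counter

def balancing_term(labels):
    """Assumed form B = H(Z) - N_A for a partition given as one cluster label per pixel."""
    n = len(labels)
    sizes = Counter(labels).values()
    # Entropy of the distribution of pixels over clusters.
    entropy = -sum((s / n) * math.log(s / n) for s in sizes)
    # Subtracting the cluster count penalizes having many clusters.
    return entropy - len(sizes)

balanced = [0, 0, 0, 1, 1, 1]      # two clusters of equal size
unbalanced = [0, 0, 0, 0, 0, 1]    # two clusters of very different size
print(balancing_term(balanced) > balancing_term(unbalanced))
```

For a fixed number of clusters, the term is largest when all clusters have the same size, which is why it pushes the segmentation towards superpixels of similar extent.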
III Superpixelwise Principal Component Analysis (SuperPCA)
An HSI cube $\mathcal{I} \in \mathbb{R}^{M \times N \times L}$ is made up of hundreds of nearly contiguous spectral bands, with high (5 to 10 nm) spectral resolution, from the visible to the infrared spectrum for each image pixel. Here, $M$, $N$, and $L$ are the numbers of image rows, columns, and sampled wavelengths, respectively. We can reshape the 3D cube into a 2D matrix $X \in \mathbb{R}^{L \times P}$ ($P = MN$), in which each column represents one pixel vector that reflects the energy spectrum of the materials within the spatial area covered by the pixel.
Denote by $x_i \in \mathbb{R}^{L}$ the $i$th pixel vector of the observed HSI cube $\mathcal{I}$,
$$X = [x_1, x_2, \ldots, x_P]. \tag{2}$$
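The reshaping step can be illustrated with NumPy. This is a minimal sketch on a toy cube; the row-major ordering of pixels along the columns of the matrix is an assumption made here for illustration, and any fixed ordering works as long as it is inverted consistently.

```python
import numpy as np

M, N, L = 4, 5, 6                        # toy sizes: rows, columns, bands
cube = np.arange(M * N * L, dtype=float).reshape(M, N, L)

# Each column of X is the spectrum of one pixel (row-major pixel order).
X = cube.reshape(M * N, L).T             # shape (L, P) with P = M * N

# Pixel (r, c) of the cube is column r * N + c of X.
r, c = 2, 3
assert np.array_equal(X[:, r * N + c], cube[r, c, :])

# The reshape is lossless: the cube can be recovered from X.
restored = X.T.reshape(M, N, L)
```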
PCA performs dimensionality reduction by computing the low-dimensional representation that maximizes the data variance in the dimensionality-reduced space. Specifically, it finds a linear mapping from the original $L$-dimensional space to a low $d$-dimensional space, $d < L$. Without loss of generality, we denote the transformation matrix by $W \in \mathbb{R}^{L \times d}$. That is, $y_i = W^{T} x_i$. Mathematically, it aims at finding the linear transformation matrix by solving the following objective function,
$$W^{*} = \arg\max_{W^{T} W = \mathbf{I}} \mathrm{Tr}\left(W^{T}\, \mathrm{Cov}(X)\, W\right), \tag{3}$$
where $\mathrm{Cov}(X)$ stands for the covariance matrix of the data set $X$, and $\mathrm{Tr}(\cdot)$ denotes the trace of a square matrix.
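The maximizer of this trace objective under an orthonormality constraint is given by the leading eigenvectors of the covariance matrix, which is a standard result. The NumPy sketch below follows that route; the function name `pca`, the synthetic data, and the choice to normalize the covariance by the number of pixels are illustrative assumptions.

```python
import numpy as np

def pca(X, d):
    """X is L x P (one pixel per column). Returns W (L x d) and Y = W^T (X - mean) (d x P)."""
    Xc = X - X.mean(axis=1, keepdims=True)       # center each band
    cov = Xc @ Xc.T / X.shape[1]                 # L x L covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]        # indices of the d largest eigenvalues
    W = eigvecs[:, order]
    return W, W.T @ Xc

rng = np.random.default_rng(0)
# Synthetic data: 10 "bands" with increasing variance, 200 "pixels".
X = rng.normal(size=(10, 200)) * np.arange(1, 11)[:, None]
W, Y = pca(X, 3)
```

The columns of `W` are orthonormal, and the rows of `Y` are ordered by decreasing variance.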
Owing to its simplicity, effectiveness, and robustness to noise, PCA has been widely used as a preprocessing step in many HSI based applications. However, an HSI contains many homogeneous regions, and within each region pixels are more likely to belong to the same class [48, 49, 50, 51]. The global PCA approach considers the entire data space (composed of all the pixel vectors of the HSI cube) and tries to find the best transformation vector for this space. It may therefore ignore the differences between homogeneous regions. As illustrated by a toy example (Fig. 3), suppose that the data space is formed by class 1 (marked with blue squares) and class 2 (marked with orange squares), which could represent the distributions of samples from two different homogeneous regions of an HSI. The transformation vectors $w_1$ and $w_2$ for class 1 and class 2 are clearly different from each other, and they also differ from the transformation vector $w$ obtained for the entire data space. In Fig. 4, we plot the correlation matrices of the spectral bands of the entire University of Pavia image as well as of some typical homogeneous regions. The figure shows that the correlation matrices vary from region to region. Therefore, different regions will have different transformation vectors (see Eq. (3)).
III-A Generation of Homogeneous Regions
Inspired by the above observation, in this paper we propose a divide-and-conquer strategy to perform unsupervised feature extraction based on PCA for each homogeneous region. By extracting the same number of principal components (PCs) for each homogeneous region, we can combine them to form the dimensionality-reduced HSI (Fig. 2). In the following, we introduce the construction of homogeneous regions using superpixel segmentation, which exhaustively partitions the image into many homogeneous regions.
As in many superpixel segmentation based hyperspectral image classification and restoration methods [48, 49, 52, 53], we adopt ERS due to its promising performance in both efficiency and efficacy. Other state-of-the-art methods such as simple linear iterative clustering (SLIC) [54] can also be used in place of ERS. Specifically, we first obtain the first principal component of the HSI, $I_f$, which captures the major information of the HSI. This further reduces the computational cost of superpixel segmentation. We then perform ERS on $I_f$ to obtain the superpixel segmentation,
$$I_f = \bigcup_{k=1}^{S} \mathcal{X}_k, \quad \text{s.t. } \mathcal{X}_k \cap \mathcal{X}_g = \emptyset \;\; (k \neq g), \tag{4}$$
where $S$ denotes the number of superpixels, and $\mathcal{X}_k$ is the $k$th superpixel.
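The region-wise procedure can be sketched as follows, assuming a superpixel label map is already available. The sketch does not implement ERS: the rectangular label map in the example is only a stand-in for a real segmentation. The function names, the per-region centering, and the zero padding for regions with fewer than `d` usable directions are assumptions made for illustration.

```python
import numpy as np

def region_pca(X, d):
    """PCA on an L x n block of pixel spectra; returns a d x n block."""
    Xc = X - X.mean(axis=1, keepdims=True)
    # Left singular vectors of the centered data are the covariance eigenvectors.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    Y = np.zeros((d, X.shape[1]))
    k = min(d, U.shape[1])                 # tiny regions may offer fewer than d directions
    Y[:k] = U[:, :k].T @ Xc
    return Y

def super_pca(cube, labels, d):
    """cube: (M, N, L) array; labels: (M, N) integer superpixel map; returns (M, N, d)."""
    M, N, L = cube.shape
    X = cube.reshape(M * N, L).T           # L x P matrix, one pixel per column
    flat = labels.reshape(-1)
    Y = np.zeros((d, M * N))
    for k in np.unique(flat):
        idx = np.flatnonzero(flat == k)    # pixels belonging to superpixel k
        Y[:, idx] = region_pca(X[:, idx], d)
    return Y.T.reshape(M, N, d)            # put the reduced pixels back in place

rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 8, 10))
# Four rectangular stand-in regions (hypothetical, not an ERS result).
labels = (np.arange(6)[:, None] // 3) * 2 + (np.arange(8)[None, :] // 4)
reduced = super_pca(cube, labels, d=3)
```

Because each region is projected with its own basis, the signs and orientations of the components are not aligned across regions; the sketch leaves them as returned by the decomposition.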
III-B Multiscale Extension of SuperPCA
Segmenting the HSI into superpixels helps to exploit the rich spatial information about the land surface [52, 32]. However, selecting an optimal value for the number of superpixels is a very challenging problem in practical applications [46]. When the superpixels are too large (by setting a small superpixel number), the resulting undersegmentation can lead to ambiguously labeled boundary superpixels that require further segmentation. When the superpixels are too small (by setting a large superpixel number), the features computed from the oversegmented regions may become less distinctive, making it more difficult to infer correct labels. In addition, as reported in [55], there is no single region size that adequately characterizes the spatial information of HSIs. Inspired by classifier and decision fusion techniques [56, 57, 58], in this paper we propose a multiscale segmentation strategy to enhance the performance of the single-scale SuperPCA based method, thus alleviating the above-mentioned problem. More specifically, the principal component image (the first principal component of the HSI) is segmented at $2C+1$ scales. The number of superpixels at the $c$th scale is $S_c$,
$$S_c = (\sqrt{2})^{c} S_f, \quad c = 0, \pm 1, \pm 2, \ldots, \pm C, \tag{5}$$
where $S_f$ is the fundamental superpixel number and is set empirically. Since the value of $S_c$ may not be an integer in $[1, P]$, we round it and clip it to this range. Here, $P$ is the total number of pixels in the HSI.
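As a small worked example, the sketch below lists the superpixel numbers for a few scales. It assumes that the number at scale c is the fundamental number multiplied by the c-th power of the square root of two, with c ranging over the integers from -C to C, and it assumes a round-then-clip rule to obtain an integer between 1 and the number of pixels P. The function name and the example values are hypothetical.

```python
import math

def multiscale_superpixel_numbers(S_f, C, P):
    """Assumed rule: S_c = sqrt(2)**c * S_f for c = -C..C, rounded and clipped to [1, P]."""
    numbers = {}
    for c in range(-C, C + 1):
        S_c = round(math.sqrt(2) ** c * S_f)
        numbers[c] = min(max(S_c, 1), P)
    return numbers

# Example: fundamental number 100, two scales on either side, 145 x 145 pixels.
scales = multiscale_superpixel_numbers(S_f=100, C=2, P=145 * 145)
print(scales)
```

With this rule, moving two steps along the scale index doubles or halves the number of superpixels.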
By taking advantage of the multiscale superpixels, the decision fusion strategy can boost the classification accuracy, especially in conflicting situations. Specifically, we fuse the label information of each test pixel predicted at the different superpixel scales. That is, given that the fundamental image is segmented at $2C+1$ scales, there will be $2C+1$ different classification results for an HSI. We then aggregate the results through an effective decision fusion strategy. In this paper, we leverage the majority voting (MV) based decision fusion strategy due to its insensitivity to inaccurate estimates of posterior probabilities:
$$l = \arg\max_{i \in \{1, 2, \ldots, G\}} N(i), \quad N(i) = \sum_{j=1}^{2C+1} \alpha_j \, \mathbb{I}(l_j = i), \tag{6}$$
where $l$ is the class label from one of the $G$ possible classes for the test pixel, $j$ is the classifier index, $l_j$ is the label predicted by the $j$th classifier, $N(i)$ represents the number of times that class $i$ is predicted in the bank of classifiers, and $\mathbb{I}(\cdot)$ denotes the indicator function. In Eq. (6), $\alpha_j$ denotes the voting strength of the $j$th classifier. One possible way of performing adaptive voting is to weigh a classifier’s vote by its confidence score, which can be learned from training data. In this paper, we directly use equal voting strengths, $\alpha_j = \frac{1}{2C+1}$.
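The voting rule can be sketched in a few lines of NumPy. The function name `majority_vote`, the zero-based class labels, and the tie-breaking behavior (the smallest class index wins, as a side effect of `argmax`) are assumptions of this sketch rather than details given above.

```python
import numpy as np

def majority_vote(predictions, num_classes, weights=None):
    """predictions: (num_classifiers, num_pixels) integer labels in [0, num_classes)."""
    predictions = np.asarray(predictions)
    J, n = predictions.shape
    if weights is None:
        weights = np.full(J, 1.0 / J)          # equal voting strength
    votes = np.zeros((num_classes, n))
    for j in range(J):
        # Add classifier j's voting strength to the class it predicts for each pixel.
        votes[predictions[j], np.arange(n)] += weights[j]
    return votes.argmax(axis=0)

# Three classifiers (e.g., three scales) voting on four test pixels.
preds = [[0, 1, 2, 2],
         [0, 1, 1, 2],
         [1, 1, 2, 0]]
fused = majority_vote(preds, num_classes=3)
print(fused)
```

Passing non-uniform `weights` gives the adaptive variant in which more reliable classifiers carry more voting strength.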
Fig. 2 shows the framework of the proposed multiscale SuperPCA method for HSI classification. We first obtain the first principal component of the input HSI. It is then segmented at multiple scales by the ERS algorithm [46] with different superpixel numbers. For each scale, we perform PCA dimensionality reduction on each homogeneous region and combine all regions to form the dimension-reduced HSI. Lastly, we apply SVM classification to each dimension-reduced HSI and fuse the classification results by majority voting to predict the final labels of the testing samples.
III-C Analysis of the Proposed SuperPCA
Remark 1. Through superpixel segmentation, we can obtain different homogeneous regions, in which pixels are more likely to fall in the same class [48, 49, 50]. By dividing the global HSI into small regions, it becomes easier to find the intrinsic projection directions. Fig. 5 shows the ratios between the first and second eigenvalues of PCA (global based) and the proposed SuperPCA on the Indian Pines, University of Pavia, and Salinas Scene HSI datasets (for more detailed information about the datasets and the setting of the number of superpixels $S$, please refer to the experimental section). The larger the ratio, the more representative and discriminative the primary projected features are. By segmenting the HSI into different homogeneous regions, SuperPCA attains a larger ratio than the conventional global PCA method (see the blue and red horizontal lines). It is worth noting that a larger $S$ results in smaller homogeneous regions, each of which has better consistency. However, this does not necessarily lead to better classification performance. This is because, when the homogeneous region (superpixel) is too small, there are few data samples in each superpixel, which may make PCA unstable. From the experimental analysis, it is clear that the divide-and-conquer strategy of unsupervised feature extraction based on SuperPCA can significantly increase the eccentricity in the direction of the first eigenvector. This further corroborates our claim that a homogeneous region based PCA is more effective in preserving the essential data information in a low-dimensional space.
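The eigenvalue-ratio argument can be illustrated on synthetic data. The sketch below is not a reproduction of Fig. 5: it builds two artificial "regions" whose dominant directions differ, and compares the ratio of the two largest covariance eigenvalues within each region against the ratio for the pooled data. All names and numerical settings are illustrative assumptions.

```python
import numpy as np

def eigenvalue_ratio(X):
    """Ratio of the largest to the second-largest covariance eigenvalue of an L x n block."""
    eigvals = np.linalg.eigvalsh(np.cov(X))   # ascending order; rows of X are variables
    return eigvals[-1] / eigvals[-2]

rng = np.random.default_rng(0)
L, n = 5, 500
u1, u2 = np.eye(L)[0], np.eye(L)[1]           # two orthogonal dominant directions

# Each region varies strongly along its own direction, plus weak isotropic noise.
region_a = np.outer(u1, 5 * rng.normal(size=n)) + 0.1 * rng.normal(size=(L, n))
region_b = np.outer(u2, 5 * rng.normal(size=n)) + 0.1 * rng.normal(size=(L, n))
pooled = np.hstack([region_a, region_b])

print(eigenvalue_ratio(region_a), eigenvalue_ratio(region_b), eigenvalue_ratio(pooled))
```

Within each region the first eigenvalue dominates, whereas in the pooled data the two directions compete and the ratio collapses towards one.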
Remark 2. There are currently a number of region-based PCA methods for feature extraction or other related applications. For example, region-based PCA face recognition methods [59, 60, 61] divide the whole face image into small patches and then use PCA to extract local features that cannot be captured by the traditional global face based PCA algorithm; region-based PCA image denoising [62] first divides the whole image into small patches, and then stacks similar noisy patches and applies PCA to exploit the consistent structure among similar patches (thus removing the noise). However, when the regular patch based PCA algorithm is directly applied to hyperspectral images, it cannot fully exploit the rich spatial information contained in HSIs. To this end, we propose a novel region-based PCA through a superpixel segmentation strategy. Table I shows the average overall classification accuracies of three divide-and-conquer strategies¹, namely clustering dependent PCA (ClusterPCA for short), square patch dependent PCA (SquarePCA for short), and the proposed SuperPCA, with different training sample numbers on the Indian Pines dataset. In addition, the global PCA method is used as a baseline for comparison. Without loss of generality, we only conduct experiments on this dataset; similar conclusions can be drawn on the other two datasets.

¹ The differences between these three strategies lie in how they divide the data. In ClusterPCA, all the pixels are clustered by $k$-means, and then PCA is applied to each cluster to obtain the dimensionality-reduced features. SquarePCA directly performs PCA dimensionality reduction on square patches of the HSIs.

Table I: Average overall classification accuracy (%), given as SVM, NN
Noise  T.N.s/C  Global PCA  ClusterPCA  SquarePCA  SuperPCA
None  5  46.37, 46.94  46.37, 46.94  67.32, 65.64  77.34, 75.85
None  10  55.72, 52.06  55.72, 51.83  77.59, 76.89  85.76, 83.79
None  20  62.97, 56.88  62.97, 56.65  84.32, 83.97  92.87, 91.94
None  30  67.27, 59.50  67.27, 59.37  87.36, 87.02  94.62, 93.78
AWGN  5  35.25, 36.17  37.08, 36.09  64.68, 63.49  74.26, 74.20
AWGN  10  38.63, 37.68  39.76, 39.08  75.95, 75.14  82.52, 82.18
AWGN  20  44.40, 39.65  44.40, 41.19  81.71, 81.33  90.42, 89.05
AWGN  30  45.51, 41.13  45.51, 42.04  84.00, 83.86  93.36, 90.84
Table II: Class-specific numbers of labeled samples
Indian Pines class  Number  University of Pavia class  Number  Salinas Scene class  Number
Alfalfa  46  Asphalt  6631  Brocoli_green_weeds_1  2009
Corn-notill  1428  Bare soil  18649  Brocoli_green_weeds_2  3726
Corn-mintill  830  Bitumen  2099  Fallow  1976
Corn  237  Bricks  3064  Fallow_rough_plow  1394
Grass-pasture  483  Gravel  1345  Fallow_smooth  2678
Grass-trees  730  Meadows  5029  Stubble  3959
Grass-pasture-mowed  28  Metal sheets  1330  Celery  3579
Hay-windrowed  478  Shadows  3682  Grapes_untrained  11271
Oats  20  Trees  947  Soil_vinyard_develop  6203
Soybean-notill  972  -  -  Corn_senesced_green_weeds  3278
Soybean-mintill  2455  -  -  Lettuce_romaine_4wk  1068
Soybean-clean  593  -  -  Lettuce_romaine_5wk  1927
Wheat  205  -  -  Lettuce_romaine_6wk  916
Woods  1265  -  -  Lettuce_romaine_7wk  1070
Buildings-Grass-Trees-Drives  386  -  -  Vinyard_untrained  7268
Stone-Steel-Towers  93  -  -  Vinyard_vertical_trellis  1807
Total Number  10249  Total Number  42776  Total Number  54129
It should be noted that we use two different classifiers to conduct the classification, i.e., SVM and nearest neighbor (NN). For each pair of results, the left value is based on the SVM classifier and the right value on the NN classifier. To evaluate the performance, we randomly choose samples from each class to form the training set² and use the rest of the samples for testing. Due to space limitations, we use “T.N.s/C” to denote the number of training samples per class in the table. In comparison to ClusterPCA and SquarePCA, the proposed SuperPCA method is more effective. Global PCA and ClusterPCA have similar results, which indicates that the clustering preprocessing brings no benefit. This is because ClusterPCA does not use the spatial information and treats each pixel as an isolated data sample. In contrast, SquarePCA and SuperPCA leverage the spatial information inside a square patch or a superpixel region, thus leading to better performance. This advantage becomes more obvious in the presence of noise. To demonstrate this, we add additive white Gaussian noise (AWGN) to the original HSIs; please refer to the lower block of Table I. The performance of ClusterPCA drops drastically when noise is added, while the SquarePCA and SuperPCA methods are less affected by the noise. For example, in the noisy case with 30 training samples per class, the classification accuracy of ClusterPCA is less than 50%, while SquarePCA and SuperPCA exceed 80%. Our SuperPCA method even reaches 93.36% (for the SVM classifier) and 90.48% (for the NN classifier). In all cases, our method achieves the best performance.

² At most half of the total samples in the Grass-pasture-mowed and Oats classes, which have relatively small sample sizes, are chosen.
In comparison to the SquarePCA method, which also takes the spatial information into account, our proposed SuperPCA method yields significant performance gains, with an average increase of 8% regardless of the classifier used. We attribute this superiority of SuperPCA over SquarePCA to the fact that pixels in a superpixel are much more likely to belong to the same class than those in a regular patch, so our method can exploit the spatial information more effectively. In summary, Table I clearly demonstrates the robustness of SuperPCA to noise in HSIs for image classification.
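The noise corruption used in this robustness test can be mimicked in a few lines. The sketch below adds zero-mean white Gaussian noise of a chosen variance to a data cube; the helper name `add_awgn` and the variance value are placeholders chosen for illustration and are not the setting behind Table I.

```python
import numpy as np

def add_awgn(cube, variance, rng):
    """Return a copy of the cube with zero-mean Gaussian noise of the given variance added."""
    return cube + rng.normal(scale=np.sqrt(variance), size=cube.shape)

rng = np.random.default_rng(0)
cube = rng.uniform(size=(20, 20, 50))          # toy cube with values in [0, 1)
noisy = add_awgn(cube, variance=0.01, rng=rng) # hypothetical noise level
```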
IV Experimental Results and Analysis
In this section, we first introduce the three HSI datasets used in our experiments. Then, we assess the impact of the number of superpixels and the reduced dimension on the classification performance of SuperPCA. Finally, comparison results with state-of-the-art dimensionality reduction approaches are presented.
IV-A Datasets and Experimental Procedure
In order to evaluate the proposed SuperPCA method, we use three publicly available HSI datasets³.

³ http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes

The first HSI dataset is Indian Pines, which was acquired by the AVIRIS sensor in June 1992. The scene has 145 × 145 pixels and 220 bands in the 0.4 to 2.45 μm region and covers agricultural fields with regular geometry. In this paper, 20 low-SNR bands are removed and a total of 200 bands are used for classification. The scene contains 16 different land covers, and 10249 labeled pixels are available from the ground truth map.

The second HSI dataset is the University of Pavia, which has a spatial coverage of 610 × 340 pixels and was collected by the ROSIS sensor under the HySens project managed by DLR (the German Aerospace Agency). The sensor generates 115 spectral bands, of which 12 noisy and water absorption bands are removed. The data have a spectral coverage from 0.43 to 0.86 μm and a spatial resolution of 1.3 m. The ground truth map provides 42776 labeled pixels in nine classes.

The third HSI dataset is the Salinas Scene, collected by the 224-band AVIRIS sensor over Salinas Valley, CA, USA. The image has 512 × 217 pixels with a spatial resolution of 3.7 m and covers the 0.4 to 2.5 μm region; 20 water absorption bands are removed before classification, leaving 204 bands. In this image, there are 54129 labeled pixels in 16 classes sampled from the ground truth map.
For the three datasets, the training and testing samples are randomly selected from the available ground truth maps. The class-specific numbers of labeled samples are shown in Table II. To evaluate the performance of the proposed SuperPCA algorithm, we randomly choose a given number of samples from each class to build the training set and use the remaining samples as the testing set. For some classes with only a few labeled samples, e.g., Grass-pasture-mowed and Oats in the Indian Pines image, we select at most half of their samples. To avoid any bias, all the experiments are repeated 10 times, and we report the average classification accuracy.
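One way to implement this sampling protocol is sketched below. It applies the cap of half a class to every class, which is one reading of the rule stated above and only changes the outcome for classes smaller than twice the requested number; the function name `split_per_class` and the toy label vector are hypothetical.

```python
import numpy as np

def split_per_class(labels, T, rng):
    """labels: 1-D array of class ids for the labeled pixels. Returns (train_idx, test_idx)."""
    train = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        n_train = min(T, len(idx) // 2)        # at most half of a class goes to training
        train.extend(rng.choice(idx, size=n_train, replace=False))
    train = np.sort(np.array(train, dtype=int))
    test = np.setdiff1d(np.arange(len(labels)), train)
    return train, test

# Toy example: a large class (100 pixels) and a small class (20 pixels, like Oats).
labels = np.array([0] * 100 + [1] * 20)
train_idx, test_idx = split_per_class(labels, T=30, rng=np.random.default_rng(0))
```

Repeating the split with different seeds and averaging the resulting accuracies corresponds to the repeated-runs protocol described above.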
We compare the proposed methods with two baseline methods (the raw spectral features and the PCA method), as well as with state-of-the-art dimensionality reduction approaches, including five unsupervised feature extraction methods (PCA [13], ICA [18], LPP [25], NPE [24] and LPNPE [26]) and two supervised feature extraction methods (LDA [14] and LFDA [30]). Similar to many previous representative works [31, 63, 64], three measurements, overall accuracy (OA), average accuracy (AA) and Kappa, are used to evaluate the performance of the different dimensionality reduction algorithms for HSI classification. Similar to [25], all the above-mentioned feature extraction methods are performed on data filtered with a 5 × 5 weighted mean filter, and the SVM classifier is then applied to the extracted features. It is worth noting that all the comparison methods go through the same two steps: first, they extract the features of the input HSI in an unsupervised or supervised manner, and then the supervised SVM classifier is adopted to test their classification performance.
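For reference, the three measurements can be computed from a confusion matrix as in the following sketch. The function name is illustrative, class labels are assumed to be zero-based integers, and the average accuracy assumes that every class has at least one true sample.

```python
import numpy as np

def classification_scores(y_true, y_pred, num_classes):
    """Return (OA, AA, kappa) computed from the confusion matrix."""
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                          # rows: true class, columns: predicted class
    total = cm.sum()
    oa = np.trace(cm) / total                  # fraction of correctly labeled samples
    aa = np.mean(np.diag(cm) / cm.sum(axis=1)) # mean of the per-class accuracies
    # Agreement expected by chance from the row and column marginals.
    expected = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1]
oa, aa, kappa = classification_scores(y_true, y_pred, num_classes=2)
```

In this toy case five of six samples are correct, the per-class accuracies are 0.75 and 1.0, and the chance agreement is 0.5.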
IV-B Parameter Tuning
In this subsection, we investigate the influence of (i) the number of superpixels in the SuperPCA approach and (ii) the number of scales of the proposed multiscale SuperPCA, i.e., the value of the power exponent in Eq. (5), on the performance of the proposed method. Fig. 6 illustrates the OA of SuperPCA as a function of the number of superpixels, $S$, whose value is chosen from {1, 3, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 300}. From the parameter tuning results, we can draw at least the following two conclusions:

As the number of superpixels increases, the overall performance first ascends and then descends. Too large or too small a number of superpixels reduces the performance of the proposed SuperPCA method. This is mainly because too many superpixels result in oversegmented regions that cannot make full use of all the samples belonging to a homogeneous area, while too few superpixels result in undersegmentation and mix samples from different homogeneous areas. In addition, when the number of superpixels is too large, each region contains only a limited number of pixels, which cannot guarantee the stability and reliability of the PCA results, i.e., too few samples are not enough to recover the true projection directions.

By setting a proper value of the number of superpixels, the performance is always better than when the number of superpixels is set to 1 (which reduces to the case of traditional global PCA method). It is evident that the proposed SuperPCA, which takes the spatial homogeneity of HSIs into account, is much more effective than the traditional PCA for capturing the intrinsic data structure.
Class Names  c = -5  c = -4  c = -3  c = -2  c = -1  c = 0  c = 1  c = 2  c = 3  c = 4  c = 5
Alfalfa  97.83  96.96  96.52  96.96  100  100  100  99.13  99.13  99.13  99.13
Corn-notill  83.13  84.48  87.83  87.03  91.12  89.67  88.04  82.85  77.52  72.38  60.67
Corn-mintill  77.95  82.23  87.80  86.23  86.18  92.44  88.88  87.63  85.29  81.90  78.30
Corn  91.64  94.93  92.56  93.43  94.01  95.51  96.28  97.63  96.23  91.84  88.41
Grass-pasture  94.92  96.36  95.92  96.42  97.00  96.56  95.74  95.65  94.02  90.77  89.89
Grass-trees  98.47  99.61  98.49  98.57  98.47  97.93  96.99  94.93  92.51  82.19  78.50
Grass-pasture-mowed  95.71  94.29  89.29  90.00  97.14  97.14  97.14  97.14  97.14  97.14  97.14
Hay-windrowed  99.98  100  99.80  99.80  100  99.64  99.64  99.64  99.64  96.38  93.62
Oats  94.00  99.00  99.00  98.00  100  100  100  100  100  100  100
Soybean-notill  84.71  85.12  91.09  91.40  91.19  90.69  90.67  90.10  85.22  77.14  69.72
Soybean-mintill  90.76  93.30  91.18  93.97  94.72  94.48  94.08  96.02  96.51  89.61  88.34
Soybean-clean  84.74  90.50  91.08  89.40  90.48  92.97  92.13  94.65  93.36  89.17  79.98
Wheat  99.20  99.31  99.43  99.43  99.43  99.43  99.43  99.43  98.46  98.29  96.46
Woods  98.85  98.85  98.84  98.94  98.76  98.89  95.33  91.21  83.65  83.31  72.69
Buildings-Grass-Trees-Drives  97.61  98.06  98.43  98.62  97.33  98.65  98.60  98.26  97.36  94.58  90.73
Stone-Steel-Towers  97.14  97.30  96.98  97.14  98.41  98.41  98.89  99.52  99.52  98.57  98.57
OA[%]  90.37  92.15  93.01  93.45  94.26  94.62  93.41  92.49  89.84  84.83  79.30
AA[%]  92.92  94.39  94.64  94.71  95.89  96.40  95.74  95.24  93.47  90.15  86.38
Kappa  0.8899  0.9101  0.9200  0.9251  0.9342  0.9383  0.9243  0.9133  0.8821  0.8241  0.7579
Class Names  c = -5  c = -4  c = -3  c = -2  c = -1  c = 0  c = 1  c = 2  c = 3  c = 4  c = 5
Asphalt  71.68  71.99  78.32  79.10  78.06  81.40  86.80  83.46  79.01  82.16  77.30 
Bare soil  74.67  93.97  86.32  92.12  91.22  94.41  92.63  88.63  84.32  83.76  77.13 
Bitumen  81.58  89.98  88.89  89.96  94.40  97.09  96.02  95.22  89.93  93.33  92.35 
Bricks  90.89  92.75  92.90  86.34  89.85  86.21  82.88  78.60  78.70  75.80  70.59 
Gravel  97.89  97.59  97.53  97.00  97.47  96.65  97.05  96.05  91.58  91.59  90.31 
Meadows  69.40  89.11  86.95  90.86  91.02  92.23  89.96  90.13  84.78  83.20  82.11 
Metal sheets  90.78  89.50  94.69  90.76  94.20  94.55  95.66  95.68  95.98  97.45  96.94 
Shadows  73.50  81.07  86.08  88.11  87.58  88.16  92.44  95.52  91.62  91.64  86.66 
Trees  99.97  99.44  99.64  99.65  97.56  98.53  98.85  97.47  97.72  97.26  97.04 
OA[%]  76.74  88.69  86.61  89.36  89.32  91.30  91.23  88.83  84.92  84.97  80.28 
AA[%]  83.37  89.49  90.15  90.43  91.26  92.14  92.48  91.20  88.18  88.47  85.60 
Kappa  0.7030  0.8516  0.8265  0.8608  0.8604  0.8856  0.8851  0.8547  0.8057  0.8064  0.7485 
Class Names  -5  -4  -3  -2  -1  0  +1  +2  +3  +4  +5
Brocoli_green_weeds_1  100  100  100  100  100  100  100  100  100  98.74  93.75 
Brocoli_green_weeds_2  99.99  99.95  99.8  99.96  99.85  99.78  99.73  97.9  96.96  91.57  82.39 
Fallow  100  100  99.97  99.23  98.21  99.67  98.32  99.22  99.54  96.78  96.03 
Fallow_rough_plow  99.07  99.02  99.05  99.12  98.91  99.16  99.32  99.27  96.74  95.50  93.20 
Fallow_smooth  99.45  99.46  99.44  99.38  98.69  99.37  98.62  98.70  96.72  95.32  89.13 
Stubble  99.88  99.87  99.76  99.82  99.79  98.37  98.33  98.68  94.83  87.45  80.62 
Celery  99.91  99.55  98.06  98.19  98.11  97.78  98.00  97.81  96.73  94.65  81.93 
Grapes_untrained  96.63  93.66  94.87  96.62  96.52  99.39  99.48  98.13  98.93  94.74  97.02 
Soil_vinyard_develop  99.25  99.39  99.54  99.58  99.51  99.02  99.57  95.11  89.46  81.65  68.6 
Corn_senesced_green_weeds  92.55  96.09  97.25  97.08  96.59  97.16  94.35  94.59  89.89  83.56  79.72 
Lettuce_romaine_4wk  98.01  98.30  98.05  98.14  98.61  98.38  98.42  98.72  98.78  95.31  92.36 
Lettuce_romaine_5wk  99.28  99.35  99.27  99.67  98.34  99.80  99.51  99.10  97.53  96.85  91.75 
Lettuce_romaine_6wk  98.21  98.21  98.21  98.19  98.23  98.28  98.09  97.99  97.79  97.81  97.14 
Lettuce_romaine_7wk  95.41  97.83  97.74  98.06  98.25  97.95  98.01  96.29  94.92  93.65  92.57 
Vinyard_untrained  91.17  97.46  97.11  95.96  96.12  99.05  97.43  94.58  83.91  71.9  60.08 
Vinyard_vertical_trellis  99.18  99.33  98.98  99.18  98.92  98.99  98.66  97.97  91.54  87.30  85.17 
OA[%]  97.29  97.78  97.93  98.16  97.99  98.97  98.57  97.26  94.19  88.85  83.12 
AA[%]  98.00  98.59  98.57  98.64  98.42  98.89  98.49  97.75  95.27  91.42  86.34 
Kappa  0.9698  0.9753  0.9770  0.9795  0.9776  0.9886  0.9841  0.9694  0.9349  0.8746  0.8088 
Datasets  T.N.s/C  Raw  PCA  ICA  LPP  NPE  LPNPE  LDA  LFDA  SuperPCA  MSuperPCA
Indian Pines  5  44.88  46.37  45.21  53.58  53.68  67.25  59.95  59.62  77.34  78.68
Indian Pines  10  55.77  55.72  57.12  70.41  70.49  76.45  69.30  64.91  85.76  87.12
Indian Pines  20  63.81  62.97  64.41  80.26  79.87  83.51  76.56  74.01  93.90  95.69
Indian Pines  30  68.77  67.27  68.92  84.43  83.98  90.10  89.51  90.19  94.62  96.78
Indian Pines  200  84.01  84.40  82.86  94.31  94.16  97.80  98.55  99.15  97.13  98.25
University of Pavia  5  64.59  65.26  66.58  70.86  68.35  76.12  72.43  74.67  74.39  78.49
University of Pavia  10  70.22  70.15  71.39  81.29  80.63  82.55  81.24  78.95  83.42  91.67
University of Pavia  20  75.85  75.91  76.65  86.00  85.69  88.56  85.00  86.98  89.38  95.37
University of Pavia  30  76.45  76.31  76.87  86.90  87.19  90.56  87.91  90.19  91.30  95.68
University of Pavia  200  85.71  85.70  85.79  94.08  93.69  97.50  95.72  98.72  96.99  98.84
Salinas Scene  5  81.79  81.87  81.75  85.23  84.86  92.09  89.03  88.83  94.42  95.00
Salinas Scene  10  85.24  85.28  85.74  88.60  88.99  94.52  91.46  82.77  96.78  98.15
Salinas Scene  20  87.85  87.79  88.08  90.61  90.69  95.89  93.72  93.56  98.37  99.04
Salinas Scene  30  88.93  89.24  89.28  91.73  91.69  96.66  95.87  95.89  98.97  99.27
Salinas Scene  200  91.48  91.94  91.74  96.18  95.88  99.09  99.87  99.57  99.63  99.70
Based on the above experiments, the optimal fundamental superpixel numbers for Indian Pines, University of Pavia, and Salinas Scene are set to 100, 20, and 100, respectively.
To verify the necessity of multiscale fusion, we report the classification results of the proposed SuperPCA approach under different segmentation scales, i.e., with the scale exponent varied from -5 to 5. When the exponent is zero, the input HSI is segmented with the fundamental superpixel number. Tables III, IV, and V tabulate the per-class accuracy, OA, AA, and Kappa coefficient under the different segmentation scales for the Indian Pines, University of Pavia, and Salinas Scene images, respectively, when the training number is 30. The best performance for each class (row) is highlighted in bold typeface. From these tables, we learn that even though SuperPCA obtains the best overall performance when the exponent is zero, i.e., with the fundamental superpixel number, it does not achieve the best performance in every class. For example, as shown in Table III, when the exponent is -4, the OA is clearly inferior to that of the best scale (92.15% versus 94.62%), yet the sixth and eighth classes reach their best classification accuracy at this scale. Similar results can also be observed in Table IV and Table V. For instance, when the exponent is -5 in Table IV, the OA is the lowest of all scales, but the fifth and ninth classes achieve their best classification accuracy. All these results demonstrate that superpixel segmentation at a single scale cannot fully model the complexity and diversity of HSIs. Therefore, decision fusion based on multiscale segmentation is an effective and reliable choice for HSI classification.
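The three summary measures reported in these tables can be computed from a confusion matrix. The following is a minimal sketch that assumes the standard definitions (OA as the fraction of correctly labelled test samples, AA as the mean of the per-class accuracies, and Kappa as Cohen's chance-corrected agreement); the function name and the input layout are illustrative and not taken from the paper.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] counts test samples of true class i predicted as class j.
    Assumes every class has at least one test sample.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all test samples labelled correctly.
    oa = correct / total

    # Average accuracy: mean of the per-class accuracies (recalls).
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(per_class) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(n_classes))
               for j in range(n_classes)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / (total * total)
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


if __name__ == "__main__":
    oa, aa, kappa = classification_metrics([[8, 2], [1, 9]])
    print(f"OA={oa:.4f}  AA={aa:.4f}  Kappa={kappa:.4f}")
```

Because AA weights every class equally, it can differ noticeably from OA when class sizes are unbalanced, which is why the tables list both.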
To further verify the usefulness of the multiscale segmentation strategy and to assess the influence of the scale number, Fig. 7 shows the OA of the proposed multiscale SuperPCA method as a function of the scale number for the Indian Pines, University of Pavia, and Salinas Scene images. By fusing the classification results obtained at multiple segmentation scales, we can expect better results than with the single-scale SuperPCA method, i.e., with the scale number set to 0. When the scale number is set to 4, 6, and 4 for the Indian Pines, University of Pavia, and Salinas Scene images, the improvements of the multiscale SuperPCA method over the single-scale SuperPCA method are 0.65%, 4.38%, and 0.30%, respectively. From these results, we observe that the improvement on the University of Pavia image is more pronounced than on the other two images. There are two main reasons for this. On the one hand, the single-scale SuperPCA does not perform very well on this image and therefore leaves relatively large room for improvement. On the other hand, the University of Pavia image has richer and more complex texture information, and it is much more difficult for a single-scale segmentation method to capture this useful spatial knowledge. We also observe that, to achieve high classification accuracy, relatively complex HSIs may require a larger scale number to exploit their spatial information, e.g., the optimal scale number of the University of Pavia image is 6, which is larger than that of the other two HSI datasets.
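The fusion step can be illustrated with a short sketch. It assumes that the per-scale classification results are combined by pixel-wise majority voting; the fusion rule, the function name, and the tie-breaking choice are assumptions made for illustration and are not confirmed by this section.

```python
from collections import Counter


def fuse_by_majority_vote(label_maps):
    """Fuse per-scale label predictions pixel by pixel.

    label_maps: list of equally long lists; label_maps[k][p] is the class
    predicted for pixel p at the k-th segmentation scale.
    """
    if not label_maps:
        raise ValueError("need at least one scale")
    n_pixels = len(label_maps[0])
    if any(len(m) != n_pixels for m in label_maps):
        raise ValueError("all label maps must cover the same pixels")

    fused = []
    for votes in zip(*label_maps):
        counts = Counter(votes)
        best = max(counts.values())
        # Ties go to the tied class that appears first in the scale order.
        winner = next(v for v in votes if counts[v] == best)
        fused.append(winner)
    return fused


if __name__ == "__main__":
    scales = [[1, 2, 3], [1, 2, 2], [0, 2, 3]]
    print(fuse_by_majority_vote(scales))
```

A pixel that is misclassified at one scale can still be recovered when the other scales agree on the correct label, which matches the observation that different classes prefer different scales.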
IV-C Comparison Results with State-of-the-Art Methods
The classification maps obtained on the three above-mentioned public HSI datasets by the proposed SuperPCA and MSuperPCA approaches and the comparison methods are given in Fig. 8, Fig. 9, and Fig. 10. Here, we only show the results when the number of training samples is set to 30. From these maps, we can see that the raw spectral features, PCA [13], ICA [18], LPP [25], and NPE [24] exhibit higher classification errors than the other methods. Among the unsupervised comparison methods, LPNPE [26] achieves the best performance because its local spatial-spectral scatter extracts spatial information effectively. By taking advantage of the discriminative information of the labeled samples, the supervised methods (LDA [14] and LFDA [30]) can produce very good results. On the Indian Pines and University of Pavia datasets, the proposed SuperPCA and MSuperPCA are clearly better than the previous methods. When comparing the classification maps of our methods with those of LDA [14] and LFDA [30] on the University of Pavia dataset, it can be observed that our methods achieve much better results for large regions (e.g., Bare soil and Meadows), which can be attributed to the efficient segmentation. Through the fusion of multiscale segmentation based classification results, MSuperPCA improves on the single-scale SuperPCA and obtains more accurate classification maps (please refer to the edges and the holes of regions).
Following the experimental settings of the LPNPE method [26], which to the best of our knowledge is the best unsupervised feature extraction method for HSI classification, we further randomly choose 5, 10, 20, and 30 samples from each class to form the training set and compare the methods in terms of OA. Table VI tabulates the OA of the different approaches. Among the individual methods, the OA of the proposed SuperPCA is better than that of the others in most instances, and this advantage is particularly evident when the number of training samples is small. As the training number increases, the performance of the supervised methods improves markedly, because they can draw more discriminative information from the larger set of labeled training samples.
To further utilize the spatial information of HSIs, MSuperPCA fuses the decisions of SuperPCA obtained at different segmentation scales. Comparing the last two columns of Table VI, we can clearly see that MSuperPCA outperforms SuperPCA in all cases, regardless of the dataset or the number of training samples. In particular, the improvement of MSuperPCA over SuperPCA is most pronounced on the University of Pavia dataset, where the accuracy gain exceeds 4% for 5 to 30 training samples per class. This is because of the rich texture information contained in that dataset.
The above results show that our proposed method achieves good performance when the number of training samples is small. In the last row of each block of Table VI, we additionally provide the results when the number of training samples is relatively large, e.g., 200 per class. In this situation, the supervised methods (LDA [14] and LFDA [30]) can learn more discriminative information from the labeled training data for classification. Therefore, both of them improve considerably compared with the situation in which the training samples are limited. Nevertheless, the results of SuperPCA, which does not use any label information, are still very competitive in this situation. By fusing multiscale classification results, MSuperPCA even surpasses both LDA [14] and LFDA [30] on the University of Pavia dataset and LFDA [30] on the Salinas dataset. This again supports the effectiveness of the proposed method.
IV-D Running Times
In Table VII, we report the run times for extracting the dimensionality-reduced features with the different algorithms on the Indian Pines, University of Pavia, and Salinas Scene images for different training numbers per class (5, 10, 20, and 30). For the proposed method, we report the whole running time, including the segmentation and the dimensionality reduction of all superpixels. All methods were tested in MATLAB R2014a on a Windows PC with a 3.50 GHz Intel Xeon CPU and 16 GB of memory, and the run time of every method was measured using a single-threaded MATLAB process. Note that for the unsupervised methods, i.e., PCA [13], ICA [18], LPP [25], NPE [24], LPNPE [26], and our proposed SuperPCA, the running time does not change with the training number per class. PCA, LDA [14], and LFDA [30] are the fastest. LPP [25] and NPE [24] need to construct a large similarity graph and decompose it via SVD, so their computational complexity is relatively high. As the training number increases, the run times of the supervised methods (LDA [14] and LFDA [30]) generally increase as well. The timings reveal that although our method requires presegmentation and feature extraction for each region, its run time is still acceptable. Moreover, because the dimensionality reduction of each superpixel is independent of the others, the algorithm can be accelerated simply by parallel computation.
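The remark about parallel computation can be sketched as follows. This is an illustration in Python rather than the MATLAB setup used for the timings; `reduce_superpixel` is a hypothetical stand-in that only mean-centres each band, and the thread pool is one possible way to dispatch independent regions.

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_superpixel(pixels):
    """Hypothetical stand-in for the per-superpixel reduction step.

    It only mean-centres each band so that the example stays short;
    a real implementation would run PCA on the region here.
    """
    n = len(pixels)
    n_bands = len(pixels[0])
    means = [sum(p[b] for p in pixels) / n for b in range(n_bands)]
    return [[p[b] - means[b] for b in range(n_bands)] for p in pixels]


def reduce_all(superpixels, max_workers=4):
    """Apply reduce_superpixel to every superpixel independently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so result k belongs to superpixel k.
        return list(pool.map(reduce_superpixel, superpixels))


if __name__ == "__main__":
    regions = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 5.0]]]
    print(reduce_all(regions))
```

No region reads or writes data that belongs to another region, so no locking is needed. For CPU-bound pure-Python work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and is the usual alternative to threads.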
Dataset  T.N.s/C  PCA  ICA  LPP  NPE  LPNPE  LDA  LFDA  SuperPCA
Indian Pines  5  0.0076  2.5417  0.2428  0.7018  0.3834  0.0027  0.0167  0.6879
Indian Pines  10  0.0076  2.5417  0.2428  0.7018  0.3834  0.0047  0.0249  0.6879
Indian Pines  20  0.0076  2.5417  0.2428  0.7018  0.3834  0.0054  0.0388  0.6879
Indian Pines  30  0.0076  2.5417  0.2428  0.2428  0.3834  0.0057  0.0356  0.6879
University of Pavia  5  0.4004  3.1445  2.9477  6.5322  1.2357  0.0017  0.0053  2.8867
University of Pavia  10  0.4004  3.1445  2.9477  6.5322  1.2357  0.0022  0.0110  2.8867
University of Pavia  20  0.4004  3.1445  2.9477  6.5322  1.2357  0.0024  0.0097  2.8867
University of Pavia  30  0.4004  3.1445  2.9477  6.5322  1.2357  0.0024  0.0095  2.8867
Salinas Scene  5  0.4145  5.8702  4.7492  9.8365  1.2003  0.0027  0.0174  2.7452
Salinas Scene  10  0.4145  5.8702  4.7492  9.8365  1.2003  0.0046  0.0260  2.7452
Salinas Scene  20  0.4145  5.8702  4.7492  9.8365  1.2003  0.0066  0.0377  2.7452
Salinas Scene  30  0.4145  5.8702  4.7492  9.8365  1.2003  0.0067  0.0391  2.7452
IV-E Discussions
Since the key idea of the proposed methods is to over-segment the HSIs and perform PCA superpixel-wise, how to determine the parameter of the superpixel segmentation model (i.e., the superpixel number in ERS) and the segmentation scales is a crucial and open problem. In this paper, we set them experimentally to achieve the best performance. In fact, the segmentation scales and the superpixel number jointly determine the sizes of the minimum and maximum homogeneous regions, which can be deduced from Eq. (5). The search for the optimal segmentation scales and superpixel number can therefore be converted into the problem of setting the sizes of the minimum and maximum homogeneous regions of the given HSI. The size of a homogeneous region is determined by the texture information, so the most direct approach is to detect the edges in a given image with an edge detector, such as Canny or Sobel. From the detected edges, we can obtain the texture ratio, which can be used to define the size of the homogeneous regions.
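As a rough illustration of this idea, the sketch below computes an edge-based texture ratio with a Sobel operator. It assumes that the texture ratio is the fraction of pixels whose gradient magnitude exceeds a threshold; this definition, the function name, and the threshold parameter are assumptions made for illustration, not the authors' procedure.

```python
def sobel_edge_ratio(image, threshold):
    """Fraction of interior pixels whose Sobel gradient magnitude
    exceeds threshold.

    image: 2-D list of numbers, e.g. a single band of the HSI.
    Border pixels are skipped because the 3x3 kernels do not fit there.
    """
    rows, cols = len(image), len(image[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    edges = 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    v = image[r + dr - 1][c + dc - 1]
                    gx += kx[dr][dc] * v
                    gy += ky[dr][dc] * v
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges += 1
    return edges / ((rows - 2) * (cols - 2))


if __name__ == "__main__":
    step = [[0, 0, 0, 0, 1, 1] for _ in range(4)]
    print(sobel_edge_ratio(step, threshold=1.0))
```

Under this assumed definition, a higher ratio indicates a more textured scene, which would suggest smaller homogeneous regions and hence a larger superpixel number.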
V Conclusions
In this paper, we propose a simple but very effective technique for unsupervised feature extraction of hyperspectral imagery based on superpixel-wise principal component analysis (SuperPCA). Segmenting the entire hyperspectral image (HSI) into many homogeneous regions with similar reflectance properties facilitates the dimensionality reduction process of finding the essential low-dimensional feature space of HSIs. To take full advantage of the spatial information contained in the HSIs, which cannot be extracted at a single scale, we further advocate a decision fusion strategy through multiscale segmentation based on the SuperPCA model (MSuperPCA). Extensive experiments on three standard HSI datasets demonstrate that the proposed SuperPCA and MSuperPCA algorithms outperform existing state-of-the-art feature extraction methods, both unsupervised and supervised, especially when the training samples are limited. When the number of training samples is relatively large, the proposed algorithms still obtain very competitive classification results compared with the supervised feature extraction methods. Because it inherits the merits of PCA, the proposed SuperPCA can also be used as a preprocessing step for many hyperspectral image processing and analysis tasks.
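The core idea summarized above, running PCA separately inside each homogeneous region, can be sketched in a few lines. The example below is a deliberate simplification: it keeps only the first principal component per superpixel and finds it by power iteration, whereas a full implementation would keep several components and use a proper eigen-solver. All function names and the input layout are illustrative assumptions.

```python
import math


def first_component(rows, n_iter=200):
    """Leading principal axis of mean-centred rows via power iteration."""
    n_bands = len(rows[0])
    # Scatter matrix; scaling by 1/n would not change its eigenvectors.
    cov = [[sum(r[i] * r[j] for r in rows) for j in range(n_bands)]
           for i in range(n_bands)]
    # Non-uniform start vector to reduce the chance of starting
    # orthogonal to the leading eigenvector.
    v = [1.0 / (k + 1) for k in range(n_bands)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_bands))
             for i in range(n_bands)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            # No variance along v (e.g. a constant region): keep current v.
            return v
        v = [x / norm for x in w]
    return v


def superpixelwise_pca_1d(pixels, labels):
    """Project each pixel onto the first principal axis of its own superpixel.

    pixels: list of spectral vectors; labels: superpixel id of each pixel.
    Returns one score per pixel, in the original pixel order.
    """
    n_bands = len(pixels[0])
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)

    scores = [0.0] * len(pixels)
    for idxs in groups.values():
        # Centre the region with its own mean, not the global mean.
        mean = [sum(pixels[i][b] for i in idxs) / len(idxs)
                for b in range(n_bands)]
        centred = [[pixels[i][b] - mean[b] for b in range(n_bands)]
                   for i in idxs]
        axis = first_component(centred)
        for i, row in zip(idxs, centred):
            scores[i] = sum(a * x for a, x in zip(axis, row))
    return scores


if __name__ == "__main__":
    px = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [0.0, 0.0], [2.0, 0.0]]
    print(superpixelwise_pca_1d(px, [0, 0, 0, 1, 1]))
```

Each region gets its own mean and its own projection axis, so the spectral variation inside one region does not influence the projection learned for another.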
References
 [1] M. Fauvel, Y. Tarabalka, J. A. Benediktsson, J. Chanussot, and J. C. Tilton, “Advances in spectral-spatial classification of hyperspectral images,” Proceedings of the IEEE, vol. 101, no. 3, pp. 652–675, 2013.
 [2] L. Zhang, L. Zhang, and B. Du, “Deep learning for remote sensing data: A technical tutorial on the state of the art,” IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.
 [3] J. Ma, J. Jiang, H. Zhou, J. Zhao, and X. Guo, “Guided locality preserving feature matching for remote sensing image registration,” IEEE Trans. Geosci. Remote Sens., 2018, DOI: 10.1109/TGRS.2018.2820040.
 [4] F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machines,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, 2004.
 [5] Q. Du and H. Yang, “Similarity-based unsupervised band selection for hyperspectral image analysis,” IEEE Geosci. Remote Sens. Lett., vol. 5, no. 4, pp. 564–568, 2008.
 [6] H. Yang, Q. Du, and G. Chen, “Unsupervised hyperspectral band selection using graphics processing units,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 3, pp. 660–668, 2011.
 [7] Q. Wang, J. Lin, and Y. Yuan, “Salient band selection for hyperspectral image classification via manifold ranking,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 6, pp. 1279–1289, June 2016.
 [8] S. Jia, G. Tang, J. Zhu, and Q. Li, “A novel ranking-based clustering approach for hyperspectral band selection,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 1, pp. 88–102, 2016.
 [9] L. M. Bruce, C. H. Koger, and J. Li, “Dimensionality reduction of hyperspectral data using discrete wavelet transform feature extraction,” IEEE Trans. Geosci. Remote Sens., vol. 40, no. 10, pp. 2331–2338, 2002.
 [10] W. Zhao and S. Du, “Spectral–spatial feature extraction for hyperspectral image classification: A dimension reduction and deep learning approach,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 8, pp. 4544–4554, 2016.
 [11] W. Sun, G. Yang, B. Du, L. Zhang, and L. Zhang, “A sparse and low-rank near-isometric linear embedding method for feature extraction in hyperspectral imagery classification,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 7, pp. 4032–4046, 2017.
 [12] L. Zhang, Q. Zhang, B. Du, X. Huang, Y. Y. Tang, and D. Tao, “Simultaneous spectral-spatial feature selection and extraction for hyperspectral images,” IEEE Trans. Cyber., 2016.
 [13] B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Computation, vol. 10, no. 5, pp. 1299–1319, 1998.
 [14] S. Prasad and L. M. Bruce, “Limitations of principal components analysis for hyperspectral target recognition,” IEEE Geosci. Remote Sens. Lett., vol. 5, no. 4, pp. 625–629, 2008.
 [15] M. A. Hossain, M. Pickering, and X. Jia, “Unsupervised feature extraction based on a mutual information measure for hyperspectral image classification,” in IGARSS. IEEE, 2011, pp. 1720–1723.
 [16] V. Laparra, J. Malo, and G. Camps-Valls, “Dimensionality reduction via regression in hyperspectral imagery,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 6, pp. 1026–1036, Sept 2015.
 [17] J. Ma, Y. Ma, and C. Li, “Infrared and visible image fusion methods and applications: A survey,” Inf. Fusion, vol. 45, pp. 153–178, 2019.
 [18] J. Wang and C.-I. Chang, “Independent component analysis-based dimensionality reduction with applications in hyperspectral image analysis,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 6, pp. 1586–1600, 2006.
 [19] S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, 2000.

 [20] J. Ma, J. Jiang, C. Liu, and Y. Li, “Feature guided Gaussian mixture model with semi-supervised EM and local geometric constraint for retinal image registration,” Inf. Sci., vol. 417, pp. 128–142, 2017.
 [21] D. Lunga, S. Prasad, M. M. Crawford, and O. Ersoy, “Manifold-learning-based feature extraction for classification of hyperspectral data: A review of advances in manifold learning,” IEEE Signal Processing Magazine, vol. 31, no. 1, pp. 55–66, 2014.
 [22] C. M. Bachmann, T. L. Ainsworth, and R. A. Fusina, “Exploiting manifold geometry in hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 441–454, 2005.
 [23] L. Ma, M. M. Crawford, and J. Tian, “Anomaly detection for hyperspectral images based on robust locally linear embedding,” Journal of Infrared, Millimeter, and Terahertz Waves, vol. 31, no. 6, pp. 753–762, 2010.
 [24] X. He, D. Cai, S. Yan, and H.-J. Zhang, “Neighborhood preserving embedding,” in ICCV, vol. 2. IEEE, 2005, pp. 1208–1213.
 [25] X. He and P. Niyogi, “Locality preserving projections,” in Advances in neural information processing systems, 2004, pp. 153–160.
 [26] Y. Zhou, J. Peng, and C. P. Chen, “Dimension reduction using spatial and spectral regularized local discriminant embedding for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 2, pp. 1082–1095, 2015.
 [27] L. Xu, A. Wong, F. Li, and D. A. Clausi, “Intrinsic representation of hyperspectral imagery for unsupervised feature extraction,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 2, pp. 1118–1130, 2016.
 [28] V. Slavkovikj, S. Verstockt, W. D. Neve, S. V. Hoecke, and R. V. D. Walle, “Unsupervised spectral subfeature learning for hyperspectral image classification,” International Journal of Remote Sensing, vol. 37, no. 2, pp. 309–326, 2016.
 [29] W. Wei, Y. Zhang, and C. Tian, “Latent subclass learning-based unsupervised ensemble feature extraction method for hyperspectral image classification,” Remote Sensing Letters, vol. 6, no. 4, pp. 257–266, 2015.
 [30] W. Li, S. Prasad, J. E. Fowler, and L. M. Bruce, “Locality-preserving dimensionality reduction and classification for hyperspectral image analysis,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 4, pp. 1185–1198, 2012.
 [31] Y. Tarabalka, J. A. Benediktsson, and J. Chanussot, “Spectral–spatial classification of hyperspectral imagery based on partitional clustering techniques,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 8, pp. 2973–2987, 2009.
 [32] L. Fang, S. Li, X. Kang, and J. A. Benediktsson, “Spectral–spatial hyperspectral image classification via multiscale adaptive sparse representation,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 12, pp. 7738–7749, 2014.
 [33] L. Wang, J. Zhang, P. Liu, K.-K. R. Choo, and F. Huang, “Spectral–spatial multi-feature-based deep learning for hyperspectral remote sensing image classification,” Soft Computing, vol. 1, no. 21, pp. 213–221, 2016.
 [34] J. Wen, J. E. Fowler, M. He, Y. Q. Zhao, C. Deng, and V. Menon, “Orthogonal nonnegative matrix factorization combining multiple features for spectral-spatial dimensionality reduction of hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 7, pp. 4272–4286, July 2016.
 [35] H. Pu, Z. Chen, B. Wang, and G. M. Jiang, “A novel spatial-spectral similarity measure for dimensionality reduction and classification of hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 11, pp. 7008–7022, Nov 2014.
 [36] L. Ma, X. Zhang, X. Yu, and D. Luo, “Spatial regularized local manifold learning for classification of hyperspectral images,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 2, pp. 609–624, 2016.
 [37] J. Jiang, C. Chen, Y. Yu, X. Jiang, and J. Ma, “Spatial-aware collaborative representation for hyperspectral remote sensing image classification,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 3, pp. 404–408, 2017.
 [38] N. H. Ly, Q. Du, and J. E. Fowler, “Collaborative graph-based discriminant analysis for hyperspectral imagery,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2688–2696, 2014.
 [39] N. H. Ly, Q. Du, and J. E. Fowler, “Sparse graph-based discriminant analysis for hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 7, pp. 3872–3884, 2014.
 [40] W. Li, J. Liu, and Q. Du, “Sparse and low-rank graph for discriminant analysis of hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 7, pp. 4094–4105, 2016.
 [41] N. R. Pal and S. K. Pal, “A review on image segmentation techniques,” Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
 [42] F. Verdoja and M. Grangetto, “Fast superpixel-based hierarchical approach to image segmentation,” in International Conference on Image Analysis and Processing. Springer, 2015, pp. 364–374.
 [43] Q. Yan, L. Xu, J. Shi, and J. Jia, “Hierarchical saliency detection,” in CVPR, 2013, pp. 1155–1162.
 [44] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, 2000.
 [45] A. Levinshtein, A. Stere, K. N. Kutulakos, D. J. Fleet, S. J. Dickinson, and K. Siddiqi, “Turbopixels: Fast superpixels using geometric flows,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 12, pp. 2290–2297, 2009.
 [46] M.-Y. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa, “Entropy rate superpixel segmentation,” in CVPR. IEEE, 2011, pp. 2097–2104.
 [47] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher, “An analysis of approximations for maximizing submodular set functions-I,” Mathematical Programming, vol. 14, no. 1, pp. 265–294, 1978.
 [48] J. Li, H. Zhang, and L. Zhang, “Efficient superpixel-level multitask joint sparse representation for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 10, pp. 5338–5351, 2015.
 [49] L. Fang, S. Li, X. Kang, and J. A. Benediktsson, “Spectral–spatial classification of hyperspectral images with a superpixel-based discriminative sparse model,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 8, pp. 4186–4201, 2015.
 [50] W. Li, S. Prasad, and J. E. Fowler, “Classification and reconstruction from random projections for hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 2, pp. 833–843, 2013.
 [51] B. Zhang, J. Gu, C. Chen, J. Han, X. Su, X. Cao, and J. Liu, “One-two-one networks for compression artifacts reduction in remote sensing,” ISPRS Journal of Photogrammetry and Remote Sensing, 2018.
 [52] S. Zhang, S. Li, W. Fu, and L. Fang, “Multiscale superpixel-based sparse representation for hyperspectral image classification,” Remote Sensing, vol. 9, no. 2, p. 139, 2017.
 [53] F. Fan, Y. Ma, C. Li, X. Mei, J. Huang, and J. Ma, “Hyperspectral image denoising with superpixel segmentation and low-rank representation,” Information Sciences, vol. 397, pp. 48–68, 2017.
 [54] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk, “SLIC superpixels compared to state-of-the-art superpixel methods,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 11, p. 2274, 2012.
 [55] C. Coburn and A. C. Roberts, “A multiscale texture analysis procedure for improved forest stand classification,” International journal of remote sensing, vol. 25, no. 20, pp. 4287–4308, 2004.
 [56] S. Prasad and L. M. Bruce, “Decision fusion with confidence-based weight assignment for hyperspectral target recognition,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 5, pp. 1448–1456, 2008.
 [57] M. Ding, S. Antani, S. Jaeger, Z. Xue, S. Candemir, M. Kohli, and G. Thoma, “Local-global classifier fusion for screening chest radiographs,” in Medical Imaging 2017: Imaging Informatics for Healthcare, Research, and Applications, vol. 10138. International Society for Optics and Photonics, 2017, p. 101380A.
 [58] J. Ma, C. Chen, C. Li, and J. Huang, “Infrared and visible image fusion via gradient transfer and total variation minimization,” Inf. Fusion, vol. 31, pp. 100–109, 2016.
 [59] T.-X. Jiang, T.-Z. Huang, X.-L. Zhao, and T.-H. Ma, “Patch-based principal component analysis for face recognition,” Computational Intelligence and Neuroscience, vol. 2017, 2017.
 [60] Y. Zhao, X. Shen, N. D. Georganas, and E. M. Petriu, “Part-based PCA for facial feature extraction and classification,” in HAVE. IEEE, 2009, pp. 99–104.
 [61] Y. Gao, J. Ma, and A. L. Yuille, “Semi-supervised sparse representation based classification for face recognition with insufficient labeled samples,” IEEE Trans. Image Process., vol. 26, no. 5, pp. 2545–2560, 2017.
 [62] C.-A. Deledalle, J. Salmon, A. S. Dalalyan et al., “Image denoising with patch based PCA: local versus global,” in BMVC, vol. 81, 2011, pp. 425–455.
 [63] Y. Tarabalka, M. Fauvel, J. Chanussot, and J. A. Benediktsson, “SVM- and MRF-based method for accurate classification of hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 4, pp. 736–740, 2010.
 [64] G. Camps-Valls, L. Gomez-Chova, J. Muñoz-Marí, J. Vila-Francés, and J. Calpe-Maravilla, “Composite kernels for hyperspectral image classification,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 93–97, 2006.