## 1 Introduction

One of the noteworthy remote sensing techniques is hyperspectral imaging. Hyperspectral images provide rich spectral information, because the sensors contain hundreds of spectral channels with higher spectral resolution than multispectral cameras. For example, images generated by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) have 224 bands. One of the problems in hyperspectral data is the presence of mixed pixels [IJRS12_ReviewSpectralUnmixing]. In a scene, pixels containing a single material are called pure pixels; otherwise they are called mixed pixels [Agg16]. Each pixel is composed of a set of materials called endmembers, and the fraction of an endmember in a pixel is called its fractional abundance [Miao07]. Spectral unmixing (SU) methods decompose a reflectance spectrum into a set of endmember spectra and their abundance fractions [Mei11].

The most common mixing model for hyperspectral data is the linear mixing model (LMM), in which the recorded reflectance of a particular pixel is assumed to be a linear combination of its endmembers. In contrast with nonlinear (intimate) models [JARS17_Nonlinear], the LMM features simplicity, acceptable efficiency, and low computational complexity. Because of these features, many works exploit the LMM to solve the unmixing problem. Examples include the structured sparse method [ISPRS14_Structured]; minimum volume simplex analysis (MVSA) [TGRS15_MVSA], a classic unmixing algorithm based on a minimum-volume simplex; improved discrete swarm intelligence [JARS16_Swarm]; correntropy maximization using the alternating direction method of multipliers (ADMM) [TGRS17_Correntropy]; and stacked nonnegative sparse autoencoders [GRSL18_Stacked], a special case of artificial neural networks (ANNs) with the ability to extract deep robust features.

Nonnegative matrix factorization (NMF) [Paatero94, Lee99] is one of the practical methods for spectral unmixing, decomposing the data into two nonnegative matrices. This basic method has been extended by adding constraints, such as minimum volume constrained NMF (MVC-NMF) [Miao07], graph regularized NMF (GNMF) [Rajabi13], NMF with a local smoothness constraint (NMF-LSC) [Yang15], multilayer NMF (MLNMF) [Rajabi15], region-based structured NMF [JSTARS17_RegionBasedStructuredNMF], and an NMF-based framework for hyperspectral unmixing using prior knowledge (NMFupk) [Optical12_NMFupk]. Sparsity is one of the constraints used to improve the performance of the NMF algorithm, applied to the NMF cost function through regularizers [Qian11] or through double reweighted sparse regression and total variation [wang17]. Since the number of endmembers present in each mixed pixel is small compared with the total number of endmembers, the problem is sparse [Irodache10]. The $L_{1/2}$-NMF unmixing algorithm applies an $L_{1/2}$ regularization term to the NMF cost function to enforce the sparsity of endmember abundances [Qian11].

Recently, distributed strategies have gained a lot of interest in many areas [Chen14]. The diffusion least mean squares (LMS) solution has been used to solve distributed problems [Wen13]. In a distributed optimization problem, the network has one of three types of structure: 1) a single-task network, in which all nodes estimate a common unknown optimum vector; 2) a multitask network, in which each node estimates its own optimum vector; and 3) a clustered multitask network, which contains clusters, each with a common optimum vector to be estimated [Chen14]. In a hyperspectral image, a network of pixels can be considered in order to model the SU problem as a distributed one [khosh17, JSTARS19_DistributedUnmixing]. Here, we use a clustered multitask network to solve the SU problem [ICSPIS18_ClusteredMultitask]. With a clustered multitask network, only the neighborhood information of spectrally similar mixed pixels (those in the same cluster) is used in the proposed cost function. To generate the clustered network, we apply fuzzy c-means (FCM) clustering to the spectral features of the hyperspectral data. The unmixing problem is then solved as a clustered multitask network using the information of neighboring nodes and clusters, in addition to a sparsity constraint.

This paper is organized as follows. In Section 2, we introduce the clustered multitask network and the proposed unmixing algorithm. Section 3 includes evaluation criteria, experiments on synthetic and real datasets, and comparison with other methods. Section 4 gives conclusions and future work.
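Since FCM clustering serves only as a preprocessing step here, a compact NumPy implementation is enough to illustrate it. The function and parameter names below are our own (the paper does not specify an implementation), and the fuzzifier $m=2$ is the usual default:

```python
import numpy as np

def fcm(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means on the rows of X (pixels x bands).

    Returns the membership matrix U (rows sum to one) and the cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # random memberships, rows sum to one
    centers = None
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]             # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2)  # pixel-center distances
        inv = np.maximum(d, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)               # membership update
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, centers
```

Each pixel is then assigned to the cluster of its largest membership, `U.argmax(axis=1)`, to form the clustered network.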

## 2 Spectral Unmixing Using Clustered Multitask Network

In this section, a new method that utilizes clustering of pixels and neighborhood information is proposed. First, we express the linear mixing model in subsection 2.1; then the distributed algorithm, cost functions, and optimization procedure are formulated in 2.2. Finally, the overall algorithm to solve the SU problem is presented in 2.3. The proposed method for solving the spectral unmixing problem is summarized in Figure 1.

### 2.1 Linear Mixing Model

To solve the SU problem, we focus on a simple and applicable model named the linear mixing model (LMM). In this model, the observation is a linear combination of the endmembers and their fractional abundances in each pixel. Mathematically, this model for pixel $k$ of the hyperspectral image is described as:

$$\mathbf{y}_k = \mathbf{M}\mathbf{a}_k + \mathbf{n}_k, \qquad k = 1, \dots, N \tag{1}$$

where $\mathbf{y}_k \in \mathbb{R}^{L}$ is the observed data vector, $\mathbf{M} \in \mathbb{R}^{L \times R}$ is the signature matrix, $\mathbf{a}_k \in \mathbb{R}^{R}$ is the fractional abundance vector, and $\mathbf{n}_k \in \mathbb{R}^{L}$ is the additive noise vector of the $k$-th pixel of the image, where $R$, $L$ and $N$ denote the number of endmembers, bands and pixels, respectively.

In the SU problem, the fractional abundance vectors satisfy two constraints in each pixel, the abundance sum-to-one constraint (ASC) and the abundance nonnegativity constraint (ANC) [Ma14], which for $R$ endmembers in a scene are as follows:

$$\sum_{r=1}^{R} a_{rk} = 1 \tag{2}$$

$$a_{rk} \ge 0, \qquad r = 1, \dots, R \tag{3}$$

where $a_{rk}$ is the fractional abundance of the $r$-th endmember in the $k$-th pixel of the image.
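As a concrete illustration of the model (1)-(3), the following sketch generates a toy dataset. The random signature matrix is a stand-in assumption (the experiments in Section 3 use USGS library spectra instead), and Dirichlet sampling is one convenient way to draw abundances that satisfy the ASC and ANC:

```python
import numpy as np

rng = np.random.default_rng(0)
L, R, N = 224, 6, 1000     # bands, endmembers, pixels (toy sizes)

M = rng.random((L, R))     # stand-in endmember signatures (not real spectra)
A = rng.dirichlet(np.ones(R), size=N).T   # R x N abundances: nonnegative, columns sum to 1

snr_db = 25                # additive Gaussian noise at a chosen SNR, eq. (1)
Y_clean = M @ A
sigma = np.sqrt(Y_clean.var() / 10 ** (snr_db / 10))
Y = Y_clean + rng.normal(0.0, sigma, Y_clean.shape)
```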

### 2.2 Distributed Cost Functions and Optimization

As explained earlier, we can consider three types of networks: single-task, multitask, and clustered multitask networks. First, the $N$ nodes are considered in a clustered multitask network, and an optimum vector $\mathbf{a}_k$ is estimated at each node $k$. A global cost function using LMS, $J^{\text{glob}}(\mathbf{a})$, is defined as follows:

$$J^{\text{glob}}(\mathbf{a}) = \sum_{k=1}^{N} \mathbb{E}\left\|\mathbf{y}_k - \mathbf{M}\mathbf{a}_k\right\|^2 \tag{4}$$

where $\mathbb{E}$ is the expectation operator. Then, to minimize the cost function, the following equation is obtained using the iterative steepest-descent solution [Cattivelii10]:

$$\mathbf{a}_k(i+1) = \mathbf{a}_k(i) + \mu\,\mathbf{M}^{\top}\big(\mathbf{y}_k - \mathbf{M}\mathbf{a}_k(i)\big) \tag{5}$$

where $\mu$ is a step-size parameter with which the algorithm makes small jumps, and $i$ is the iteration number. The optimum value of $\mu$ ensures stability and depends on the cost function: the algorithm will diverge with a too-large value of $\mu$, and will take a long time to converge with a too-small value.
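A minimal sketch of the steepest-descent iteration (5) for a single pixel; the names and toy sizes are ours, and the step size is chosen small enough for stability (below 2 over the largest eigenvalue of $\mathbf{M}^{\top}\mathbf{M}$):

```python
import numpy as np

def lms_update(a, y, M, mu=1e-2, n_iter=2000):
    """Iterate eq. (5): a <- a + mu * M^T (y - M a)."""
    for _ in range(n_iter):
        a = a + mu * M.T @ (y - M @ a)
    return a

# noiseless toy check: the iteration recovers a known abundance vector
rng = np.random.default_rng(0)
M = rng.random((50, 4))
a_true = np.array([0.4, 0.3, 0.2, 0.1])
a_hat = lms_update(np.full(4, 0.25), M @ a_true, M)
```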

In equation (5) the neighborhood information has not been used yet. In a distributed network, information from neighboring nodes is used to improve accuracy. In this article, we utilize the squared Euclidean distance between abundance vectors of neighboring nodes [Chen14]:

$$\sum_{l \in \mathcal{N}_k} \rho_{kl}\left\|\mathbf{a}_k - \mathbf{a}_l\right\|^2 \tag{6}$$

And then, the $L_{1/2}$ regularizer is used as the sparsity constraint [Qian11]:

$$\left\|\mathbf{a}_k\right\|_{1/2} = \sum_{r=1}^{R} a_{rk}^{1/2} \tag{7}$$

Note that the solution determined from the global cost function needs access to information over the whole cluster of a node, but each node can be considered to have access only to the information of its neighbors and of the nodes in the same cluster. Thus, to solve this problem, the following local cost function is defined, using LMS and adding the constraints (6) and (7):

$$J_k(\mathbf{a}_k) = \mathbb{E}\left\|\mathbf{y}_k - \mathbf{M}\mathbf{a}_k\right\|^2 + \eta \sum_{l \in \mathcal{N}_k \cap \mathcal{C}(k)} \rho_{kl}\left\|\mathbf{a}_k - \mathbf{a}_l\right\|^2 + \lambda\left\|\mathbf{a}_k\right\|_{1/2} \tag{8}$$

where $\mathcal{N}_k \cap \mathcal{C}(k)$ denotes the nodes that are in the neighborhood of node $k$ and in the same cluster $\mathcal{C}(k)$ as node $k$, $\eta$ denotes a regularization parameter [Chen14] that controls the effect of the neighborhood term, $\lambda$ is a scalar value that weights the sparsity function [Qian11], and the nonnegative coefficients $\rho_{kl}$ are normalized spectral similarities obtained from the correlation of data vectors [Chen14]:

$$r(k,l) = \frac{\sum_{j=1}^{L} y_{jk}\, y_{jl}}{\sqrt{\sum_{j=1}^{L} y_{jk}^2}\,\sqrt{\sum_{j=1}^{L} y_{jl}^2}} \tag{9}$$

where $y_{jk}$ denotes the $j$-th band of the $k$-th pixel of the hyperspectral image in this equation. The coefficients are then normalized over each neighborhood:

$$\rho_{kl} = \begin{cases} \dfrac{r(k,l)}{\sigma_k}, & l \in \mathcal{N}_k^{-} \\[4pt] 0, & \text{otherwise} \end{cases} \tag{10}$$

where $\mathcal{N}_k^{-}$ includes the neighbors of node $k$ except itself, and the normalization factor $\sigma_k$ is computed as [Chen14]:

$$\sigma_k = \sum_{l \in \mathcal{N}_k^{-}} r(k,l) \tag{11}$$
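The weights (9)-(11) can be computed directly from the data. The function below is a sketch with our own naming, taking a neighbor list per node and returning the normalized correlation weights:

```python
import numpy as np

def correlation_weights(Y, neighbors):
    """rho[k, l]: spectral correlation of pixels k and l, eq. (9),
    normalized over the neighborhood of k excluding itself, eqs. (10)-(11)."""
    N = Y.shape[1]
    norms = np.linalg.norm(Y, axis=0)
    rho = np.zeros((N, N))
    for k in range(N):
        nbrs = [l for l in neighbors[k] if l != k]
        r = np.array([Y[:, k] @ Y[:, l] / (norms[k] * norms[l]) for l in nbrs])
        rho[k, nbrs] = r / r.sum()    # normalize over the neighborhood
    return rho
```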

Now, minimizing the cost function (8) using the steepest-descent algorithm results in:

$$\mathbf{a}_k(i+1) = \mathbf{a}_k(i) + \mu\Big[\mathbf{M}^{\top}\big(\mathbf{y}_k - \mathbf{M}\mathbf{a}_k(i)\big) + \eta \sum_{l \in \mathcal{N}_k \cap \mathcal{C}(k)} \rho_{kl}\big(\mathbf{a}_l(i) - \mathbf{a}_k(i)\big) - \frac{\lambda}{2}\,\mathbf{a}_k(i)^{-\frac{1}{2}}\Big] \tag{12}$$

where $\mathbf{a}_k^{-\frac{1}{2}}$ is taken elementwise. Hence, this recursive equation can be used to update the fractional abundance vectors in the SU problem.
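The update (12) can be sketched as one sweep over all pixels. The function and parameter names are our own, and a small `eps` guards the gradient of the $L_{1/2}$ term at zero abundances:

```python
import numpy as np

def update_abundances(A, Y, M, rho, cluster, mu=1e-3, eta=0.05, lam=0.1, eps=1e-8):
    """One sweep of the eq.-(12)-style update over all pixels (columns of A)."""
    N = A.shape[1]
    A_new = A.copy()
    for k in range(N):
        fit = M.T @ (Y[:, k] - M @ A[:, k])                # data-fit term
        same = [l for l in range(N) if rho[k, l] > 0 and cluster[l] == cluster[k]]
        pull = sum(rho[k, l] * (A[:, l] - A[:, k]) for l in same)  # neighborhood term
        sparse = 0.5 * lam / np.sqrt(A[:, k] + eps)        # gradient of the L1/2 term
        A_new[:, k] = np.maximum(A[:, k] + mu * (fit + eta * pull - sparse), 0.0)
    return A_new
```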

### 2.3 Proposed Algorithm

Similar to the NMF algorithm, the least mean square error should be minimized with respect to the signature and abundance matrices, subject to the nonnegativity constraint [Lee01]. So, the following cost function is defined using matrix notation:

$$J(\mathbf{M}, \mathbf{A}) = \frac{1}{2}\left\|\mathbf{Y} - \mathbf{M}\mathbf{A}\right\|_F^2 \tag{13}$$

where $\mathbf{M} \in \mathbb{R}^{L \times R}$ and $\mathbf{A} \in \mathbb{R}^{R \times N}$ are the signature and fractional abundance matrices, respectively, and $\mathbf{Y} \in \mathbb{R}^{L \times N}$ denotes the hyperspectral data matrix. Then, based on the described equations of sparsity constrained distributed unmixing, the neighborhood and sparsity terms are added to (13) as follows:

$$J(\mathbf{M}, \mathbf{A}) = \frac{1}{2}\left\|\mathbf{Y} - \mathbf{M}\mathbf{A}\right\|_F^2 + \frac{\eta}{2}\sum_{k=1}^{N}\sum_{l \in \mathcal{N}_k \cap \mathcal{C}(k)} \rho_{kl}\left\|\mathbf{a}_k - \mathbf{a}_l\right\|^2 + \lambda\sum_{k=1}^{N}\left\|\mathbf{a}_k\right\|_{1/2} \tag{14}$$

This cost function is minimized with respect to $\mathbf{M}$ using multiplicative update rules [Lee01]; the recursive equation for the signature matrix is then obtained as:

$$\mathbf{M} \leftarrow \mathbf{M} \odot \big(\mathbf{Y}\mathbf{A}^{\top}\big) \oslash \big(\mathbf{M}\mathbf{A}\mathbf{A}^{\top}\big) \tag{15}$$

where $\odot$ and $\oslash$ denote elementwise multiplication and division. The recursive equation for the fractional abundances has already been obtained in accordance with (12), as follows:

$$\mathbf{a}_k(i+1) = \mathcal{P}\Big(\mathbf{a}_k(i) + \mu\Big[\mathbf{M}^{\top}\big(\mathbf{y}_k - \mathbf{M}\mathbf{a}_k(i)\big) + \eta \sum_{l \in \mathcal{N}_k \cap \mathcal{C}(k)} \rho_{kl}\big(\mathbf{a}_l(i) - \mathbf{a}_k(i)\big) - \frac{\lambda}{2}\,\mathbf{a}_k(i)^{-\frac{1}{2}}\Big]\Big) \tag{16}$$

where the operator $\mathcal{P}(\cdot)$ projects vectors onto a simplex, which enforces the ASC and ANC constraints on the abundance vectors [Chen11]. To initialize the matrices, random initialization or the VCA-FCLS method can be used. Since random values may lead to local optima, the VCA-FCLS initialization method has been used in this paper. Another significant point in the implementation of the algorithm is the stopping criterion. The algorithm stops when the maximum number of iterations ($i_{\max}$) or the following stopping criterion is reached:

$$\big|J(i) - J(i-1)\big| \le \varepsilon \tag{17}$$

where $J(i)$ and $J(i-1)$ are the cost function values of two consecutive iterations, and the threshold $\varepsilon$ has been set to a small constant in our experiments. The proposed approach is summarized in Algorithm 1.
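Putting the pieces together, the following is a simplified single-cluster sketch of the overall loop: the multiplicative signature update (15), a projected-gradient abundance update in the spirit of (16) (dropping the neighborhood and sparsity terms for brevity), and the stopping rule (17). The names, toy sizes, and the random initialization (the paper uses VCA-FCLS) are our assumptions:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex (enforces ASC and ANC)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = idx[u - css / idx > 0][-1]
    return np.maximum(v - css[rho - 1] / rho, 0.0)

def unmix(Y, R, mu=1e-2, n_iter=300, tol=1e-9, seed=0):
    """Alternate eq. (15) for M and projected-gradient steps for A until eq. (17)."""
    rng = np.random.default_rng(seed)
    L, N = Y.shape
    M = rng.random((L, R))                    # random init; the paper uses VCA-FCLS
    A = rng.dirichlet(np.ones(R), size=N).T
    hist = []
    for _ in range(n_iter):
        M *= (Y @ A.T) / (M @ A @ A.T + 1e-12)                   # eq. (15)
        G = M.T @ (Y - M @ A)                                    # data-fit gradient
        A = np.apply_along_axis(project_simplex, 0, A + mu * G)  # simplified eq. (16)
        hist.append(0.5 * np.linalg.norm(Y - M @ A) ** 2)
        if len(hist) > 1 and abs(hist[-2] - hist[-1]) < tol:     # eq. (17)
            break
    return M, A, hist
```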

## 3 Experiments and Results

In this section, two common performance metrics are used for quantitative evaluation: the spectral angle distance (SAD) and the abundance angle distance (AAD) [Miao07]. They are defined as:

$$\mathrm{SAD}_r = \cos^{-1}\left(\frac{\mathbf{m}_r^{\top}\hat{\mathbf{m}}_r}{\|\mathbf{m}_r\|\,\|\hat{\mathbf{m}}_r\|}\right) \tag{18}$$

$$\mathrm{AAD}_k = \cos^{-1}\left(\frac{\mathbf{a}_k^{\top}\hat{\mathbf{a}}_k}{\|\mathbf{a}_k\|\,\|\hat{\mathbf{a}}_k\|}\right) \tag{19}$$

where $\hat{\mathbf{m}}_r$ is the estimate of the $r$-th spectral signature vector and $\hat{\mathbf{a}}_k$ is the estimate of the $k$-th fractional abundance vector.

Another criterion used for quantitative evaluation is the reconstruction error (RE), defined as follows [GRSL18_Stacked]:

$$\mathrm{RE} = \sqrt{\frac{1}{NL}\sum_{k=1}^{N}\left\|\mathbf{y}_k - \hat{\mathbf{y}}_k\right\|^2} \tag{20}$$

where $\mathbf{y}_k$ and $\hat{\mathbf{y}}_k$ are the original and reconstructed pixels and $N$ is the total number of pixels in the hyperspectral scene.
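The metrics (18)-(20) translate directly into code. This is a sketch with our own function names; the cosine is clipped to avoid NaNs from rounding:

```python
import numpy as np

def sad(m, m_hat):
    """Spectral angle distance, eq. (18), in radians; eq. (19) is the same formula
    applied to abundance vectors instead of signatures."""
    c = m @ m_hat / (np.linalg.norm(m) * np.linalg.norm(m_hat))
    return np.arccos(np.clip(c, -1.0, 1.0))

def reconstruction_error(Y, Y_hat):
    """Root-mean-square reconstruction error over all bands and pixels, eq. (20)."""
    return np.sqrt(np.mean((Y - Y_hat) ** 2))
```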

In this article, the proposed algorithm is applied to synthetic and real datasets. First, the proposed algorithm has been applied to synthetic data. To generate this dataset, six signatures from the USGS library [Clark07] have been selected randomly, and a 7×7 low-pass filter has been applied so that the data contain no pure pixels [Miao07]. Then, zero-mean Gaussian noise at 5 different SNR levels has been added to the generated data, and the performance metrics have been computed by averaging 20 Monte Carlo runs. In Figure 2, the FCM clustering of this dataset is illustrated. To choose the best number of clusters in our experiments, the SAD performance metric has been evaluated; according to Figure 3, the best number of clusters has been set to 6, which is equal to the number of endmembers. Also, the values of the regularization parameters $\eta$ and $\lambda$ have been set to 0.02 and 0.1, respectively [Chen14], and the step size $\mu$ has been tuned to gain the best results.

Then the proposed algorithm and other algorithms (VCA-FCLS, NMF, $L_{1/2}$-NMF, distributed unmixing, and sparsity constrained distributed unmixing) have been applied to the generated synthetic dataset. The comparison of the performance metrics of these six methods is shown in Figure 4 (a) and (b), where the proposed algorithm's metrics (star-dashed line) excel those of the other methods. Figure 5 shows the original and estimated spectral signatures of the six endmembers at a fixed SNR level.

Another experiment has been conducted to show the performance of the proposed algorithm for different numbers of pixels in the synthetic hyperspectral image. The other parameters have been set as in the previous experiments, and the RE metric defined in (20) is used. Figure 7 illustrates the results of this experiment. As can be seen in this figure, the proposed algorithm performs better as the size of the hyperspectral data grows, which emphasizes that with increasing training data size the algorithm converges to a solution more robustly.

In the next two experiments, the proposed algorithm has been applied to real datasets: AVIRIS Cuprite and HYDICE Urban. The AVIRIS Cuprite dataset is a hyperspectral image captured by the AVIRIS sensor over Cuprite, Nevada. This sensor covers wavelengths from 0.4 to 2.5 μm in 224 channels [Green98]. Of these 224 bands, 188 are used in the experiments; the other bands (1, 2, 104-113, 148-167, and 221-224), which correspond to water-vapor absorption or low SNR, have been removed. Figure 6 (a) illustrates a pseudo-color image of this dataset.

In the HYDICE Urban dataset, there are 210 bands covering wavelengths from 0.4 to 2.5 μm. After removing the water-vapor absorption and low-SNR bands (1-4, 76, 87, 101-111, 136-153, and 198-210), 162 bands are used in the experiments. There are 4 distinguishable materials in the HYDICE Urban image: asphalt, roof, tree, and grass [He16]. Figure 6 (b) illustrates a pseudo-color image of this real dataset.

In Figure 8, the results of FCM clustering on the two real datasets are illustrated. The estimated spectral signatures are shown in Figures 9 and 10. Also, the performance metrics of VCA-FCLS, $L_{1/2}$-NMF, distributed unmixing, sparsity constrained distributed unmixing, and the proposed method on the real datasets are compared in Tables 1 and 2; the results of the proposed algorithm, given in the last column, generally achieve the best values. In Figures 11 and 12, the abundance fraction maps of the proposed method are illustrated.

| Materials | VCA-FCLS | $L_{1/2}$-NMF | GLNMF | TV-RSNMF | Dist. | S. Dist. | Proposed Al. |
|---|---|---|---|---|---|---|---|
| Sphene | 0.3091 | 0.2143 | 0.1913 | 0.1583 | 0.1561 | 0.1673 | 0.1574 (2.34) |
| Nontronite | 0.2622 | 0.2518 | 0.1842 | 0.1803 | 0.1944 | 0.1743 | 0.1711 (1.89) |
| KaolinSmect #1 | 0.2498 | 0.1653 | 0.1638 | 0.1731 | 0.2370 | 0.1741 | 0.1702 (2.04) |
| Montmorillonite | 0.2609 | 0.2318 | 0.2184 | 0.2159 | 0.3571 | 0.2103 | 0.2248 (2.54) |
| Chalcedony | 0.1934 | 0.1995 | 0.1649 | 0.1588 | 0.1603 | 0.1653 | 0.1437 (2.61) |
| KaolinSmect #2 | 0.3258 | 0.2542 | 0.2594 | 0.2576 | 0.2873 | 0.2608 | 0.2596 (1.87) |
| Alunite | 0.3601 | 0.3458 | 0.2841 | 0.2551 | 0.3813 | 0.2369 | 0.2417 (2.09) |
| Buddingtonite | 0.2402 | 0.1693 | 0.2068 | 0.2034 | 0.2514 | 0.1953 | 0.1643 (1.91) |
| Muscovite | 0.3917 | 0.1584 | 0.1471 | 0.1563 | 0.4682 | 0.1537 | 0.1575 (1.54) |
| Andradite #1 | 0.2851 | 0.3361 | 0.3148 | 0.2392 | 0.2132 | 0.2425 | 0.2337 (2.67) |
| Dumortierite | 0.2311 | 0.2453 | 0.2632 | 0.2686 | 0.3381 | 0.2639 | 0.2519 (2.98) |
| Andradite #2 | 0.4492 | 0.3829 | 0.3021 | 0.3136 | 0.3711 | 0.2854 | 0.2472 (3.87) |
| rmsSAD | 0.3049 | 0.2562 | 0.2317 | 0.2207 | 0.2998 | 0.2153 | 0.2064 (2.79) |
| RE | 0.0055 | 0.0050 | 0.0047 | 0.0038 | 0.0052 | 0.0035 | 0.0029 (0.07) |

Table 1: The SAD and RE performance metrics and their variances (in percent, in parentheses) of seven algorithms on the AVIRIS Cuprite dataset, using VCA-FCLS initialization.

| Materials | VCA-FCLS | $L_{1/2}$-NMF | GLNMF | TV-RSNMF | Dist. | S. Dist. | Proposed |
|---|---|---|---|---|---|---|---|
| Roof | 0.4671 | 0.3461 | 0.3486 | 0.3327 | 0.3831 | 0.3294 | 0.3289 (2.56) |
| Tree | 0.2711 | 0.1492 | 0.1673 | 0.1572 | 0.2052 | 0.1521 | 0.1496 (3.34) |
| Asphalt | 0.3077 | 0.2984 | 0.2096 | 0.2054 | 0.2469 | 0.2118 | 0.2122 (1.96) |
| Grass | 0.2089 | 0.1461 | 0.1283 | 0.1249 | 0.1344 | 0.1019 | 0.1014 (2.23) |
| rmsSAD | 0.3279 | 0.2512 | 0.2291 | 0.2198 | 0.2588 | 0.2161 | 0.2155 (2.47) |
| RE | 0.0134 | 0.0120 | 0.0112 | 0.0108 | 0.0131 | 0.0105 | 0.0096 (0.11) |

Table 2: The SAD and RE performance metrics and their variances (in percent, in parentheses) of seven algorithms on the HYDICE Urban dataset, using VCA-FCLS initialization.

## 4 Conclusion

This paper proposed a clustered multitask network scheme to solve the SU problem. The new algorithm considers sparsity, clustering, and neighborhood information. Simulation results on synthetic and real datasets demonstrated the superiority of the proposed approach over previously published unmixing methods in terms of the SAD, AAD, and RE measures. Quantitatively, the proposed method improves the RE metric by about 15 percent in the AVIRIS Cuprite experiment. Despite this performance improvement, the algorithm needs FCM clustering as a preprocessing stage, which increases its computational complexity and run time. In this paper the FCM clustering method has been used; for future work, using more efficient clustering methods to improve the clustered network could enhance the results of the proposed method.
