1 Introduction
Much research has been devoted to PolSAR image classification, and recent breakthroughs benefit from the development and application of deep convolutional neural networks (DCNNs) [1]. PolSAR data are usually expressed by coherency or covariance matrices, which contain amplitude and phase information in complex-number form. However, a general real-valued DNN loses significant phase information when applied directly to PolSAR data. Zhou et al. [2] convert the complex-valued coherency or covariance matrix into a normalized 6-D real-valued vector for PolSAR classification, which discards the important phase information. Instead of directly converting complex numbers into real numbers, other strategies have been introduced. Besides extending the coherency matrix to the rotation domain, Chen et al. [3] take the null angle and roll-invariant polarimetric features as input to extract ample polarimetric features. Liu et al. [4] propose a novel polarimetric scattering coding method to gain more polarimetric features for classification. However, these operations still work in the real-number domain. To make full use of PolSAR data, several complex-valued DNN models have been proposed. Inspired by the complex-valued convolutional neural network (CV-CNN) [5], Zhang et al. [6] applied the CV-CNN to PolSAR classification with great success, which marked the beginning of CV-CNNs for PolSAR data. Besides retaining complete information, CV-CNNs offer faster learning and convergence
[7]. In addition, deep learning is a data-driven approach, yet labeled samples are extremely scarce in PolSAR data. Thus, unsupervised or semi-supervised networks are used for PolSAR classification, such as the deep convolutional autoencoder [8]. Meanwhile, the GAN [9] is able to expand data: it learns the underlying distribution of actual data and generates fake data with the same distribution. With successful applications in many fields (e.g., natural-image generation [10] and neural dialogue [11]), the GAN architecture has received increasing attention in recent years. To further address the deficiency of labeled data, it is advisable to combine the GAN architecture with semi-supervised learning. Therefore, in this paper, we propose a complex-valued GAN framework. Our model has three advantages: 1) the complex-valued neural network complies with the physical mechanism of complex numbers, so it retains the amplitude and phase information of PolSAR data; 2) the GAN extended to the complex-number field can generate PolSAR samples that share the distribution of actual samples, and the additional samples improve classification performance; 3) besides labeled data, unlabeled data are also used to update the model parameters through semi-supervised learning, further improving network performance.
2 Semi-Supervised Complex-Valued GAN
2.1 Network Architecture
The data generated by a general real-valued GAN differ from PolSAR data in both features and distribution. Therefore, we extend the real-valued GAN to the complex-number domain and propose a complex-valued GAN. Figure 1 illustrates the framework of our model, which is composed of a Complex-valued Generator and a Complex-valued Discriminator. The framework consists of complex-valued fully connected layers, complex-valued deconvolution, complex-valued convolution, complex-valued activation functions, and complex-valued batch normalization, denoted by "CFC", "CDeConv", "CConv", "CA", and "CBN", respectively. In addition, the complex-valued network makes full use of the amplitude and phase features of PolSAR data.
In the Complex-valued Generator, after a series of complex-valued operations, two randomly generated vectors, shown as the green and blue blocks, are transformed into a complex-valued matrix that has the same shape and distribution as PolSAR data. In the Complex-valued Discriminator, we use complex-valued operations to extract complete complex-valued features, which take the form of real-imaginary pairs. We then concatenate the real and imaginary parts of the last feature into the real domain for the final classification. In the training process, generated fake data and labeled and unlabeled actual data are used to alternately train the complex-valued GAN by semi-supervised learning, until the network can effectively identify the authenticity of the input data and achieve correct classification.
2.2 Complex-Valued Operation Mask
To simplify the calculation, we express a complex number in algebraic form, in which the real part and imaginary part are one-dimensional real numbers. Let z_1 = a + bi and z_2 = c + di denote two complex numbers; the multiplication and addition are then redefined as follows:

z_1 z_2 = (ac - bd) + (ad + bc)i    (1)
z_1 + z_2 = (a + c) + (b + d)i    (2)
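As a quick, hedged sketch (the function names are our own), the redefined multiplication and addition decompose entirely into real-valued operations:

```python
def complex_mul(a, b, c, d):
    """(a + bi)(c + di): four real multiplications,
    one subtraction, and one addition."""
    return a * c - b * d, a * d + b * c

def complex_add(a, b, c, d):
    """(a + bi) + (c + di): component-wise addition of
    the real parts and the imaginary parts."""
    return a + c, b + d
```

For example, `complex_mul(1.0, 2.0, 3.0, 4.0)` returns `(-5.0, 10.0)`, matching `(1+2j)*(3+4j)`.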
To specify the complex-valued operations in detail, a complex-valued operation mask is proposed, as shown in Figure 2. The green and blue blocks represent the real and imaginary parts, respectively. This mask performs complex-valued calculations whose input data (x_r, x_i), weights (W_r, W_i), and output data (y_r, y_i) each consist of a real part and an imaginary part, with y_r = x_r * W_r - x_i * W_i and y_i = x_r * W_i + x_i * W_r. Each complex-valued operation is therefore decomposed into four traditional real-valued operations, one addition, and one subtraction. Every complex-valued operation in our network complies with this mask. Keeping the same expression and physical mechanism for the data and the network parameters helps obtain the full data features used for classification.
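To make the decomposition concrete, the following hedged 1-D sketch builds a complex-valued convolution from four real convolutions plus one subtraction and one addition; the paper's layers are 2-D, and all names here are our own:

```python
import numpy as np

def real_conv1d(x, w):
    # 'valid' 1-D correlation, standing in for any real-valued layer op
    return np.convolve(x, w[::-1], mode="valid")

def complex_conv1d(x_r, x_i, w_r, w_i):
    """Complex-valued convolution decomposed into four real
    convolutions, one subtraction, and one addition, following
    the operation mask described above."""
    y_r = real_conv1d(x_r, w_r) - real_conv1d(x_i, w_i)
    y_i = real_conv1d(x_r, w_i) + real_conv1d(x_i, w_r)
    return y_r, y_i
```

The result matches NumPy's native complex correlation, confirming that the four-real-operation decomposition is exact.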
2.3 Complex-Valued Batch Normalization
Batch normalization has been widely used in deep neural networks to unify data and accelerate the convergence rate. In addition, complex-valued batch normalization can stabilize the performance of GANs. However, scanty training samples and small batch sizes restrict the effect of batch normalization.
To address this issue, a novel batch normalization is proposed in this paper. The expectation and covariance matrices are replaced by continually updated average expectation and covariance matrices, so that they accumulate the information of all samples seen during training. The following formulation shows the normalization of the t-th batch x_t:

\tilde{x}_t = \bar{V}_t^{-1/2} (x_t - \bar{E}_t)    (3)

where \bar{E}_t and \bar{V}_t represent the average expectation and covariance matrix accumulated from the 1st to the t-th batch, computed as follows:

\bar{E}_t = (1 - \lambda) \bar{E}_{t-1} + \lambda E[x_t]    (4)
\bar{V}_t = (1 - \lambda) \bar{V}_{t-1} + \lambda Cov[x_t]    (5)

where \lambda controls the length of state remembered and is set to 1/t. The square root of the 2x2 matrix \bar{V}_t is computed as:

s = \sqrt{\det \bar{V}_t}    (6)
u = \sqrt{\operatorname{tr} \bar{V}_t + 2s}    (7)
\bar{V}_t^{1/2} = (\bar{V}_t + sI) / u    (8)

where I is the 2x2 identity matrix.
This operation translates the data to zero mean and unit variance. Ultimately, complex-valued batch normalization is expressed as:

CBN(\tilde{x}_t) = \gamma \tilde{x}_t + \beta    (9)

where \gamma and \beta are two learnable parameters that reconstruct the distribution.
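A minimal NumPy sketch of this normalization follows. It is illustrative only: the class and function names are our own, a fixed update weight `lam` stands in for the running-average weighting, and `gamma`/`beta` are scalars here although a real layer would learn them.

```python
import numpy as np

def sqrtm_2x2(V):
    """Analytic square root of a 2x2 positive-definite matrix:
    V^{1/2} = (V + s*I) / u with s = sqrt(det V), u = sqrt(tr V + 2s)."""
    s = np.sqrt(np.linalg.det(V))
    u = np.sqrt(np.trace(V) + 2.0 * s)
    return (V + s * np.eye(2)) / u

class ComplexBatchNorm:
    """Running-average complex batch normalization (illustrative sketch)."""
    def __init__(self, lam=0.1):
        self.lam = lam
        self.mean = np.zeros(2)           # running expectation of (real, imag)
        self.cov = np.eye(2)              # running 2x2 covariance
        self.gamma, self.beta = 1.0, 0.0  # would be learnable in a real layer

    def __call__(self, x):                # x: (N, 2) array of (real, imag) pairs
        # update the running statistics with the current batch
        self.mean = (1 - self.lam) * self.mean + self.lam * x.mean(axis=0)
        self.cov = (1 - self.lam) * self.cov + self.lam * np.cov(x, rowvar=False)
        # whiten: subtract the mean, multiply by the inverse matrix square root
        inv_sqrt = np.linalg.inv(sqrtm_2x2(self.cov))
        x_tilde = (x - self.mean) @ inv_sqrt.T
        # reconstruct the distribution with gamma and beta
        return self.gamma * x_tilde + self.beta
```

With `lam=1.0` the layer reduces to plain per-batch whitening, which makes the zero-mean, identity-covariance behavior easy to check.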
2.4 Semi-Supervised Learning
In this complex-valued GAN, to further exploit the features of unlabeled data, we optimize the network by semi-supervised learning with a softmax classifier. The output of the discriminator (D) is a (K+1)-dimensional vector p, where p_1 to p_K are the probabilities of the first K classes and p_{K+1} is the probability of the input image being fake. To optimize the generator (G) and the discriminator (D), we define the loss functions as follows:

L = L_{label} + L_{unlabel} + L_{fake}    (10)
L_{label} = -E_{(x,y) \sim p_{data}} \log p(y \mid x, y \le K)    (11)
L_{unlabel} = -E_{x \sim p_{data}} \log (1 - p(y = K+1 \mid x))    (12)
L_{fake} = -E_{x \sim G} \log p(y = K+1 \mid x)    (13)
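As a hedged NumPy illustration (not the paper's implementation; the helper names and the toy class count `K` are our own assumptions), the three loss terms in Eqs. (10)-(13) can be computed from (K+1)-way softmax outputs as follows:

```python
import numpy as np

K = 5  # toy number of real classes; column K holds the "fake" class

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def discriminator_losses(logits_lab, labels, logits_unl, logits_gen):
    """Sketch of the semi-supervised GAN discriminator losses."""
    p_lab = softmax(logits_lab)
    p_unl = softmax(logits_unl)
    p_gen = softmax(logits_gen)
    # Eq. (11): cross-entropy over the first K classes, renormalized
    p_class = p_lab[:, :K] / p_lab[:, :K].sum(axis=1, keepdims=True)
    l_label = -np.mean(np.log(p_class[np.arange(len(labels)), labels]))
    # Eq. (12): unlabeled real data should not fall in the fake class
    l_unlabel = -np.mean(np.log(1.0 - p_unl[:, K]))
    # Eq. (13): generated data should fall in the fake class
    l_fake = -np.mean(np.log(p_gen[:, K]))
    return l_label + l_unlabel + l_fake  # Eq. (10)
```

A discriminator that classifies every sample confidently and correctly drives all three terms toward zero.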
where L_{label}, L_{unlabel}, and L_{fake} represent the classification losses of the labeled samples, unlabeled samples, and generated samples, respectively. The classification losses of the labeled and generated samples are easily acquired. However, the classification loss of the unlabeled samples is hard to express because their ground truth is unknown. To handle this problem, the output probabilities of the softmax are operated on as follows:

D(x) = \max_{k \le K} p(y = k \mid x)    (14)

where D(x) denotes the maximum value among the first K class probabilities, and logistic regression is utilized as a binary classifier on D(x). When D(x) approaches 1, the probability that the input is real rises accordingly, so the authenticity of the data can be discriminated. By this deduction, unlabeled data can also be used to update our network model.

3 Experiments
In our experiments, two benchmark data sets, Flevoland and San Francisco, are used. To verify the effectiveness of our method, our model is compared with a complex-valued convolutional neural network (CV-CNN) and a real-valued convolutional neural network (RV-CNN), which have configurations similar to our Complex-valued Discriminator. The overall accuracy (OA), average accuracy (AA), and Kappa coefficient are used to measure the performance of all the methods.
3.1 Experiments on Standard Data Set
We use the coherency matrix T, a conjugate-symmetric complex-valued matrix that follows the complex Wishart distribution, to express all the information of the corresponding pixel in a PolSAR image. In the Flevoland data, 0.2%, 0.5%, 0.8%, 1.0%, 1.2%, 1.5%, 1.8%, 2.0%, 3.0%, or 5.0% of the labeled data in each of the 15 categories are randomly selected as training data, and the remaining labeled data are used for testing. In addition, 10% unlabeled samples are used to train our semi-supervised complex-valued GAN. In the San Francisco data, we randomly choose 10, 20, 30, 50, 80, 100, 120, 150, 200, or 300 labeled samples in each of the 5 categories for training, and 10% of the data, whether labeled or not, as actual samples.
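The per-class sampling described above can be sketched as follows (an illustrative NumPy helper; the function name and signature are our own):

```python
import numpy as np

def split_per_class(labels, n_train, rng):
    """Randomly pick `n_train` labeled samples per class for training
    (as in the San Francisco setup); the rest are kept for testing."""
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])  # shuffle class c
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)
```

For the Flevoland-style percentage splits, `n_train` would instead be computed per class as a fraction of that class's sample count.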
Flevoland
methods   1      2      3      4      5      6      7      8      9
RV-CNN    87.18  97.85  95.56  94.58  86.72  93.96  98.17  98.89  96.70
CV-CNN    90.79  98.39  95.95  89.71  93.00  93.21  97.46  99.24  97.54
ours      98.22  99.25  99.29  86.71  95.40  95.27  99.85  99.85  98.59

methods   10     11     12     13     14     15     OA     AA     Kappa
RV-CNN    94.88  97.70  83.45  95.56  99.00  52.95  95.12  91.54  94.68
CV-CNN    98.02  97.01  91.18  90.48  98.91  65.57  95.12  93.10  94.68
ours      97.56  97.76  96.07  99.06  100.0  87.38  97.21  96.68  96.97

San Francisco
methods   1      2      3      4      5      OA     AA     Kappa
RV-CNN    99.16  86.86  59.93  19.29  31.52  74.36  59.35  63.37
CV-CNN    99.07  84.05  53.81  65.51  50.14  80.83  70.51  72.41
ours      99.45  88.33  86.72  61.91  90.61  89.23  85.41  84.48
The parameters of all experiments in this paper are set as follows: the patch size is , the learning rate is 0.0005, and the optimization method is Adam with and . Figures 3 and 4 show how the OA, AA, and Kappa change with the sampling ratio on the two data sets. On the Flevoland data, the results verify the superiority of our network when labeled samples are few, and this advantage is especially obvious when the training ratio is below 3.0%. The same advantage is shown on the San Francisco data, especially when the number of training samples per class is below 50. To exhibit the contribution of our model on each category, Table 1 lists all test accuracies for the Flevoland data at a 0.8% sampling ratio and for the San Francisco data with 10 labeled training samples per class. On the Flevoland data, the accuracies of most categories are improved, especially for the fifteenth category, which has the fewest training samples and achieves relative accuracy gains of 65.1% and 33.17% over the real-valued and complex-valued neural networks, respectively. On the San Francisco data, the complex-valued GAN further improves the classification accuracy, especially for the Developed, Low-Density Urban, and High-Density Urban classes, with relative gains of 44.7%, 220.9%, and 187.4% over the real-valued neural network.
3.2 Generated Data Analysis
To analyze the effectiveness of our complex-valued GAN, we examine the similarity of the actual and generated data in appearance and distribution. Taking the Flevoland data as an example, we randomly select 100 samples and plot pseudocolor images of the real parts of the diagonal elements of T, as shown in Figure 5. The generated data clearly show a high similarity to the actual data. Based on the known distribution of the T matrix [12], we further compare the statistics of the actual and generated data in Figure 6. For the actual data, (a1) and (a2) show the statistical histograms of the real and imaginary parts of one element of T, and (a3) and (a4) those of another element; (b1)-(b4) show the corresponding histograms of the generated data. The generated data again exhibit a high similarity to the actual data.
4 Conclusion
In this paper, a complex-valued GAN is proposed to classify PolSAR data. Nearly all operations are extended to the complex-number field, so the model obeys the physical meaning of PolSAR data and preserves complete phase and amplitude features. To the best of our knowledge, this is the first time that complex-valued data are generated by a network, with the generated data similar to actual complex-valued data in both appearance and distribution. The complex-valued GAN is alternately trained with generated, labeled, and unlabeled data by semi-supervised learning. By exploiting the features of unlabeled and generated samples, our semi-supervised complex-valued GAN clearly outperforms the other models, especially when labeled samples are insufficient. It opens up a new way for research on the problem of scarce complex-valued samples.
References

[1] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification with deep convolutional neural networks," in NIPS, 2012, pp. 1097-1105.
[2] Yu Zhou, Haipeng Wang, Feng Xu, and Ya-Qiu Jin, "Polarimetric SAR image classification using deep convolutional neural networks," IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 12, pp. 1935-1939, 2016.
[3] Si-Wei Chen and Chen-Song Tao, "PolSAR image classification using polarimetric-feature-driven deep convolutional neural network," IEEE Geoscience and Remote Sensing Letters, vol. 15, no. 4, pp. 627-631, 2018.
[4] Xu Liu, Licheng Jiao, Xu Tang, Qigong Sun, and Dan Zhang, "Polarimetric convolutional network for PolSAR image classification," IEEE Transactions on Geoscience and Remote Sensing, 2018.
[5] Nitzan Guberman, "On complex valued convolutional neural networks," arXiv preprint arXiv:1602.09046, 2016.
[6] Zhimian Zhang, Haipeng Wang, Feng Xu, and Ya-Qiu Jin, "Complex-valued convolutional neural network and its application in polarimetric SAR image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. PP, no. 99, pp. 1-12, 2017.
[7] T. Nitta, "On the critical points of the complex-valued neural network," in Proc. 9th International Conference on Neural Information Processing (ICONIP'02), IEEE, 2002, vol. 3, pp. 1099-1103.
[8] Jie Geng, Jianchao Fan, Hongyu Wang, Xiaorui Ma, Baoming Li, and Fuliang Chen, "High-resolution SAR image classification via deep convolutional autoencoders," IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 11, pp. 2351-2355, 2015.
[9] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, "Generative adversarial nets," in NIPS, 2014, pp. 2672-2680.
[10] Alec Radford, Luke Metz, and Soumith Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," arXiv preprint arXiv:1511.06434, 2015.
[11] Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, and Dan Jurafsky, "Adversarial learning for neural dialogue generation," arXiv preprint arXiv:1701.06547, 2017.
[12] Nathaniel R. Goodman, "Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction)," The Annals of Mathematical Statistics, vol. 34, no. 1, pp. 152-177, 1963.