Polarimetric synthetic aperture radar (PolSAR) images collected with airborne and satellite sensors are a rich source of information concerning the Earth’s surface, and have been widely used in urban planning, agricultural assessment and environmental monitoring [wang2017comparison, voormansik2016observations]. These applications require a full understanding and interpretation of PolSAR images.
Hence, PolSAR image interpretation is of great significance in both theory and application. Land use classification of PolSAR images is an important and indispensable research topic, since these images contain rich characteristics of the target (e.g., scattering properties, geometric shapes, and the direction of arrival). Land use classification assigns each pixel to one of several categories according to a given rule. Common objects within PolSAR images include land, buildings, water, sand, urban areas, vegetation, roads, bridges and so on [liuf2016hierarchical]. In order to distinguish them, the features of the pixels should be fully extracted and mined. With the development of PolSAR image classification, many feature extraction algorithms based on physical scattering mechanisms have been introduced. Feature extraction techniques based on polarimetric characteristics can be divided into two kinds: coherent target decomposition and incoherent target decomposition. The former acts on the scattering matrix to characterize completely polarized scattered waves, which contain the full polarimetric information. The latter acts only on the Mueller matrix, covariance matrix, or coherency matrix in order to characterize partially polarized waves [dickinson2013classification].
The coherent target decomposition algorithms include the Pauli decomposition, the sphere-diplane-helix (SDH) decomposition [Krogager1990New], the symmetric scattering characterization method (SSCM) [Touzi2002Characterization], the Cameron decomposition [Cameron1990Feature], the Yamaguchi four-component scattering power decomposition [Yamaguchi2011Four], the general polarimetric model-based decomposition [Chen2013General, Chen2013Adaptive], and some recent advances [Chen2018Advanced, Mahdianpari2018Fisher]. The incoherent target decomposition algorithms include the Huynen decomposition [Huynen1978], the Freeman-Durden decomposition [Freeman1993Three], the Yamaguchi four-component decomposition [Yamaguchi2005Four], the Cloude-Pottier decomposition [Cloude1996A], [Cloude1997An], and a number of other reported approaches [Lee2014Generalized, Besic2014Polarimetric, Aghababaee2016Incoherent]. In addition to features based on the polarization mechanism [Chen2014Modeling, Chen2014Uniform, Xu2017Polarimetric, Tao2017PolSAR], some traditional features of natural images have been utilized to analyze PolSAR images, such as color features [Uhlmann2014Integrating], texture features [Grandi2007Target], spatial relations [Ma2014Polarimetric], etc. Based on the above basic features, combinations of multiple features of PolSAR data have been constructed to improve classification performance [Zou2010Polarimetric, Uhlmann2014Integrating, ren2017unsupervised].
For classification tasks, classifier design is a key point besides feature extraction. According to the amount of labeled data, classification methods can be broadly divided into three groups: unsupervised classification (without any labeled training data), semi-supervised classification (SSC) (with a small amount of labeled data and a large amount of unlabeled data) and supervised classification (with completely labeled training data) [liu2016semisupervised].
Unsupervised classification approaches design a function to describe the hidden structure of unlabeled data. Traditional methods typically define a decision rule to cluster PolSAR data into different groups, and the number of groups is a hyper-parameter. There are many unsupervised classification methods for PolSAR data, such as the H/$\alpha$ complex Wishart classifier [Lee2002Unsupervised], the polarimetric scattering characteristics preserved method [Lee2004Unsupervised], the fuzzy k-means cluster classifier [L1996Fuzzy, Kersten2005Unsupervised], classification based on deep learning [Liu2016POL, Jiao2016Wishart, LCJDeep], etc. SSC is a class of supervised learning tasks and techniques that make use of a small amount of labeled data and a large amount of unlabeled data for training. Compared with unsupervised methods, SSC can improve classification performance by making full use of a small number of labeled samples [Liu2016Large]. There are many semi-supervised classification methods for PolSAR data, such as classification based on hypergraph learning [Wei2014PolSAR], the method based on a parallel auction graph [Liu2016Fast], the spatial-anchor graph [Liu2016Large], etc. Unlike unsupervised and semi-supervised approaches, supervised classification uses sufficient labeled samples to train classifiers, which can then be applied to determine the class of other samples. Many methods have been introduced, including maximum likelihood [Harant2010Fisher], [Fukuda2001Polarimetric, Li2008Object, Aghababaee2013Contextual], sparse representation [Zhang2015Fully], and deep learning [Li2009Improving, Zhou2016Polarimetric, Liu2017Polarimetric].
Recently, deep learning has attracted considerable attention in the computer vision community [Krizhevsky2012ImageNet, Bengio2009Learning, Yang2015Learning, Zhang2013Tensor], as it provides an efficient way to learn image features and to represent certain function classes far more efficiently than shallow models [Chen2014Deep], [Han2015Object], [TDC7529190]. Deep learning has also been introduced into the geoscience and remote sensing (RS) community [Liu2016POL], [gong2016change], [Zhao2016Spectral], [Petersson2017Hyperspectral], [Yu2017Convolutional, XuTwo, XL8048556], [lu2017remote, lu2015semi], [yang2018deep]. For PolSAR image classification in particular, a specific deep model named the Wishart deep stacking network (W-DSN) is proposed in [Jiao2016Wishart]. A fast implementation of the Wishart distance is achieved by a special linear transformation, which speeds up the classification of PolSAR images. In [Liu2016POL], a new type of restricted Boltzmann machine (RBM), named the Wishart-Bernoulli RBM (WBRBM), is defined and used to form a deep network named the Wishart Deep Belief Network (W-DBN). In [xie2017polsar], a new type of autoencoder (AE) and convolutional autoencoder (CAE) are defined, named the Wishart-AE (WAE) and Wishart-CAE (WCAE), respectively. In [zhang2017complex], a complex-valued CNN (CV-CNN) is proposed specifically for synthetic aperture radar (SAR) image interpretation. It utilizes both the amplitude and phase information of complex SAR imagery. In [Chen2018PolSAR], Si-Wei Chen, Xue-Song Wang and Motoyuki Sato proposed a polarimetric-feature-driven deep convolutional neural network (PFDCN) for PolSAR image classification. Its core idea is to incorporate expert knowledge of target scattering mechanism interpretation and polarimetric feature mining to assist deep CNN classifier training and improve the final classification performance.
Moreover, the fully convolutional network (FCN) has been successfully used for natural image semantic segmentation [Badrinarayanan2017SegNet, Audebert2016Semantic, shelhamer2017fully, siam2018rtseg, tsai2018learning, liang2018dynamic, chen2018deeplab] and for pixel-wise remote sensing image classification [isikdogan2017surface, cheng2017automatic, jiao2017deep, volpi2017dense]. In [isikdogan2017surface], a fully convolutional neural network is trained to segment water on Landsat imagery. In [cheng2017automatic], a novel deep model, i.e., a cascaded end-to-end convolutional neural network (CasNet), was proposed to cope simultaneously with the road detection and centerline extraction tasks. Specifically, CasNet consists of two networks. One aims at the road detection task. The other is cascaded to the former and makes full use of the feature maps it produces to obtain a good centerline extraction. In [jiao2017deep], a novel hyperspectral image classification (HSIC) framework, named the deep multiscale spatial-spectral feature extraction algorithm, was proposed based on a fully convolutional neural network. In [volpi2017dense], the authors presented a CNN-based system relying on a downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolution. By doing so, the CNN learns to densely label every pixel at the original resolution of the image.
Inspired by these previous works, a supervised PolSAR image classification method based on polarimetric scattering coding and a convolutional network is proposed in this paper. Our goal is to solve the problems of PolSAR data coding, feature extraction, and land cover classification. Our work can be summarized in three main parts as follows.
Firstly, a new encoding mode for the polarimetric scattering matrix is proposed, which is called polarimetric scattering coding. It not only completely preserves the polarization information of the data, but also facilitates the extraction of high-level features by deep learning, especially convolutional networks.
Secondly, a novel PolSAR image classification algorithm based on polarimetric scattering coding and a convolutional network is proposed. It is called the polarimetric convolutional network and is an end-to-end learning framework.
Thirdly, feature aggregation is designed to fuse the two kinds of features and mine more advanced features.
The paper is organized as follows. In Section II, the representation of PolSAR images is described. In Section III, the proposed method named polarimetric convolutional network is given. The experimental setting is presented in Section IV. The results are presented in Section V, followed by the conclusions and discussions in Section VI.
II Proposed method
In the PolSAR image classification task, land use classes are determined by different analyses, including the polarization of the target responses, the determination of scattering heterogeneity and the determination of the polarization state for target discrimination, each of which relies on different features. It is hard for researchers to consider all kinds of features. PolSAR data is a two-dimensional complex matrix. Traditional feature extraction methods represent PolSAR data as a one-dimensional vector, which destroys the spatial structure of the data. To solve this problem, the intuitive way is to express the original data directly. In this paper, polarimetric scattering coding is proposed to express the original data directly, which maintains the structure information completely. Next, the polarimetric scattering coding matrix obtained by the encoding is fed into a classifier based on a fully convolutional network. The remainder of this section consists of three parts. First, the representation of PolSAR images is given. Second, the polarimetric scattering coding for the complex scattering matrix is explained. Third, the proposed method, called the polarimetric convolutional network, is presented.
II-A Representation of PolSAR images
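A fully polarimetric SAR measures the amplitudes and phases of the backscattered signals in four combinations: 1) HH; 2) HV; 3) VH; and 4) VV, where H denotes the horizontal mode and V the vertical mode. These signals form a complex scattering matrix that represents the information for one pixel and relates the incident and the scattered electric fields. The scattering matrix can be expressed as
$$
S = \begin{bmatrix} S_{HH} & S_{HV} \\ S_{VH} & S_{VV} \end{bmatrix}
  = \begin{bmatrix} |S_{HH}| e^{j\phi_{HH}} & |S_{HV}| e^{j\phi_{HV}} \\ |S_{VH}| e^{j\phi_{VH}} & |S_{VV}| e^{j\phi_{VV}} \end{bmatrix} \tag{1}
$$
where $S_{HH}$, $S_{HV}$, $S_{VH}$ and $S_{VV}$ are the complex scattering coefficients; $S_{HV}$ is the scattering coefficient of the horizontal (H) transmitting and vertical (V) receiving polarization, and the other elements have similar definitions. $|S_{HH}|$, $|S_{HV}|$, $|S_{VH}|$ and $|S_{VV}|$ denote the amplitudes of the measured complex scattering coefficients, $\phi_{HH}$, $\phi_{HV}$, $\phi_{VH}$ and $\phi_{VV}$ are the corresponding phases, and $j$ is the imaginary unit.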
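As a minimal sketch of the amplitude and phase form of the scattering matrix, the following NumPy snippet builds one pixel's complex matrix from its amplitudes and phases. The function name `scattering_matrix` and the numerical values are hypothetical and chosen only for illustration; the element order [[HH, HV], [VH, VV]] is assumed.

```python
import numpy as np

def scattering_matrix(amplitude, phase):
    """Build S = |S| * exp(j * phi) element-wise from 2x2 amplitude and phase arrays."""
    amplitude = np.asarray(amplitude, dtype=float)
    phase = np.asarray(phase, dtype=float)
    return amplitude * np.exp(1j * phase)

# Hypothetical single-pixel measurement, ordered [[HH, HV], [VH, VV]].
amplitude = [[1.0, 0.2], [0.2, 0.8]]
phase = [[0.0, np.pi / 2], [np.pi / 2, np.pi]]
S = scattering_matrix(amplitude, phase)

# The amplitudes and phases can be recovered from the complex matrix.
recovered_amplitude = np.abs(S)
recovered_phase = np.angle(S)
```

Because the example uses equal HV and VH entries, it also satisfies the reciprocity condition assumed for monostatic backscattering.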
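The characteristics of the target can be specified by vectorizing the scattering matrix. Based on two important basis sets, the lexicographic basis and the Pauli spin matrix set, and in the case of monostatic backscattering with a reciprocal medium ($S_{HV} = S_{VH}$), the lexicographic scattering vector $\mathbf{k}_L$ and the Pauli scattering vector $\mathbf{k}_P$ are defined as
$$
\mathbf{k}_L = \left[ S_{HH},\ \sqrt{2}\, S_{HV},\ S_{VV} \right]^{T}
$$
$$
\mathbf{k}_P = \frac{1}{\sqrt{2}} \left[ S_{HH} + S_{VV},\ S_{HH} - S_{VV},\ 2 S_{HV} \right]^{T}
$$
where the superscript $T$ denotes the transpose of a vector.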
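The two vectorizations can be sketched in NumPy as follows. This is an illustration under the standard monostatic, reciprocal definitions (lexicographic vector [S_HH, sqrt(2) S_HV, S_VV] and Pauli vector [S_HH + S_VV, S_HH - S_VV, 2 S_HV] / sqrt(2)); the function names and the sample matrix are hypothetical.

```python
import numpy as np

def lexicographic_vector(S):
    """k_L = [S_HH, sqrt(2) * S_HV, S_VV]^T for a reciprocal 2x2 scattering matrix."""
    S = np.asarray(S, dtype=complex)
    return np.array([S[0, 0], np.sqrt(2) * S[0, 1], S[1, 1]])

def pauli_vector(S):
    """k_P = [S_HH + S_VV, S_HH - S_VV, 2 * S_HV]^T / sqrt(2)."""
    S = np.asarray(S, dtype=complex)
    s_hh, s_hv, s_vv = S[0, 0], S[0, 1], S[1, 1]
    return np.array([s_hh + s_vv, s_hh - s_vv, 2 * s_hv]) / np.sqrt(2)

# Hypothetical reciprocal scattering matrix (S_HV == S_VH).
S = np.array([[1.0 + 0.5j, 0.1 - 0.2j],
              [0.1 - 0.2j, -0.7 + 0.3j]])
k_L = lexicographic_vector(S)
k_P = pauli_vector(S)

# Total backscattered power (span) of the pixel.
span = np.sum(np.abs(S) ** 2)
```

Both vectors carry the same total power as the scattering matrix, which is why the factors sqrt(2) and 1/sqrt(2) appear in the definitions.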
The scattering characteristics of a complex target are determined by its independent subscatterers and their interaction. Because of randomness and depolarization, these characteristics must be described statistically. Moreover, the inherent speckle in SAR data is reduced by spatial averaging at the expense of losing spatial resolution. Therefore, for a complex target, the scattering characteristics should be described by the statistical coherence matrix or covariance matrix. The covariance and coherence matrices can be generated from the outer products of $\mathbf{k}_L$ and $\mathbf{k}_P$, respectively, with their conjugate transposes:
$$
C = \left\langle \mathbf{k}_L \mathbf{k}_L^{H} \right\rangle, \qquad
T = \left\langle \mathbf{k}_P \mathbf{k}_P^{H} \right\rangle
$$
where $\langle \cdot \rangle$ denotes averaging in the data processing stage, and the superscript $H$ stands for the complex conjugate transpose of a vector or matrix.
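A minimal sketch of the averaged outer products is given below. It assumes that the averaging is a plain mean over a set of neighbouring pixels and uses random numbers in place of real PolSAR samples; the helper name `averaged_outer_product` is hypothetical.

```python
import numpy as np

def averaged_outer_product(k):
    """Return <k k^H> for a stack of scattering vectors k with shape (n_samples, 3)."""
    k = np.asarray(k, dtype=complex)
    return np.einsum("ni,nj->ij", k, k.conj()) / k.shape[0]

# Hypothetical 5x5 neighbourhood (25 samples) of random scattering coefficients.
rng = np.random.default_rng(0)
s_hh, s_hv, s_vv = rng.normal(size=(3, 25)) + 1j * rng.normal(size=(3, 25))

# Lexicographic and Pauli scattering vectors, one row per sample.
k_L = np.stack([s_hh, np.sqrt(2) * s_hv, s_vv], axis=1)
k_P = np.stack([s_hh + s_vv, s_hh - s_vv, 2 * s_hv], axis=1) / np.sqrt(2)

C = averaged_outer_product(k_L)  # covariance matrix
T = averaged_outer_product(k_P)  # coherence matrix
```

Both results are 3x3 Hermitian positive semi-definite matrices with the same trace, since the two scattering vectors carry the same total power.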
The covariance matrix C has been proved to follow a complex Wishart distribution [Lee2009Polarimetric]. Moreover, the coherence matrix T, which has a linear relation with the covariance matrix C, is also used to express PolSAR data. PolSAR features are usually extracted indirectly from the PolSAR data, such as color features, texture features, and decomposition features. The color and texture features are extracted from a pseudo-color image composed of the decomposition components. The spatial relations of the pixels can also be obtained from such a PolSAR pseudo-color image. The decomposition features are derived from the matrix C or T by polarimetric target decompositions, e.g., the Pauli decomposition, the Cloude decomposition, the Freeman-Durden decomposition, etc. A number of works have included the computation of these features, as shown in [De2007Target, Uhlmann2014Integrating].
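The linear relation between T and C can be made explicit. The short derivation below assumes the standard monostatic definitions $\mathbf{k}_L = [S_{HH}, \sqrt{2} S_{HV}, S_{VV}]^{T}$ and $\mathbf{k}_P = \frac{1}{\sqrt{2}} [S_{HH} + S_{VV}, S_{HH} - S_{VV}, 2 S_{HV}]^{T}$; the symbol $N$ is introduced here only for illustration.

$$
\mathbf{k}_P = N \mathbf{k}_L, \qquad
N = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & \sqrt{2} & 0 \end{bmatrix}
$$

$$
T = \left\langle \mathbf{k}_P \mathbf{k}_P^{H} \right\rangle
  = N \left\langle \mathbf{k}_L \mathbf{k}_L^{H} \right\rangle N^{H}
  = N C N^{H}
$$

Since $N N^{H} = I$, the transformation is unitary, so C and T share the same eigenvalues and the same trace.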
II-B Polarimetric scattering coding
The polarimetric data returned by a polarimetric synthetic aperture radar are stored in the polarimetric scattering matrix, whose elements are complex values. Inspired by one-hot coding [harris2010digital] and hash coding [leskovec2014mining], we borrow the ideas of position encoding and mapping relationships and propose polarimetric scattering coding for encoding complex matrices. Assume that $x = a + jb$ is a complex value, where $a$ and $b$ are the real and imaginary parts of $x$, respectively. Considering the signs of $a$ and $b$, there are four possible cases for $x$, and we give a complete encoding $\psi(x)$ as follows, which we call sparse scattering coding:
$$
\psi(x) =
\begin{cases}
\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}, & a \ge 0,\ b \ge 0 \\[2ex]
\begin{bmatrix} a & 0 \\ 0 & |b| \end{bmatrix}, & a \ge 0,\ b < 0 \\[2ex]
\begin{bmatrix} 0 & b \\ |a| & 0 \end{bmatrix}, & a < 0,\ b \ge 0 \\[2ex]
\begin{bmatrix} 0 & 0 \\ |a| & |b| \end{bmatrix}, & a < 0,\ b < 0
\end{cases}
$$
Fig. 1 and Eq. (6) show the details of polarimetric scattering coding: the first column holds the real part of the element, the second column holds the imaginary part, the first row holds positive values, and negative values are expressed in the second row. Here $|\cdot|$ is the absolute value operation. An example in Eq. (6) is given as follows
$\psi(\cdot)$ represents the function of polarimetric scattering coding, when , . From Eq. (1), the scattering matrix can be written as
Because $S$ is a complex matrix, we can write its elements as follows
To facilitate understanding and explanation, we make a general assumption, where , . This assumption takes the characteristics of PolSAR data into account. For instance, some PolSAR data are stored in int16 format (-32,768 to +32,767). The polarimetric scattering coding of the scattering matrix can then be given, which is called the polarimetric scattering coding matrix.
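The coding rule described in this subsection can be sketched in NumPy as follows. Each complex value becomes a 2x2 non-negative real block in which the first column holds the real part, the second column holds the imaginary part, the first row holds positive values and the second row holds the magnitudes of negative values. All function names are hypothetical. Placing a zero part in the first row and tiling the per-element blocks of a 2x2 scattering matrix into a 4x4 matrix are assumptions made for this illustration, not layouts confirmed by the text.

```python
import numpy as np

def scattering_code(x):
    """Encode complex array x into non-negative reals of shape x.shape + (2, 2).

    Row 0 holds positive parts, row 1 holds magnitudes of negative parts;
    column 0 is the real part, column 1 is the imaginary part.
    """
    x = np.asarray(x, dtype=complex)
    parts = np.stack([x.real, x.imag], axis=-1)     # (..., 2): real, imaginary
    positive = np.maximum(parts, 0.0)               # zero where the part is negative
    negative = np.maximum(-parts, 0.0)              # zero where the part is positive
    return np.stack([positive, negative], axis=-2)  # (..., 2, 2)

def scattering_decode(code):
    """Invert scattering_code, showing that no information is lost."""
    code = np.asarray(code, dtype=float)
    real = code[..., 0, 0] - code[..., 1, 0]
    imag = code[..., 0, 1] - code[..., 1, 1]
    return real + 1j * imag

def block_layout(code):
    """Tile the (m, n, 2, 2) codes of an m x n matrix into a (2m, 2n) matrix (assumed layout)."""
    m, n = code.shape[:2]
    return code.transpose(0, 2, 1, 3).reshape(2 * m, 2 * n)

# Hypothetical scattering matrix of one pixel.
S = np.array([[1.0 + 2.0j, -0.5j],
              [-0.5j, -3.0 + 1.0j]])
coding_matrix = block_layout(scattering_code(S))
```

For example, `scattering_code(3 - 4j)` gives `[[3, 0], [0, 4]]`: the positive real part lands in the first row and the magnitude of the negative imaginary part lands in the second row. The decoding step recovers the original complex values exactly, which is consistent with the claim that the coding preserves the polarization information.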