Semantic Labeling in Remote Sensing Corpora Using Feature Fusion-Based Enhanced Global Convolutional Network with High-Resolution Representations and Depthwise Atrous Convolution

04/05/2021
by Teerapong Panboonyuen, et al.

One of the fundamental tasks in remote sensing is the semantic segmentation of aerial and satellite images. It plays a vital role in applications such as agriculture planning, map updates, route optimization, and navigation. The state-of-the-art model is the Enhanced Global Convolutional Network (GCN152-TL-A) from our previous work. It comprises two main components: (i) the backbone network, which extracts features, and (ii) the segmentation network, which annotates labels. However, its accuracy can be improved further, since the deep learning network is not designed for recovering low-level features (e.g., river, low vegetation). In this paper, we aim to improve the semantic segmentation network in three aspects, each designed explicitly for the remotely sensed domain. First, we propose to employ a modern backbone network called “High-Resolution Representation (HR)” to extract higher-quality features. It repeatedly fuses the representations generated by the high-to-low subnetworks with the restoration of the low-resolution representations to the same depth and level. Second, “Feature Fusion (FF)” is added to our network to capture low-level features (e.g., lines, dots, or gradient orientation). It fuses the features from the backbone and the segmentation models, which helps to prevent the loss of these low-level features. Finally, “Depthwise Atrous Convolution (DA)” is introduced to refine the extracted features by using four multi-resolution layers in collaboration with a dilated convolution strategy. The experiments were conducted on three data sets: two private corpora from the Landsat-8 satellite and one public benchmark from the “ISPRS Vaihingen” challenge. There are two baseline models: the Deep Encoder-Decoder Network (DCED) and our previous model. The results show that the proposed model significantly outperforms all baselines. It achieves the best result on every data set, with F1 scores above 90%: 0.9114 and 0.9362 on the two Landsat-8 data sets and 0.9111 on the ISPRS Vaihingen data set. Furthermore, it achieves an accuracy beyond 90% on almost all classes.
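To make the “Depthwise Atrous Convolution” idea more concrete, the following is a minimal pure-Python sketch of a depthwise convolution with a dilation rate, followed by a helper that combines several dilated branches. It is an illustration only, not the authors' implementation: the function names, the dilation rates (1, 2, 4, 8), the zero padding, and the choice to combine branches by summation are all assumptions that the abstract does not specify.

```python
def depthwise_atrous_conv2d(image, kernels, rate):
    """Apply one k x k kernel per channel with dilation `rate` (zero padding).

    image   : list of channels, each an H x W list of lists of floats
    kernels : one k x k kernel per channel (depthwise: channels are not mixed)
    rate    : dilation rate; rate=1 is an ordinary convolution
    """
    output = []
    for channel, kernel in zip(image, kernels):
        height, width = len(channel), len(channel[0])
        k = len(kernel)
        half = k // 2
        result = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                acc = 0.0
                for i in range(k):
                    for j in range(k):
                        # Kernel taps are spaced `rate` pixels apart.
                        yy = y + (i - half) * rate
                        xx = x + (j - half) * rate
                        if 0 <= yy < height and 0 <= xx < width:
                            acc += kernel[i][j] * channel[yy][xx]
                result[y][x] = acc
        output.append(result)
    return output


def multi_rate_refine(image, kernels, rates=(1, 2, 4, 8)):
    """Sum several dilated branches (hypothetical rates; summation is assumed)."""
    branches = [depthwise_atrous_conv2d(image, kernels, r) for r in rates]
    fused = []
    for c in range(len(image)):
        height, width = len(image[c]), len(image[c][0])
        fused.append([
            [sum(branch[c][y][x] for branch in branches) for x in range(width)]
            for y in range(height)
        ])
    return fused
```

A larger rate widens the receptive field of each kernel without adding weights, which is the usual motivation for running several dilated branches side by side.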


