Learning Instance Representation Banks for Aerial Scene Classification

05/27/2022
by Jingjun Yi, et al.

Aerial scenes are more complicated than natural scenes in object distribution and spatial arrangement because of the bird's-eye view, so learning a discriminative scene representation remains challenging. Recent solutions design local semantic descriptors so that regions of interest (RoIs) can be properly highlighted. However, each local descriptor has limited descriptive capability, and the overall scene representation still needs refinement. In this paper, we address this problem by designing a novel representation set named the instance representation bank (IRB), which unifies multiple local descriptors under the multiple instance learning (MIL) formulation. This unified framework is non-trivial because all the local semantic descriptors can be aligned to the same scene scheme, which enhances the scene representation capability. Specifically, our IRB learning framework consists of a backbone, an instance representation bank, a semantic fusion module, and a scene scheme alignment loss function. All the components are organized in an end-to-end manner. Extensive experiments on three aerial scene benchmarks demonstrate that our proposed method outperforms state-of-the-art approaches by a large margin.
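
The abstract names the components but gives no formulas, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes that each local descriptor is a linear map from backbone features to per-location class scores, that MIL aggregation is mean pooling over locations, that semantic fusion is a plain average over the bank, and that "alignment to the same scene scheme" means every bank entry is supervised with the same scene label through cross-entropy. All function names and shapes are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def mil_mean_pool(instance_scores):
    # (H, W, C) per-location (instance) scores -> (C,) scene (bag) scores.
    # Mean pooling is an assumed MIL aggregator, chosen for simplicity.
    return instance_scores.reshape(-1, instance_scores.shape[-1]).mean(axis=0)

def irb_forward(features, descriptor_weights):
    # features: (H, W, D) backbone feature map.
    # descriptor_weights: list of K (D, C) matrices, one per hypothetical
    # local descriptor. Returns the bank (K, C) and the fused scores (C,).
    bank = np.stack([mil_mean_pool(features @ w) for w in descriptor_weights])
    fused = bank.mean(axis=0)  # assumed fusion: average over the bank
    return bank, fused

def alignment_loss(bank, fused, label):
    # Cross-entropy on the fused scores plus cross-entropy on every bank
    # entry, so all descriptors are pushed toward the same scene label.
    def ce(scores):
        return -np.log(softmax(scores)[label])
    return ce(fused) + sum(ce(entry) for entry in bank)

# Toy usage with random data: a 4x4 feature map, 8 channels, 5 classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 4, 8))
weights = [rng.normal(size=(8, 5)) for _ in range(3)]
bank_scores, fused_scores = irb_forward(feats, weights)
print(bank_scores.shape, fused_scores.shape)
print(alignment_loss(bank_scores, fused_scores, label=2))
```

In a trainable version the descriptor weights and the backbone would be optimized jointly against this loss, which is one way to read "end-to-end" here.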

