HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection

01/08/2023
by Bin Tang, et al.

The High-Resolution Transformer (HRFormer) maintains a high-resolution representation while sharing global receptive fields. This makes it well suited to salient object detection (SOD), in which the input and output have the same resolution. However, two critical problems need to be solved for two-modality SOD. The first is the fusion of the two modalities. The second is the fusion of HRFormer's output features. To address the first problem, a supplementary modality is injected into the primary modality, using global optimization and an attention mechanism to select and purify the modality at the input level. To solve the second problem, a dual-direction short connection fusion module is used to optimize the output features of HRFormer, thereby enhancing the detailed representation of objects at the output level. The proposed model, named HRTransNet, first introduces an auxiliary stream for feature extraction of the supplementary modality. Then, the extracted features are injected into the primary modality at the beginning of each multi-resolution branch. Next, HRFormer is applied for forward propagation. Finally, all the output features with different resolutions are aggregated by intra-feature and inter-feature interactive transformers. The proposed model yields impressive improvements on two-modality SOD tasks, e.g., RGB-D, RGB-T, and light field SOD.

Code: https://github.com/liuzywen/HRTransNet
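The input-level injection step can be illustrated with a minimal sketch. This is not the authors' module: it assumes a simple channel-attention gate (global average pooling followed by a sigmoid) as a stand-in for the "select and purify" step, and it omits the global optimization the abstract mentions. The names `global_average`, `inject`, and `inject_all_branches` are hypothetical, and features are plain nested lists of shape channels x height x width.

```python
import math


def global_average(feature):
    """Reduce each channel of a C x H x W feature to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inject(primary, supplementary):
    """Add the supplementary feature to the primary one, gated per channel.

    Assumption: the gate is sigmoid(mean of the supplementary channel),
    a simplified placeholder for an attention mechanism.
    """
    weights = [sigmoid(m) for m in global_average(supplementary)]
    return [
        [
            [p + w * s for p, s in zip(p_row, s_row)]
            for p_row, s_row in zip(p_channel, s_channel)
        ]
        for p_channel, s_channel, w in zip(primary, supplementary, weights)
    ]


def inject_all_branches(primary_branches, supplementary_branches):
    """Apply the injection once per resolution branch."""
    return [
        inject(p, s) for p, s in zip(primary_branches, supplementary_branches)
    ]


# One channel, 2 x 2 resolution: an RGB-like primary and a depth-like supplement.
rgb = [[[1.0, 2.0], [3.0, 4.0]]]
depth_zero = [[[0.0, 0.0], [0.0, 0.0]]]
depth_two = [[[2.0, 2.0], [2.0, 2.0]]]

fused_zero = inject(rgb, depth_zero)  # an all-zero supplement adds nothing
fused_two = inject(rgb, depth_two)    # each value grows by 2 * sigmoid(2)
```

In this sketch an all-zero supplementary feature leaves the primary feature unchanged, while a stronger supplementary response contributes more through a larger gate. A real model would learn the gate and operate on tensors rather than nested lists.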

research
04/12/2022

SwinNet: Swin Transformer drives edge-aware RGB-D and RGB-T salient object detection

Convolutional neural networks (CNNs) are good at extracting contexture f...
research
10/30/2021

Cross-Modality Fusion Transformer for Multispectral Object Detection

Multispectral image pairs can provide the combined information, making o...
research
04/23/2021

Middle-level Fusion for Lightweight RGB-D Salient Object Detection

Most existing RGB-D salient object detection (SOD) models require large ...
research
03/28/2023

Explicit Attention-Enhanced Fusion for RGB-Thermal Perception Tasks

Recently, RGB-Thermal based perception has shown significant advances. T...
research
12/02/2021

MTFNet: Mutual-Transformer Fusion Network for RGB-D Salient Object Detection

Salient object detection (SOD) on RGB-D images is an active problem in c...
research
06/28/2023

𝐂^2Former: Calibrated and Complementary Transformer for RGB-Infrared Object Detection

Object detection on visible (RGB) and infrared (IR) images, as an emergi...
research
04/06/2023

MemeFier: Dual-stage Modality Fusion for Image Meme Classification

Hate speech is a societal problem that has significantly grown through t...
