Excavating RoI Attention for Underwater Object Detection

06/24/2022
by Xutao Liang, et al.

Self-attention is one of the most successful designs in deep learning: it computes the similarity between tokens and reconstructs features from the resulting attention matrix. Originally designed for NLP, self-attention is also popular in computer vision, where it can be categorized into pixel-level attention and patch-level attention. In object detection, RoI features can be seen as patches taken from the base feature maps. This paper applies an attention module to RoI features to improve detection performance. Instead of the original self-attention module, we use the external attention module, a modified self-attention with fewer parameters. Combined with the proposed double head structure and the Positional Encoding module, our method achieves promising performance in object detection. Comprehensive experiments show that the gains are especially pronounced on the underwater object detection dataset. The code will be available at: https://github.com/zsyasd/Excavating-RoI-Attention-for-Underwater-Object-Detection
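To make the idea of attention over RoI features concrete, here is a minimal NumPy sketch of an external-attention step applied to a set of flattened RoI feature vectors. It is an illustration under stated assumptions, not the authors' implementation: the function name `external_attention`, the memory matrices `m_k` and `m_v`, the tensor sizes, and the double normalization (softmax over RoIs followed by an L1 normalization over memory slots) follow the general external-attention formulation and are not taken from this abstract. The double head structure and the Positional Encoding module are not shown.

```python
import numpy as np


def external_attention(roi_feats, m_k, m_v, eps=1e-9):
    """Hypothetical external attention over RoI features.

    roi_feats: (N, d) array, one flattened feature vector per RoI.
    m_k, m_v:  (S, d) arrays, small learnable memories shared by all inputs
               (assumed here; random values stand in for learned weights).
    Returns an (N, d) array of reconstructed RoI features.
    """
    # Similarity between every RoI and every memory slot: (N, S).
    attn = roi_feats @ m_k.T
    # Softmax over the RoI axis (subtract the max for numerical stability).
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)
    # L1-normalize over the memory-slot axis so each RoI's weights sum to ~1.
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)
    # Rebuild each RoI feature as a weighted mix of the value memory rows.
    return attn @ m_v


rng = np.random.default_rng(0)
num_rois, dim, slots = 8, 16, 4  # illustrative sizes only
roi_feats = rng.standard_normal((num_rois, dim))
m_k = rng.standard_normal((slots, dim))
m_v = rng.standard_normal((slots, dim))

out = external_attention(roi_feats, m_k, m_v)
print(out.shape)
```

Because the memories have `slots` rows rather than one row per token, the attention map is `N x S` instead of `N x N`, which is where the reduction in parameters and computation relative to plain self-attention comes from in this sketch.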


research
06/23/2022

YOLOSA: Object detection based on 2D local feature superimposed self-attention

We analyzed the network structure of real-time object detection models a...
research
12/14/2020

Decoupled Self Attention for Accurate One Stage Object Detection

As the scale of object detection dataset is smaller than that of image r...
research
10/31/2021

DPNET: Dual-Path Network for Efficient Object Detection with Lightweight Self-Attention

Object detection often costs a considerable amount of computation to get...
research
08/24/2023

Learning Heavily-Degraded Prior for Underwater Object Detection

Underwater object detection suffers from low detection performance becau...
research
08/08/2022

QSAM-Net: Rain streak removal by quaternion neural network with self-attention module

Images captured in real-world applications in remote sensing, image or v...
research
09/15/2021

FFAVOD: Feature Fusion Architecture for Video Object Detection

A significant amount of redundancy exists between consecutive frames of ...
research
09/15/2023

M^3Net: Multilevel, Mixed and Multistage Attention Network for Salient Object Detection

Most existing salient object detection methods mostly use U-Net or featu...
