Progressively Normalized Self-Attention Network for Video Polyp Segmentation

05/18/2021
by Ge-Peng Ji, et al.

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs cannot fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which efficiently learns representations from polyp videos at real-time speed (~140 fps) on a single RTX 2080 GPU without post-processing. Our PNS-Net is built on a basic normalized self-attention block equipped with recurrence and CNNs. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategies. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.
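
To make the idea of attending over successive frames more concrete, here is a minimal NumPy sketch of plain scaled dot-product self-attention applied jointly across space and time, with the channels split into groups. It is not the authors' implementation: the function name `channel_split_self_attention`, the `num_groups` parameter, the input layout `(T, H, W, C)`, and the use of the raw features as queries, keys, and values (no learned projections, no recurrence, no progressive stacking) are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_split_self_attention(feats, num_groups=4):
    """Hypothetical helper: self-attention over all positions of all frames.

    feats: array of shape (T, H, W, C) holding features for T video frames.
    Returns (output with the same shape as feats, attention weights).
    """
    T, H, W, C = feats.shape
    assert C % num_groups == 0, "channels must divide evenly into groups"
    d = C // num_groups

    # Flatten every spatial position of every frame into one token sequence,
    # then split the channel axis into groups: (G, N, d) with N = T * H * W.
    tokens = feats.reshape(T * H * W, num_groups, d).transpose(1, 0, 2)

    # Assumption: queries, keys, and values are the raw tokens themselves.
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)  # (G, N, N)
    weights = softmax(scores, axis=-1)  # each row sums to 1 (soft attention)
    out = weights @ tokens  # (G, N, d)

    # Merge the groups back into the channel axis and restore (T, H, W, C).
    out = out.transpose(1, 0, 2).reshape(T, H, W, C)
    return out, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal((3, 4, 4, 8))  # 3 frames, 4x4 grid, 8 channels
    out, weights = channel_split_self_attention(clip, num_groups=4)
    print(out.shape, weights.shape)
```

Because every token can attend to every position in every frame, the receptive field in this sketch is global in both space and time, which is the property the abstract contrasts with CNNs. The attention matrix grows quadratically with `T * H * W`, so a practical model would need some way to limit that cost.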

