YONA: You Only Need One Adjacent Reference-frame for Accurate and Fast Video Polyp Detection

06/06/2023
by Yuncheng Jiang, et al.

Accurate polyp detection is essential for assisting clinical colorectal cancer diagnosis. Colonoscopy videos contain richer information than still images, making them a valuable resource for deep learning methods. Great efforts have been made to conduct video polyp detection through multi-frame temporal/spatial aggregation. However, unlike common fixed-camera video, the moving-camera scenes in colonoscopy videos can cause rapid video jitter, leading to unstable training for existing video detection models. Additionally, the concealed appearance of some polyps and the complex background environment further hinder the performance of existing video detectors. In this paper, we propose YONA (You Only Need one Adjacent Reference-frame), an efficient end-to-end training framework for video polyp detection. YONA fully exploits the information of one previous adjacent frame and conducts polyp detection on the current frame without multi-frame collaboration. Specifically, for the foreground, YONA adaptively aligns the current frame's channel activation patterns with those of its adjacent reference frame according to their foreground similarity. For the background, YONA conducts background dynamic alignment guided by the inter-frame difference to eliminate the invalid features produced by drastic spatial jitter. Moreover, YONA applies cross-frame contrastive learning during training, leveraging the ground-truth bounding boxes to improve the model's perception of polyps and the background. Quantitative and qualitative experiments on three challenging public benchmarks demonstrate that our proposed YONA outperforms previous state-of-the-art competitors by a large margin in both accuracy and speed.
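The two alignment ideas in the abstract can be illustrated with a minimal numpy sketch. Everything below is an assumption for illustration only: the function names (`channel_alignment`, `background_alignment`), the feature shapes, the masked-pooling descriptors, and the sigmoid gating are not taken from the YONA paper's implementation, merely one plausible reading of "aligning channel activation patterns by foreground similarity" and "background alignment guided by inter-frame difference".

```python
import numpy as np

def channel_alignment(curr_feat, ref_feat, fg_mask):
    """Hypothetical sketch of foreground channel alignment.

    curr_feat, ref_feat: (C, H, W) feature maps of the current frame and
    the single adjacent reference frame; fg_mask: (H, W) binary foreground
    mask (e.g. derived from the reference frame's box). All names and
    shapes are assumptions, not the authors' implementation.
    """
    C = curr_feat.shape[0]
    denom = fg_mask.sum() + 1e-6
    # Per-channel foreground descriptors via masked average pooling.
    ref_desc = (fg_mask[None] * ref_feat).reshape(C, -1).sum(1) / denom
    curr_desc = (fg_mask[None] * curr_feat).reshape(C, -1).sum(1) / denom
    # Foreground similarity per channel -> gate in (0, 1).
    sim = ref_desc * curr_desc
    sim = sim / (np.abs(sim).max() + 1e-6)
    gate = 1.0 / (1.0 + np.exp(-sim))  # sigmoid gate
    # Re-weight the current frame's channels toward the reference pattern.
    return gate[:, None, None] * curr_feat

def background_alignment(curr_feat, ref_feat, thresh=0.5):
    """Suppress positions where the inter-frame difference is large
    (drastic jitter) -- a rough analogue of background dynamic alignment."""
    diff = np.abs(curr_feat - ref_feat).mean(0)  # (H, W) difference map
    keep = (diff < thresh * diff.max()).astype(curr_feat.dtype)
    return keep[None] * curr_feat
```

In this reading, the foreground branch only rescales channels (preserving spatial detail), while the background branch zeroes out spatial positions whose features changed too abruptly between the two frames.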
