Deep Transformers for Fast Small Intestine Grounding in Capsule Endoscope Video

04/07/2021
by Xinkai Zhao, et al.

Capsule endoscopy is an evolutionary technique for examining and diagnosing intractable gastrointestinal diseases. Because of the huge amount of data, analyzing capsule endoscope videos is very time-consuming and labor-intensive for gastrointestinal specialists. Intelligent long-video analysis algorithms for regional positioning and analysis of capsule endoscopic video are therefore essential to reduce the workload of clinicians and to help improve the accuracy of disease diagnosis. In this paper, we propose a deep model to ground the shooting range of the small intestine in a capsule endoscope video that lasts tens of hours. This is the first attempt to tackle the small intestine grounding task with a deep neural network. We model the task as a 3-way classification problem, in which every video frame is categorized as esophagus/stomach, small intestine, or colorectum. To capture long-range temporal dependencies, a transformer module is built to fuse the features of multiple neighboring frames. Based on the classification model, we devise an efficient search algorithm to locate the starting and ending shooting boundaries of the small intestine. Rather than searching the full video exhaustively, our method iteratively splits the current video segment in the middle and continues in the direction of the target boundary. We collect 113 videos from a local hospital to validate our method. In 5-fold cross-validation, the average IoU between the small intestine segments located by our method and the ground truths annotated by board-certified gastroenterologists reaches 0.945.
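The boundary search and the IoU metric described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses plain bisection and assumes an idealized, noise-free frame classifier whose labels never decrease along the video (esophagus/stomach, then small intestine, then colorectum). The names `classify`, `first_index_at_least`, `ground_small_intestine`, and `temporal_iou`, as well as the integer label encoding, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Tuple

# Hypothetical integer encoding of the three frame classes (an assumption).
STOMACH, SMALL_INTESTINE, COLORECTUM = 0, 1, 2


def first_index_at_least(classify: Callable[[int], int],
                         n_frames: int, label: int) -> int:
    """Smallest frame index whose predicted label is >= `label`.

    Assumes predictions are non-decreasing along the video (an idealization).
    Returns n_frames if no frame reaches `label`.
    """
    lo, hi = 0, n_frames
    while lo < hi:
        mid = (lo + hi) // 2          # split the current segment in the middle
        if classify(mid) >= label:
            hi = mid                  # boundary is at mid or to the left
        else:
            lo = mid + 1              # boundary is to the right of mid
    return lo


def ground_small_intestine(classify: Callable[[int], int],
                           n_frames: int) -> Tuple[int, int]:
    """Return the small intestine segment as a half-open range [start, end)."""
    start = first_index_at_least(classify, n_frames, SMALL_INTESTINE)
    end = first_index_at_least(classify, n_frames, COLORECTUM)
    return start, end


def temporal_iou(pred: Tuple[int, int], truth: Tuple[int, int]) -> float:
    """Intersection over union of two half-open frame ranges."""
    inter = max(0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - inter
    return inter / union if union > 0 else 0.0
```

Under these assumptions, each boundary costs on the order of log2(n) classifier calls instead of one call per frame, which is why avoiding an exhaustive scan matters for videos of this length. A real classifier makes errors near the boundaries, so a practical search would need to tolerate non-monotone predictions; the sketch does not address that.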


