Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline

06/23/2022
by Kailai Zhou, et al.

We investigate a rarely explored task, Insubstantial Object Detection (IOD), which aims to localize objects with the following characteristics: (1) amorphous shape with indistinct boundaries; (2) similarity to the surroundings; (3) absence of color. Consequently, insubstantial objects are far harder to distinguish in a single static frame, and a joint representation of spatial and temporal information is crucial. We therefore construct the IOD-Video dataset, comprising 600 videos (141,017 frames) that cover various distances, sizes, visibility levels, and scenes captured in different spectral ranges. In addition, we develop a spatio-temporal aggregation framework for IOD, in which different backbones are deployed and a spatio-temporal aggregation loss (STAloss) is designed to leverage consistency along the time axis. Experiments on the IOD-Video dataset demonstrate that spatio-temporal aggregation can significantly improve IOD performance. We hope our work will attract further research into this valuable yet challenging task. The code will be available at: <https://github.com/CalayZhou/IOD-Video>.


