Spatio-temporal Video Re-localization by Warp LSTM

05/10/2019
by Yang Feng, et al.

The need to efficiently find the video content a user wants is growing because of the explosion of user-generated videos on the Web. Existing keyword-based or content-based video retrieval methods usually determine what occurs in a video, but not when and where. In this paper, we address the question of when and where by formulating a new task, namely spatio-temporal video re-localization. Specifically, given a query video and a reference video, spatio-temporal video re-localization aims to localize tubelets in the reference video that semantically correspond to the query. To accurately localize the desired tubelets in the reference video, we propose a novel warp LSTM network, which propagates spatio-temporal information over a long period and thereby captures the corresponding long-term dependencies. Another issue for spatio-temporal video re-localization is the lack of properly labeled video datasets. Therefore, we reorganize the videos in the AVA dataset to form a new dataset for spatio-temporal video re-localization research. Extensive experimental results show that the proposed model achieves superior performance over the designed baselines on the spatio-temporal video re-localization task.
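
To make the output of the task concrete, the sketch below shows one possible way to represent a tubelet (a box per frame) and to score how well a localized tubelet overlaps a labeled one. Everything here is an illustrative assumption: the `Tubelet` class, the `(x1, y1, x2, y2)` box layout, and the `tubelet_iou` overlap measure are hypothetical names and choices, not the representation or evaluation metric used in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box layout: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class Tubelet:
    """Hypothetical tubelet: one bounding box per frame index."""
    boxes: Dict[int, Box]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tubelet_iou(pred: Tubelet, truth: Tubelet) -> float:
    """Assumed overlap measure: per-frame IoU summed over frames present in
    both tubelets, divided by the number of frames present in either one."""
    all_frames = set(pred.boxes) | set(truth.boxes)
    if not all_frames:
        return 0.0
    shared = set(pred.boxes) & set(truth.boxes)
    total = sum(box_iou(pred.boxes[f], truth.boxes[f]) for f in shared)
    return total / len(all_frames)


# A prediction that covers the right region but is shifted in time.
same_box = (0.0, 0.0, 10.0, 10.0)
pred = Tubelet({f: same_box for f in range(0, 4)})   # frames 0..3
truth = Tubelet({f: same_box for f in range(2, 6)})  # frames 2..5
score = tubelet_iou(pred, truth)
```

Under this assumed measure, a tubelet is penalized both for placing boxes in the wrong region of a frame (where) and for covering the wrong frames (when), which mirrors the two questions the task is meant to answer.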


research
06/28/2018

Modeling Spatio-Temporal Human Track Structure for Action Localization

This paper addresses spatio-temporal localization of human actions in vi...
research
09/18/2023

Spatio-temporal Co-attention Fusion Network for Video Splicing Localization

Digital video splicing has become easy and ubiquitous. Malicious users c...
research
08/05/2018

Video Re-localization

Many methods have been developed to help people find the video contents ...
research
10/03/2021

Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction

Ever-increasing smartphone-generated video content demands intelligent t...
research
10/17/2016

Spatio-Temporal Attention Models for Grounded Video Captioning

Automatic video captioning is challenging due to the complex interaction...
research
03/23/2023

VADER: Video Alignment Differencing and Retrieval

We propose VADER, a spatio-temporal matching, alignment, and change summ...
research
11/28/2010

Video Stippling

In this paper, we consider rendering color videos using a non-photo-real...
