Text-based Localization of Moments in a Video Corpus

08/20/2020
by Sudipta Paul, et al.

Prior works on text-based video moment localization focus on temporally grounding a textual query in an untrimmed video. These works assume that the relevant video is already known and attempt to localize the moment in that video only. In contrast, we relax this assumption and address the task of localizing moments in a corpus of videos for a given sentence query. This task poses a unique challenge, as the system is required to perform: (i) retrieval of the relevant video, where only a segment of the video corresponds to the queried sentence, and (ii) temporal localization of the moment in the relevant video based on the sentence query. To overcome this challenge, we propose the Hierarchical Moment Alignment Network (HMAN), which learns an effective joint embedding space for moments and sentences. In addition to learning subtle differences between intra-video moments, HMAN focuses on distinguishing inter-video global semantic concepts based on sentence queries. Qualitative and quantitative results on three benchmark text-based video moment retrieval datasets (Charades-STA, DiDeMo, and ActivityNet Captions) demonstrate that our method achieves promising performance on the proposed task of temporal localization of moments in a corpus of videos.
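
To make the corpus-level task concrete, the following is a minimal sketch of how a query could be matched against candidate moments from every video once moments and sentences share an embedding space. It is not the HMAN model: the embeddings are hand-made toy vectors, cosine similarity is an assumed scoring function, and the names `Moment`, `cosine`, and `localize_in_corpus` are hypothetical. It only illustrates that a single ranking over all moments yields both the retrieved video and the temporal boundaries.

```python
import math
from typing import List, Tuple

Vector = List[float]
# Hypothetical record: (video_id, start_sec, end_sec, moment_embedding)
Moment = Tuple[str, float, float, Vector]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (assumed scoring)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def localize_in_corpus(
    query: Vector, corpus: List[Moment], top_k: int = 1
) -> List[Tuple[float, str, float, float]]:
    """Score every candidate moment of every video and return the top_k.

    Each result is (score, video_id, start_sec, end_sec), so the best entry
    names both the retrieved video and the localized segment.
    """
    scored = [
        (cosine(query, emb), vid, start, end) for vid, start, end, emb in corpus
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Toy embeddings for illustration only; a real system would learn these.
corpus = [
    ("video_a", 0.0, 5.0, [1.0, 0.0, 0.0]),
    ("video_a", 5.0, 10.0, [0.0, 1.0, 0.0]),
    ("video_b", 2.0, 8.0, [0.6, 0.8, 0.0]),
]
query = [0.0, 1.0, 0.0]  # stands in for an embedded sentence query
best = localize_in_corpus(query, corpus, top_k=2)
for score, vid, start, end in best:
    print(f"{vid} [{start:.1f}s, {end:.1f}s] score={score:.3f}")
```

In this toy corpus, two moments from the same video compete with a moment from a different video, which mirrors the intra-video and inter-video distinctions the abstract describes.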


research
07/30/2019

Temporal Localization of Moments in Video Collections with Natural Language

In this paper, we introduce the task of retrieving relevant video moment...
research
10/23/2022

Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval

Video corpus moment retrieval (VCMR) is the task to retrieve the most re...
research
04/05/2019

Weakly Supervised Video Moment Retrieval From Text Queries

There have been a few recent methods proposed in text to video moment re...
research
03/29/2023

Hierarchical Video-Moment Retrieval and Step-Captioning

There is growing interest in searching for information from large video ...
research
06/25/2021

Video Moment Retrieval with Text Query Considering Many-to-Many Correspondence Using Potentially Relevant Pair

In this paper we undertake the task of text-based video moment retrieval...
research
10/15/2022

Semantic Video Moments Retrieval at Scale: A New Task and a Baseline

Motivated by the increasing need of saving search effort by obtaining re...
research
10/31/2021

Hierarchical Deep Residual Reasoning for Temporal Moment Localization

Temporal Moment Localization (TML) in untrimmed videos is a challenging ...
