Regularized Two-Branch Proposal Networks for Weakly-Supervised Moment Retrieval in Videos

08/19/2020
by Zhu Zhang, et al.

Video moment retrieval aims to localize the target moment in a video according to a given sentence. The weakly-supervised setting provides only video-level sentence annotations during training. Most existing weakly-supervised methods apply a framework based on multiple instance learning (MIL) to develop inter-sample confrontment, but ignore the intra-sample confrontment between moments with semantically similar contents. As a result, these methods fail to distinguish the target moment from plausible negative moments. In this paper, we propose a novel Regularized Two-Branch Proposal Network that considers the inter-sample and intra-sample confrontments simultaneously. Concretely, we first devise a language-aware filter to generate an enhanced video stream and a suppressed video stream. We then design a sharable two-branch proposal module that generates positive proposals from the enhanced stream and plausible negative proposals from the suppressed one for sufficient confrontment. Further, we apply proposal regularization to stabilize the training process and improve model performance. Extensive experiments show the effectiveness of our method. Our code is publicly released.
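
The language-aware filter in the abstract can be illustrated with a small sketch. The abstract does not specify how frames are scored, so everything here is an assumption made for illustration: the function name `language_aware_filter`, the use of cosine similarity between each frame feature and a single sentence feature, and the min-max normalization of the scores. The only idea taken from the text is that one stream emphasizes language-relevant frames while the other suppresses them.

```python
import math


def language_aware_filter(frames, sentence):
    """Split frame features into an enhanced and a suppressed stream.

    frames:   non-empty list of per-frame feature vectors (lists of floats).
    sentence: one sentence feature vector of the same dimension.

    Assumption (not from the paper): relevance is the cosine similarity
    between a frame and the sentence, min-max normalized to [0, 1].
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm > 0 else 0.0

    sims = [cosine(f, sentence) for f in frames]
    lo, hi = min(sims), max(sims)
    span = hi - lo
    # Fall back to a neutral weight of 0.5 when every frame scores the same.
    scores = [(s - lo) / span if span > 0 else 0.5 for s in sims]

    # Enhanced stream: frames weighted by their relevance to the sentence.
    enhanced = [[w * x for x in f] for w, f in zip(scores, frames)]
    # Suppressed stream: the complementary weighting of the same frames.
    suppressed = [[(1.0 - w) * x for x in f] for w, f in zip(scores, frames)]
    return scores, enhanced, suppressed


# Toy usage with three 2-D frame features and one sentence feature.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence = [1.0, 0.0]
scores, enhanced, suppressed = language_aware_filter(frames, sentence)
```

Under this assumed weighting the two streams sum back to the original frame features, so the suppressed stream keeps exactly the content that the enhanced stream down-weights. In the setting the abstract describes, a shared proposal module would then draw positive proposals from `enhanced` and plausible negative proposals from `suppressed`.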

research
11/19/2019

Weakly-Supervised Video Moment Retrieval via Semantic Completion Network

Video moment retrieval is to search the moment that is most relevant to ...
research
03/16/2020

Weakly-Supervised Multi-Level Attentional Reconstruction Network for Grounding Textual Queries in Videos

The task of temporally grounding textual queries in videos is to localiz...
research
08/24/2020

VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval

Video Moment Retrieval (VMR) is a task to localize the temporal moment i...
research
09/22/2021

Natural Language Video Localization with Learnable Moment Proposals

Given an untrimmed video and a natural language query, Natural Language ...
research
01/20/2021

Online Active Proposal Set Generation for Weakly Supervised Object Detection

To reduce the manpower consumption on box-level annotations, many weakly...
research
06/25/2021

Video Moment Retrieval with Text Query Considering Many-to-Many Correspondence Using Potentially Relevant Pair

In this paper we undertake the task of text-based video moment retrieval...
research
03/12/2023

Towards Diverse Temporal Grounding under Single Positive Labels

Temporal grounding aims to retrieve moments of the described event withi...
