Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

09/01/2020
by   Mayu Otani, et al.

Query-based moment retrieval is the problem of localising a specific clip in an untrimmed video according to a query sentence. This is a challenging task that requires interpreting both the natural language query and the video content. As in many other areas of computer vision and machine learning, progress in query-based moment retrieval is heavily driven by benchmark datasets, and their quality therefore has a significant impact on the field. In this paper, we present a series of experiments assessing how well benchmark results reflect true progress in solving the moment retrieval task. Our results indicate substantial biases in popular datasets and unexpected behaviour of state-of-the-art models. Moreover, we present new sanity check experiments and approaches for visualising the results. Finally, we suggest possible directions for improving temporal sentence grounding in the future. Our code for this paper is available at https://mayu-ot.github.io/hidden-challenges-MR .

research
08/20/2019

Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention

This paper studies the problem of temporal moment localization in a long...
research
10/13/2020

DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

This paper studies the task of temporal moment localization in a long un...
research
09/30/2020

Encode the Unseen: Predictive Video Hashing for Scalable Mid-Stream Retrieval

This paper tackles a new problem in computer vision: mid-stream video-to...
research
06/03/2021

Deconfounded Video Moment Retrieval with Causal Intervention

We tackle the task of video moment retrieval (VMR), which aims to locali...
research
08/19/2020

Generating Adjacency Matrix for Video-Query based Video Moment Retrieval

In this paper, we continue our work on Video-Query based Video Moment re...
research
03/24/2023

Query-Dependent Video Representation for Moment Retrieval and Highlight Detection

Recently, video moment retrieval and highlight detection (MR/HD) are bei...
research
01/22/2021

A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics

Despite Temporal Sentence Grounding in Videos (TSGV) has realized impres...
