Multi-shot Temporal Event Localization: a Benchmark

12/17/2020
by   Xiaolong Liu, et al.

Current developments in temporal event or action localization usually target actions captured by a single camera. However, extensive events or actions in the wild may be captured as a sequence of shots by multiple cameras at different positions. In this paper, we propose a new and challenging task called multi-shot temporal event localization, and accordingly collect a large-scale dataset called MUlti-Shot EventS (MUSES). MUSES has 31,477 event instances for a total of 716 video hours. The defining property of MUSES is its frequent shot cuts, with an average of 19 shots per instance and 176 shots per video, which induce large intra-instance variations. Our comprehensive evaluations show that the state-of-the-art method in temporal action localization only achieves an mAP of 13.1. We also present an approach for handling the intra-instance variations, which reports an mAP of 18.9. To facilitate research in this direction, we release the dataset and the project code at https://songbai.site/muses.
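
To make the "shots per instance" statistic concrete, the following is a minimal sketch of how such a count could be computed from shot-cut timestamps and event boundaries. The function names, the data layout, and the toy numbers are assumptions for illustration only; they are not taken from the MUSES annotations or the released project code.

```python
from bisect import bisect_left, bisect_right


def shots_in_instance(cut_times, start, end):
    """Count the shots spanned by an event instance.

    cut_times: sorted timestamps (seconds) of shot cuts in the video.
    start, end: temporal boundaries of the instance, with start < end.
    """
    if start >= end:
        raise ValueError("instance must have start < end")
    # Index of the first cut strictly after `start`.
    lo = bisect_right(cut_times, start)
    # Index of the first cut at or after `end`.
    hi = bisect_left(cut_times, end)
    # k cuts strictly inside the interval split it into k + 1 shots.
    return (hi - lo) + 1


def mean_shots_per_instance(cut_times, instances):
    """Average shot count over a list of (start, end) instances."""
    counts = [shots_in_instance(cut_times, s, e) for s, e in instances]
    return sum(counts) / len(counts)


# Toy data (hypothetical): four cuts and two event instances in one video.
cuts = [2.0, 5.0, 9.0, 14.0]
events = [(1.0, 6.0), (9.0, 20.0)]
average = mean_shots_per_instance(cuts, events)
```

In this toy video the first instance contains two cuts and the second contains one, so the instances span three and two shots respectively. A cut that coincides exactly with an instance boundary is not counted as falling inside the instance, which is one possible convention among several.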


research
01/09/2016

Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs

We address temporal action localization in untrimmed long videos. This i...
research
11/30/2020

Video Self-Stitching Graph Network for Temporal Action Localization

Temporal action localization (TAL) in videos is a challenging task, espe...
research
04/12/2018

SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

In this paper, we introduce SoccerNet, a benchmark for action spotting i...
research
04/06/2021

Few-Shot Transformation of Common Actions into Time and Space

This paper introduces the task of few-shot common action localization in...
research
10/02/2020

AVECL-UMONS database for audio-visual event classification and localization

We introduce the AVECL-UMons dataset for audio-visual event classificati...
research
02/15/2021

RMS-Net: Regression and Masking for Soccer Event Spotting

The recently proposed action spotting task consists in finding the exact...
research
08/10/2022

Automatic Camera Control and Directing with an Ultra-High-Definition Collaborative Recording System

Capturing an event from multiple camera angles can give a viewer the mos...
