
ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos

07/11/2019
by   Shizhe Chen, et al.
Renmin University of China
Microsoft
Carnegie Mellon University

Contextual reasoning is essential for understanding events in long untrimmed videos. In this work, we systematically explore different captioning models with various contexts for the dense-captioning events in videos task, which aims to generate captions for the different events occurring in an untrimmed video. We propose five types of contexts and two categories of event captioning models, and evaluate their contributions to event captioning in terms of both accuracy and diversity. The proposed captioning models are plugged into our pipeline system for the dense video captioning challenge. The overall system achieves state-of-the-art performance on the dense-captioning events in videos task, with a METEOR score of 9.91 on the challenge testing set.
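Systems for this task first localize candidate event segments and then caption each one; a proposal is matched to a ground-truth event by temporal intersection-over-union (tIoU), the standard overlap measure in the ActivityNet Captions benchmark. A minimal sketch (the helper name is our own, not from the paper):

```python
def temporal_iou(a, b):
    """Temporal IoU between two event segments given as (start, end) times."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

# Example: a proposal (0s-10s) vs. a ground-truth event (5s-15s)
print(temporal_iou((0.0, 10.0), (5.0, 15.0)))  # 5 / 15 ≈ 0.333
```

Evaluation for the challenge averages caption metrics such as METEOR over proposals matched at several tIoU thresholds (typically 0.3, 0.5, 0.7, 0.9).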


Related papers:

- RUC+CMU: System Report for Dense Captioning Events in Videos (06/22/2018): This notebook paper presents our system in the ActivityNet Dense Caption...
- Dense-Captioning Events in Videos (05/02/2017): Most natural videos contain numerous events. For example, in a video of ...
- Towards Diverse Paragraph Captioning for Untrimmed Videos (05/30/2021): Video paragraph captioning aims to describe multiple events in untrimmed...
- Semantic Metadata Extraction from Dense Video Captioning (11/05/2022): Annotation of multimedia data by humans is time-consuming and costly, wh...
- Multimodal Pretraining for Dense Video Captioning (11/10/2020): Learning specific hands-on skills such as cooking, car maintenance, and ...
- SAVCHOI: Detecting Suspicious Activities using Dense Video Captioning with Human Object Interactions (07/24/2022): Detecting suspicious activities in surveillance videos has been a longst...