Grounded Objects and Interactions for Video Captioning

11/16/2017
by   Chih-Yao Ma, et al.

We address the problem of video captioning by grounding language generation on object interactions in the video. Existing work on video understanding mostly focuses on overall scene understanding, with limited or no emphasis on object interactions. In this paper, we propose SINet-Caption, which learns to generate captions grounded in higher-order interactions between arbitrary groups of objects for fine-grained video understanding. We discuss the challenges and benefits of such an approach. We further demonstrate state-of-the-art results on the ActivityNet Captions dataset using SINet-Caption, our model based on this approach.

research
11/16/2017

Attend and Interact: Higher-Order Object Interactions for Video Understanding

Human actions often involve complex interactions across several inter-re...
research
09/07/2023

DetermiNet: A Large-Scale Diagnostic Dataset for Complex Visually-Grounded Referencing using Determiners

State-of-the-art visual grounding models can achieve high detection accu...
research
12/02/2021

Relational Graph Learning for Grounded Video Description Generation

Grounded video description (GVD) encourages captioning models to attend ...
research
03/26/2023

GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation

Despite the recent emergence of video captioning models, how to generate...
research
06/20/2023

Dense Video Object Captioning from Disjoint Supervision

We propose a new task and model for dense video object captioning – dete...
research
10/17/2016

Spatio-Temporal Attention Models for Grounded Video Captioning

Automatic video captioning is challenging due to the complex interaction...
research
12/17/2018

Grounded Video Description

Video description is one of the most challenging problems in vision and ...
