Discourse Analysis for Evaluating Coherence in Video Paragraph Captions

01/17/2022
by Arjun R Akula, et al.

Video paragraph captioning is the task of automatically generating a coherent paragraph description of the actions in a video. Previous linguistic studies have demonstrated that the coherence of a natural language text is reflected by its discourse structure and relations. However, existing video captioning methods evaluate the coherence of generated paragraphs merely by comparing them against human paragraph annotations, and they fail to reason about the underlying discourse structure. At UCLA, we are currently exploring a novel discourse-based framework to evaluate the coherence of video paragraphs. Central to our approach is the discourse representation of videos, which helps in modeling the coherence of paragraphs conditioned on the coherence of videos. We also introduce DisNet, a novel dataset containing the proposed visual discourse annotations of 3000 videos and their paragraphs. Our experimental results show that the proposed framework evaluates the coherence of video paragraphs significantly better than all the baseline methods. We believe that many other multidisciplinary Artificial Intelligence problems, such as Visual Dialog and Visual Storytelling, would also greatly benefit from the proposed visual discourse framework and the DisNet dataset.

research
03/06/2019

Visual Discourse Parsing

Text-level discourse parsing aims to unmask how two segments (or sentenc...
research
06/05/2016

Neural Net Models for Open-Domain Discourse Coherence

Discourse coherence is strongly associated with text quality, making it ...
research
12/31/2020

Towards Modelling Coherence in Spoken Discourse

While there has been significant progress towards modelling coherence in...
research
09/30/2020

Neural RST-based Evaluation of Discourse Coherence

This paper evaluates the utility of Rhetorical Structure Theory (RST) tr...
research
12/13/2018

Adversarial Inference for Multi-Sentence Video Description

While significant progress has been made in the image captioning task, v...
research
10/26/2022

Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis...
research
05/11/2020

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

Generating multi-sentence descriptions for videos is one of the most cha...
