Semantically Sensible Video Captioning (SSVC)

09/15/2020
by   Md. Mushfiqur Rahman, et al.

Video captioning, i.e., the task of generating captions from video sequences, creates a bridge between the Natural Language Processing and Computer Vision domains of computer science. Generating a semantically accurate description of a video is an arduous task. Considering the complexity of the problem, the results obtained in recent research are quite outstanding, but there is still considerable room for improvement. Most video captioning models comprise two sequential/recurrent layers: one acting as a video-to-context encoder and the other as a context-to-caption decoder. This paper proposes a novel architecture, SSVC (Semantically Sensible Video Captioning), which modifies the context generation mechanism using two novel approaches: "stacked attention" and "spatial hard pull". To evaluate the proposed architecture, we use the BLEU scoring metric for quantitative analysis and a human evaluation metric for qualitative analysis. This paper refers to the proposed human evaluation metric as the Semantic Sensibility (SS) scoring metric. The SS score overcomes the shortcomings of common automated scoring metrics. This paper reports that the use of the aforementioned novelties improves the performance of state-of-the-art architectures.
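For readers unfamiliar with the automated metric mentioned in the abstract, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the authors' evaluation code; the function names `ngrams`, `modified_precision`, and `bleu`, the whitespace tokenization, and the absence of smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (illustrative only)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    c = len(candidate)
    # Effective reference length: the reference closest in length to the candidate.
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical captions, not taken from the paper.
reference = "a man is playing a guitar".split()
candidate = "a man is playing a piano".split()
print(bleu(candidate, [reference]))
```

Because the score depends only on n-gram overlap, a caption that names the wrong object can still score well, which is the kind of shortcoming a human-judged metric is meant to address.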


research
04/22/2023

A Review of Deep Learning for Video Captioning

Video captioning (VC) is a fast-moving, cross-disciplinary area of resea...
research
03/06/2023

Models See Hallucinations: Evaluating the Factuality in Video Captioning

Video captioning aims to describe events in a video with natural languag...
research
02/12/2021

Annotation Cleaning for the MSR-Video to Text Dataset

The video captioning task is to describe the video contents with natural...
research
11/27/2019

Non-Autoregressive Video Captioning with Iterative Refinement

Existing state-of-the-art autoregressive video captioning methods (ARVC)...
research
03/26/2023

SEM-POS: Grammatically and Semantically Correct Video Captioning

Generating grammatically and semantically correct captions in video capt...
research
07/25/2021

Boosting Video Captioning with Dynamic Loss Network

Video captioning is one of the challenging problems at the intersection ...
