Evaluation of Automatic Video Captioning Using Direct Assessment

10/29/2017
by Yvette Graham, et al.

We present Direct Assessment, a method for manually assessing the quality of automatically generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no single definitive ground truth or correct answer against which to measure. Automatic metrics such as BLEU and METEOR, drawn from machine translation evaluation and used to compare automatically generated video captions against a manual reference caption, were applied in the TRECVid video captioning task in 2016, but these are shown to have weaknesses. The work presented here brings human assessment into the evaluation by crowdsourcing judgements of how well a caption describes a video. We automatically degrade the quality of some sample captions, which are then assessed manually, and from this we are able to rate the reliability of the human assessors, a factor we take into account in the evaluation. Using data from the TRECVid video-to-text task in 2016, we show that our Direct Assessment method is replicable and robust, and that it should scale to settings where there are many caption-generation techniques to be evaluated.
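To make the quality-control and scoring steps described above concrete, the following is a minimal sketch in Python of how they could be implemented; it is not the authors' released code. The data layouts and names (qc_pairs, ratings) are illustrative assumptions, as is the specific choice of a one-sided Wilcoxon signed-rank test for filtering assessors; the per-assessor z-score standardisation step reflects how Direct Assessment is typically applied in machine translation evaluation.

import numpy as np
from scipy.stats import wilcoxon

# Quality control: each item pairs an assessor's 0-100 rating of an
# original caption with their rating of an automatically degraded
# version of the same caption for the same video (hypothetical data).
qc_pairs = {
    "assessor_a": [(78, 41), (90, 55), (63, 30), (85, 60), (73, 35)],
    "assessor_b": [(50, 52), (61, 58), (45, 47), (66, 64), (55, 53)],
}

def passes_quality_control(pairs, alpha=0.05):
    """Keep an assessor only if they rate original captions
    significantly higher than their degraded counterparts
    (one-sided Wilcoxon signed-rank test on the paired ratings)."""
    originals = [o for o, _ in pairs]
    degraded = [d for _, d in pairs]
    _, p = wilcoxon(originals, degraded, alternative="greater")
    return p < alpha

reliable = {a for a, pairs in qc_pairs.items()
            if passes_quality_control(pairs)}
# assessor_a consistently prefers originals and passes;
# assessor_b rates near-randomly and is filtered out.

# Raw 0-100 ratings keyed by (assessor, system) (hypothetical data).
ratings = {
    ("assessor_a", "system_1"): [70, 82, 65, 77],
    ("assessor_a", "system_2"): [40, 55, 48, 52],
    ("assessor_b", "system_1"): [55, 60, 58, 57],
}

def standardise_and_average(ratings, reliable):
    """Z-standardise each reliable assessor's ratings by that
    assessor's own mean and standard deviation, so that assessors who
    use the rating scale differently contribute on an equal footing,
    then average the standardised scores per captioning system."""
    pooled = {}
    for (assessor, _), scores in ratings.items():
        if assessor in reliable:
            pooled.setdefault(assessor, []).extend(scores)
    stats = {a: (np.mean(s), np.std(s)) for a, s in pooled.items()}

    per_system = {}
    for (assessor, system), scores in ratings.items():
        if assessor not in reliable:
            continue
        mu, sigma = stats[assessor]
        if sigma == 0:
            continue
        per_system.setdefault(system, []).extend(
            (s - mu) / sigma for s in scores)
    return {sys: float(np.mean(zs)) for sys, zs in per_system.items()}

print(sorted(reliable))
print(standardise_and_average(ratings, reliable))

In a real evaluation the significance test would be applied per worker over many quality-control items collected during the crowdsourcing run, and the averaged standardised scores would form the system-level ranking.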

