Measuring the Quality of Text-to-Video Model Outputs: Metrics and Dataset

09/14/2023
by Iya Chivileva, et al.

Evaluating the quality of videos generated by text-to-video (T2V) models is important if those models are to produce plausible outputs that convince a viewer of their authenticity. We examine some of the metrics commonly used in this area and highlight their limitations. We present a dataset of more than 1,000 videos generated by 5 very recent T2V models, to which several of those commonly used quality metrics are applied. We also include extensive human quality evaluations of those videos, allowing the relative strengths and weaknesses of the metrics, including human assessment, to be compared. Our contribution is an assessment of commonly used quality metrics and a comparison of their performance, and that of human evaluation, on an open dataset of T2V videos. We conclude that both naturalness and semantic matching with the text prompt are important in assessing T2V model output, but that no single existing measure captures these subtleties.
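One common family of semantic-matching metrics scores a generated video by embedding the text prompt and each video frame in a joint text-image space (e.g. with a CLIP-style encoder) and averaging the prompt-frame cosine similarities. A minimal sketch of that idea, using toy precomputed embeddings — the function names and the embeddings themselves are illustrative assumptions, not taken from the paper:

```python
import math

def cosine_similarity(u, v):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def prompt_video_similarity(prompt_embedding, frame_embeddings):
    # Average prompt-frame cosine similarity across all frames of the video.
    sims = [cosine_similarity(prompt_embedding, f) for f in frame_embeddings]
    return sum(sims) / len(sims)

# Toy 2-D embeddings; in practice these would come from a pretrained
# joint text-image encoder applied to the prompt and sampled frames.
prompt = [1.0, 0.0]
frames = [[1.0, 0.0], [0.0, 1.0]]
print(round(prompt_video_similarity(prompt, frames), 2))  # 0.5
```

A metric like this captures prompt-video semantic agreement but says nothing about the naturalness of the motion or imagery, which is one reason the abstract argues no single measure suffices.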

