Evaluating Automatically Generated Phoneme Captions for Images

07/31/2020
by Justin van der Hout, et al.

Image2Speech is the relatively new task of generating a spoken description of an image. This paper investigates how this task should be evaluated. First, an Image2Speech system was implemented that generates image captions consisting of phoneme sequences; this system outperformed the original Image2Speech system on the Flickr8k corpus. Subsequently, these phoneme captions were converted into sentences of words, and human evaluators rated how well each caption describes its image. Finally, several objective metric scores of the results were correlated with these human ratings. Although BLEU4 does not correlate perfectly with human ratings, it obtained the highest correlation among the investigated metrics and is the best currently existing metric for the Image2Speech task. Current metrics are limited by the fact that they assume their input to be words. A more appropriate metric for the Image2Speech task should instead assume its input to be parts of words, i.e. phonemes.
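
The evaluation idea in the abstract (score each caption with an objective metric, then correlate those scores with human ratings) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the add-one smoothing, the single reference per caption, the choice of Pearson correlation, and all example captions and ratings are assumptions made for the sketch. The functions `ngrams`, `bleu4`, and `pearson` are hypothetical helper names.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Simplified sentence-level BLEU-4 against a single reference.

    Assumption: add-one smoothing on each n-gram precision, so that a
    missing 4-gram does not force the score to zero.
    """
    if not candidate:
        return 0.0
    log_precision = 0.0
    for n in range(1, 5):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it occurs in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / 4
    # Brevity penalty for candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_precision)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Made-up example data: (generated caption, reference caption) pairs and
# hypothetical human ratings for the generated captions.
pairs = [
    ("a dog runs on the grass".split(), "a dog runs on the grass".split()),
    ("a dog on grass".split(), "a dog runs on the grass".split()),
    ("two men play guitar".split(), "a dog runs on the grass".split()),
]
human_ratings = [4.0, 3.0, 1.0]

metric_scores = [bleu4(cand, ref) for cand, ref in pairs]
correlation = pearson(metric_scores, human_ratings)
print(metric_scores, correlation)
```

Because `bleu4` only sees lists of tokens, the same function can be applied to phoneme sequences by passing lists of phoneme symbols instead of words, which is the kind of sub-word input the abstract argues a better metric should assume.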

research
09/04/2019

TIGEr: Text-to-Image Grounding for Image Caption Evaluation

This paper presents a new metric called TIGEr for the automatic evaluati...
research
05/06/2019

Image Captioning with Clause-Focused Metrics in a Multi-Modal Setting for Marketing

Automatically generating descriptive captions for images is a well-resea...
research
07/29/2016

SPICE: Semantic Propositional Image Caption Evaluation

There is considerable interest in the task of automatically generating i...
research
09/28/2021

CIDEr-R: Robust Consensus-based Image Description Evaluation

This paper shows that CIDEr-D, a traditional evaluation metric for image...
research
10/12/2018

Pre-gen metrics: Predicting caption quality metrics without generating captions

Image caption generation systems are typically evaluated against referen...
research
12/24/2020

WEmbSim: A Simple yet Effective Metric for Image Captioning

The area of automatic image caption evaluation is still undergoing inten...
research
10/20/2022

Communication breakdown: On the low mutual intelligibility between human and neural captioning

We compare the 0-shot performance of a neural caption-based image retrie...
