QACE: Asking Questions to Evaluate an Image Caption

08/28/2021
by   Hwanhee Lee, et al.

In this paper, we propose QACE, a new metric based on Question Answering for Caption Evaluation. QACE generates questions from the evaluated caption and checks its content by asking those questions on either the reference caption or the source image. We first develop QACE-Ref, which compares the answers obtained from the evaluated caption with those obtained from its reference, and report results competitive with state-of-the-art metrics. To go further, we propose QACE-Img, which asks the questions directly on the image instead of the reference. QACE-Img requires a Visual-QA system. Unfortunately, standard VQA models are framed as classification over only a few thousand answer categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE-Img, is multi-modal, reference-less, and explainable. Our experiments show that QACE-Img compares favorably with other reference-less metrics. We will release the pre-trained models to compute QACE.
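The question-then-compare loop described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `qace_score`, `token_f1`, and the toy question generator and answerer are hypothetical stand-ins, not the released QACE models, and token-level F1 is only one plausible way to compare two answers. Passing a reference caption as `context` mimics QACE-Ref; a QACE-Img variant would instead pass the image and a visual answerer.

```python
from collections import Counter
from typing import Callable, List


def token_f1(pred: str, gold: str) -> float:
    """Token-level F1 between two answer strings (assumed comparison function)."""
    p, g = pred.lower().split(), gold.lower().split()
    if not p or not g:
        # Two empty answers match; one empty answer does not.
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def qace_score(
    candidate: str,
    context: str,
    generate_questions: Callable[[str], List[str]],
    answer: Callable[[str, str], str],
) -> float:
    """Average answer agreement between the candidate caption and the context.

    Questions are generated from the candidate, then answered once on the
    candidate and once on the context (here, a reference caption).
    """
    questions = generate_questions(candidate)
    if not questions:
        return 0.0
    scores = [
        token_f1(answer(q, candidate), answer(q, context)) for q in questions
    ]
    return sum(scores) / len(scores)


# Hypothetical stand-ins for the learned question generation and QA models.
CANDIDATE = "a man rides a horse"
REFERENCE = "a man rides a brown horse"
TOY_ANSWERS = {
    ("who rides?", CANDIDATE): "a man",
    ("who rides?", REFERENCE): "a man",
    ("what is ridden?", CANDIDATE): "a horse",
    ("what is ridden?", REFERENCE): "a brown horse",
}


def toy_generate_questions(caption: str) -> List[str]:
    return ["who rides?", "what is ridden?"]


def toy_answer(question: str, text: str) -> str:
    return TOY_ANSWERS.get((question, text), "")


score = qace_score(CANDIDATE, REFERENCE, toy_generate_questions, toy_answer)
```

In this toy case the first question yields identical answers and the second yields a partial match, so the score falls below 1. Keeping the per-question scores, rather than only their mean, is what would make such a metric inspectable question by question.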


