Evaluating Object Hallucination in Large Vision-Language Models

05/17/2023
by Yifan Li, et al.

Inspired by the superior language abilities of large language models (LLMs), large vision-language models (LVLMs) have recently been explored that integrate powerful LLMs to improve performance on complex multimodal tasks. Despite the promising progress of LVLMs, we find that they suffer from the hallucination problem, i.e., they tend to generate descriptions containing objects that are inconsistent with the target images. To investigate this issue, this work presents the first systematic study of object hallucination in LVLMs. We conduct evaluation experiments on several representative LVLMs and show that they mostly suffer from severe object hallucination. We further discuss how visual instructions may influence hallucination and find that objects that frequently occur in the visual instructions, or that frequently co-occur with the objects actually present in the image, are clearly prone to be hallucinated by LVLMs. Moreover, we find that existing evaluation methods might be affected by the input instructions and the generation styles of LVLMs. We therefore design an improved evaluation method for object hallucination: a polling-based query method called POPE (Polling-based Object Probing Evaluation). Experimental results demonstrate that POPE evaluates object hallucination in a more stable and flexible way. Our code and data are publicly available at https://github.com/RUCAIBox/POPE.
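
To make the polling idea concrete, below is a minimal sketch of POPE-style yes/no polling in Python. The query_model wrapper, the question template, and the negative-to-positive sampling ratio k are illustrative assumptions rather than the paper's exact setup; see the repository above for the official implementation.

```python
import random

def query_model(image, question: str) -> str:
    """Hypothetical stand-in for the LVLM under test; should answer 'yes' or 'no'."""
    raise NotImplementedError

def build_polling_queries(gt_objects, candidate_objects, k=3, seed=0):
    """Pair each ground-truth object (expected answer 'yes') with up to k sampled
    absent objects (expected answer 'no'), mirroring POPE's balanced polling."""
    rng = random.Random(seed)
    absent = [o for o in candidate_objects if o not in set(gt_objects)]
    queries = [(f"Is there a {o} in the image?", "yes") for o in gt_objects]
    n_neg = min(k * len(gt_objects), len(absent))
    queries += [(f"Is there a {o} in the image?", "no")
                for o in rng.sample(absent, n_neg)]
    return queries

def poll_image(image, gt_objects, candidate_objects):
    """Poll the model on one image and return (accuracy, yes_rate) over all questions."""
    queries = build_polling_queries(gt_objects, candidate_objects)
    correct = yes = 0
    for question, label in queries:
        answer = query_model(image, question).strip().lower()
        correct += answer.startswith(label)
        yes += answer.startswith("yes")
    return correct / len(queries), yes / len(queries)
```

In this sketch, a yes-rate well above the share of "yes" questions signals that the model over-affirms object presence, which is the hallucination behavior polling-based evaluation is designed to expose. The negative objects can be drawn at random, from the most frequent objects, or from frequent co-occurrence partners, which corresponds to increasingly adversarial polling settings.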

Related research

06/15/2023
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
Large Vision-Language Models (LVLMs) have recently played a dominant rol...

08/31/2023
TouchStone: Evaluating Vision-Language Models by Language Models
Large vision-language models (LVLMs) have recently witnessed rapid advan...

08/07/2023
Tiny LVLM-eHub: Early Multimodal Experiments with Bard
Recent advancements in Large Vision-Language Models (LVLMs) have demonst...

09/23/2021
Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?
Large language models are known to suffer from the hallucination problem...

06/07/2021
Playing with words: Do people exploit loaded language to affect others' decisions for their own benefit?
In this article, we study whether people in the position of describing a...

05/18/2023
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Foundation models have made significant strides in various applications,...

07/15/2021
Multi-Task Learning based Online Dialogic Instruction Detection with Pre-trained Language Models
In this work, we study computational approaches to detect online dialogi...