Evaluation and Analysis of Hallucination in Large Vision-Language Models

08/29/2023
by   Junyang Wang, et al.

Large Vision-Language Models (LVLMs) have recently achieved remarkable success. However, LVLMs still suffer from hallucination, which limits their practicality in many scenarios. Hallucination refers to information in an LVLM's response that does not exist in the visual input, and it can lead to substantial consequences. There has been limited work on evaluating hallucination in LVLMs. In this paper, we propose Hallucination Evaluation based on Large Language Models (HaELM), an LLM-based hallucination evaluation framework. HaELM achieves approximately 95% of ChatGPT's performance and has additional advantages, including low cost, reproducibility, privacy preservation, and local deployment. Using HaELM, we evaluate hallucination in current LVLMs. Furthermore, we analyze the factors contributing to hallucination in LVLMs and offer suggestions to mitigate the problem. Our training data and human-annotated hallucination data will be made public soon.


