FigCaps-HF: A Figure-to-Caption Generative Framework and Benchmark with Human Feedback

07/20/2023
by   Ashish Singh, et al.

Captions are crucial for understanding scientific visualizations and documents. Existing captioning methods for scientific figures are trained on figure-caption pairs extracted from documents, many of which fall short with respect to metrics like helpfulness, explainability, and visual-descriptiveness [15], leading to generated captions that are misaligned with reader preferences. To enable the generation of high-quality figure captions, we introduce FigCaps-HF, a new framework for figure-caption generation that can incorporate domain expert feedback in generating captions optimized for reader preferences. Our framework comprises 1) an automatic method for evaluating the quality of figure-caption pairs, and 2) a novel reinforcement learning with human feedback (RLHF) method to optimize a generative figure-to-caption model for reader preferences. We demonstrate the effectiveness of our simple learning framework by improving performance over standard fine-tuning across different types of models. In particular, when using BLIP as the base model, our RLHF framework achieves a mean gain of 35.7 Meteor, respectively. Finally, we release a large-scale benchmark dataset with human feedback on figure-caption pairs to enable further evaluation and development of RLHF techniques for this problem.
