SciCap: Generating Captions for Scientific Figures

10/22/2021
by Ting-Yao Hsu, et al.

Researchers use figures to communicate rich, complex information in scientific papers. The captions of these figures are critical to conveying effective messages. However, low-quality figure captions commonly occur in scientific articles and may decrease understanding. In this paper, we propose an end-to-end neural framework to automatically generate informative, high-quality captions for scientific figures. To this end, we introduce SCICAP, a large-scale figure-caption dataset based on computer science arXiv papers published between 2010 and 2020. After pre-processing (including figure-type classification, sub-figure identification, text normalization, and caption text selection), SCICAP contained more than two million figures extracted from over 290,000 papers. We then established baseline models that caption graph plots, the dominant (19.2%) figure type. The experimental results showed both opportunities and steep challenges of generating captions for scientific figures.

research
02/23/2023

Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization

Effective figure captions are crucial for clear comprehension of scienti...
research
10/12/2020

MedICaT: A Dataset of Medical Images, Captions, and Textual References

Understanding the relationship between figures and text is key to scient...
research
08/12/2019

Assessing the Quality of Scientific Papers

A multitude of factors are responsible for the overall quality of scient...
research
07/20/2023

FigCaps-HF: A Figure-to-Caption Generative Framework and Benchmark with Human Feedback

Captions are crucial for understanding scientific visualizations and doc...
research
02/01/2021

Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers

Numerical tables are widely used to present experimental results in scie...
research
04/06/2018

Extracting Scientific Figures with Distantly Supervised Neural Networks

Non-textual components such as charts, diagrams and tables provide key i...
research
09/19/2019

Look, Read and Enrich. Learning from Scientific Figures and their Captions

Compared to natural images, understanding scientific figures is particul...
