ChartSumm: A Comprehensive Benchmark for Automatic Chart Summarization of Long and Short Summaries

04/26/2023
by Raian Rahman, et al.

Automatic chart-to-text summarization is an effective tool for visually impaired people and also gives users precise natural-language insights into tabular data. A large, well-structured dataset is a key component of data-driven models. In this paper, we propose ChartSumm, a large-scale benchmark dataset consisting of 84,363 charts along with their metadata and descriptions, covering a wide range of topics and chart types, for generating short and long summaries. Extensive experiments with strong baseline models show that, even though these models generate fluent and informative summaries and achieve decent scores on various automatic evaluation metrics, they often suffer from hallucination, miss important data points, and explain complex trends in the charts incorrectly. We also investigate the potential of expanding ChartSumm to other languages using automated translation tools. Together, these findings make our dataset a challenging benchmark for future research.

research
03/12/2022

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

Charts are commonly used for exploring data and communicating insights. ...
research
10/24/2018

A Multilingual Study of Compressive Cross-Language Text Summarization

Cross-Language Text Summarization (CLTS) generates summaries in a langua...
research
10/12/2018

IndoSum: A New Benchmark Dataset for Indonesian Text Summarization

Automatic text summarization is generally considered as a challenging ta...
research
10/24/2022

LANS: Large-scale Arabic News Summarization Corpus

Text summarization has been intensively studied in many languages, and s...
research
05/23/2022

SQuALITY: Building a Long-Document Summarization Dataset the Hard Way

Summarization datasets are often assembled either by scraping naturally ...
research
10/22/2022

ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts

Despite tremendous progress in automatic summarization, state-of-the-art...
research
11/14/2020

DebateSum: A large-scale argument mining and summarization dataset

Prior work in Argument Mining frequently alludes to its potential applic...
