Reference and Document Aware Semantic Evaluation Methods for Korean Language Summarization

04/29/2020
by Dongyub Lee, et al.

Text summarization is the task of generating a shorter version of a source document while preserving its salient information. Many models for text summarization have been proposed recently, and most have been evaluated with recall-oriented understudy for gisting evaluation (ROUGE) scores. However, because ROUGE scores are computed from n-gram overlap, they do not capture the semantic correspondence between generated and reference summaries. This is especially problematic for Korean, an agglutinative language in which various morphemes combine into a single word that expresses several meanings, making ROUGE ill-suited for Korean summarization. In this paper, we propose the Reference and Document Aware Semantic Score (RDASS), an evaluation metric that reflects the semantic meaning of both the reference summary and the original document. We also propose a method for improving the correlation of the metric with human judgment. Evaluation results show that our metric correlates with human judgment significantly better than ROUGE scores.
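
The core idea is to score a generated summary against both the reference summary and the source document in embedding space, rather than by n-gram overlap. The following is a minimal sketch of that idea, assuming sentence embeddings come from a pretrained Sentence-BERT encoder and that the final score averages the two cosine similarities; the model name and function names here are illustrative, not the authors' released implementation.

```python
# Sketch of a reference- and document-aware semantic score (RDASS-style).
# Assumptions: embeddings from a pretrained Sentence-BERT encoder; the final
# score averages cosine similarity to the reference and to the source document.
# The model checkpoint below is illustrative, not the one used in the paper.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rdass_style_score(generated: str, reference: str, document: str) -> float:
    """Average of sim(generated, reference) and sim(generated, document)."""
    v_p, v_r, v_d = model.encode([generated, reference, document])
    return (cosine(v_p, v_r) + cosine(v_p, v_d)) / 2.0

# Toy usage: a semantically faithful summary scores high even with little
# word-level overlap, which ROUGE would penalize.
score = rdass_style_score(
    generated="The city will extend its subway line next year.",
    reference="A subway extension is planned for next year.",
    document="Officials announced that the metropolitan subway line will be extended next year to serve new districts.",
)
print(f"RDASS-style score: {score:.3f}")
```

Because the document embedding enters the score directly, a summary that is faithful to the source but phrased differently from the reference is not unduly penalized, which is the behavior the abstract argues ROUGE cannot provide for Korean.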
