Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization

05/26/2023
by Rongxin Zhu, et al.

A series of datasets and models have been proposed for summaries generated from well-formatted documents such as news articles. Dialogue summaries, however, remain underexplored. In this paper, we present DIASUMFACT, the first dataset with fine-grained factual error annotations for dialogue summaries. We define fine-grained factual error detection as a sentence-level multi-label classification problem, and we evaluate two state-of-the-art (SOTA) models on our dataset. Both models yield sub-optimal results, with a macro-averaged F1 score of around 0.25 over 6 error classes. We further propose ENDERANKER, an unsupervised model that detects errors via candidate ranking using pretrained encoder-decoder models. Our model performs on par with the SOTA models while requiring fewer resources. These observations confirm how challenging it is to detect factual errors in dialogue summaries and call for further study, for which our dataset and results offer a solid foundation.
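To make the evaluation setup concrete, the following is a minimal sketch of macro-averaged F1 for sentence-level multi-label classification, the metric named in the abstract. It is not the authors' evaluation code. The label names are hypothetical placeholders (the abstract states that there are 6 error classes but does not name them), the toy gold and predicted labels are invented for illustration, and scoring a class with no true positives as 0 is an assumed convention.

```python
from typing import List, Set

# Hypothetical placeholder names: the abstract mentions 6 error classes
# but does not list them.
LABELS = ["class_a", "class_b", "class_c", "class_d", "class_e", "class_f"]


def per_class_f1(gold: List[Set[str]], pred: List[Set[str]], label: str) -> float:
    """F1 for one class, treating each summary sentence as one example."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    if tp == 0:
        # Assumed convention: a class with no true positives scores 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(per_class_f1(gold, pred, label) for label in labels) / len(labels)


# Toy data: each set holds the error classes assigned to one summary sentence.
# An empty set means the sentence has no factual error.
gold = [{"class_a"}, {"class_b", "class_c"}, set(), {"class_a"}]
pred = [{"class_a"}, {"class_b"}, {"class_d"}, set()]

score = macro_f1(gold, pred, LABELS)
print(f"macro-F1 = {score:.3f}")
```

Because the average is unweighted, a model that handles one or two frequent classes well but misses the rest still receives a low score, which is one way a macro-averaged F1 of around 0.25 can arise.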


research
06/08/2023

Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

Factuality is important to dialogue summarization. Factual error correct...
research
05/23/2023

Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization

Existing factual consistency evaluation approaches for text summarizatio...
research
04/09/2021

Annotating and Modeling Fine-grained Factuality in Summarization

Recent pre-trained abstractive summarization systems have started to ach...
research
11/22/2022

AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies

Existing approaches built separate classifiers to detect nonsense in dia...
research
08/30/2021

CSDS: A Fine-Grained Chinese Dataset for Customer Service Dialogue Summarization

Dialogue summarization has drawn much attention recently. Especially in ...
research
09/22/2021

FCM: A Fine-grained Comparison Model for Multi-turn Dialogue Reasoning

Despite the success of neural dialogue systems in achieving high perform...
research
06/30/2023

A New Task and Dataset on Detecting Attacks on Human Rights Defenders

The ability to conduct retrospective analyses of attacks on human rights...
