Summarization from Leaderboards to Practice: Choosing A Representation Backbone and Ensuring Robustness

06/18/2023
by David Demeter, et al.

The academic literature gives little guidance on how to build the best possible customer-facing summarization system from existing research components. Here we present analyses to inform the selection of a system backbone from popular models; we find that in both automatic and human evaluation, BART performs better than PEGASUS and T5. We also find that summarizers perform considerably worse when applied cross-domain. At the same time, a system fine-tuned on heterogeneous domains performs well on all domains and will be most suitable for a broad-domain summarizer. Our work highlights the need for heterogeneous-domain summarization benchmarks. We find considerable variation in system output that can be captured only with human evaluation and is thus unlikely to be reflected in standard leaderboards that rely only on automatic evaluation.


