Exploring the Challenges of Open Domain Multi-Document Summarization

12/20/2022
by John Giorgi, et al.

Multi-document summarization (MDS) has traditionally been studied assuming a set of ground-truth topic-related input documents is provided. In practice, the input document set is unlikely to be available a priori and would need to be retrieved based on an information need, a setting we call open-domain MDS. We experiment with current state-of-the-art retrieval and summarization models on several popular MDS datasets extended to the open-domain setting. We find that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity to retrieval errors. To further probe these findings, we conduct perturbation experiments on summarizer inputs to study the impact of different types of document retrieval errors. Based on our results, we provide practical guidelines to help facilitate a shift to open-domain MDS. We release our code and experimental results alongside all data or model artifacts created during our investigation.
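To make the open-domain setting concrete, the following is a minimal sketch of a retrieve-then-summarize pipeline with one kind of input perturbation. It is an illustration under stated assumptions, not the method or code released with the paper: the token-overlap retriever, the first-sentence summarizer, and every function name (`score`, `retrieve`, `perturb_by_replacement`, `summarize`) are hypothetical stand-ins for the retrieval and summarization models the abstract refers to.

```python
import random
from collections import Counter


def score(query: str, doc: str) -> int:
    """Toy relevance score (assumption): count of shared whitespace tokens."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())


def retrieve(query: str, corpus: list[str], k: int) -> list[str]:
    """Return the k highest-scoring documents for the query."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]


def perturb_by_replacement(docs: list[str], corpus: list[str], n: int,
                           seed: int = 0) -> list[str]:
    """Simulate retrieval errors by swapping n input documents for
    random corpus documents that are not already in the input set."""
    rng = random.Random(seed)
    distractors = [d for d in corpus if d not in docs]
    n = min(n, len(docs), len(distractors))
    kept = rng.sample(docs, len(docs) - n)
    return kept + rng.sample(distractors, n)


def summarize(docs: list[str]) -> str:
    """Placeholder summarizer (assumption): first sentence of each document."""
    return " ".join(d.split(".")[0].strip() + "." for d in docs)


corpus = [
    "Floods hit the coastal city on Monday. Thousands were evacuated.",
    "The coastal city began flood repairs. Costs are rising.",
    "A new phone was released. Reviews are mixed.",
    "The football final ended in a draw. Fans were stunned.",
]

# Open-domain setting: the input set is retrieved, not given.
retrieved = retrieve("coastal city floods", corpus, k=2)
clean_summary = summarize(retrieved)

# Perturbation: replace one retrieved document with an off-topic one.
perturbed = perturb_by_replacement(retrieved, corpus, n=1)
perturbed_summary = summarize(perturbed)
```

Comparing `clean_summary` with `perturbed_summary` under a summary-quality metric is one simple way to observe how sensitive a summarizer is to a given type of retrieval error.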


