Combining State-of-the-Art Models with Maximal Marginal Relevance for Few-Shot and Zero-Shot Multi-Document Summarization

11/19/2022
by David Adams, et al.

In Natural Language Processing, multi-document summarization (MDS) poses many challenges to researchers beyond those posed by single-document summarization (SDS). These challenges include a larger search space and a greater potential for including redundant information. While advances in deep learning have produced several language models capable of summarization, the variety of training data specific to MDS remains relatively limited. Therefore, MDS approaches that require little to no pretraining, known as few-shot or zero-shot applications, respectively, could be beneficial additions to the current set of summarization tools. To explore one possible approach, we devise a strategy for combining state-of-the-art models' outputs using maximal marginal relevance (MMR) with a focus on query relevance rather than document diversity. Our MMR-based approach shows improvement over some aspects of the current state-of-the-art results in both few-shot and zero-shot MDS applications while maintaining a state-of-the-art standard of output by all available metrics.
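
The abstract does not spell out how the model outputs are combined, so the following is only a minimal sketch of the standard MMR selection rule, not the authors' pipeline. It assumes candidate sentences (for example, sentences taken from several models' summaries) are scored with a simple bag-of-words cosine similarity; the names `bow`, `cosine`, `mmr_select`, and the weight `lam` are hypothetical and chosen for illustration. Each step picks the candidate that maximizes `lam * sim(candidate, query) - (1 - lam) * max sim(candidate, already selected)`.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words term counts (an assumed, deliberately simple representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(query: str, candidates: list[str], k: int = 2, lam: float = 0.7) -> list[str]:
    """Greedily pick up to k candidates by maximal marginal relevance.

    lam near 1.0 favors relevance to the query; lam near 0.0 favors
    diversity with respect to what has already been selected.
    """
    q = bow(query)
    vecs = [bow(c) for c in candidates]
    selected: list[int] = []
    remaining = list(range(len(candidates)))

    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(vecs[i], q)
            # Redundancy is the highest similarity to any already selected item.
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return [candidates[i] for i in selected]


if __name__ == "__main__":
    sentences = [
        "flood damage in the city",
        "flood damage in the city",  # exact duplicate from a second model
        "city officials report flood relief funding",
        "sports results",
    ]
    print(mmr_select("flood damage city", sentences, k=2, lam=0.5))
```

With a balanced `lam`, the duplicate sentence is penalized and a different, still relevant sentence is chosen second. Pushing `lam` toward 1.0 weights query relevance more heavily, which loosely mirrors the emphasis on query relevance over diversity described above, at the cost of admitting more redundancy.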

research
08/01/2022

Multi-Document Summarization with Centroid-Based Pretraining

In multi-document summarization (MDS), the input is a cluster of documen...
research
09/30/2020

Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning

While neural sequence learning methods have made significant progress in...
research
09/10/2023

Multi-document Summarization: A Comparative Evaluation

This paper is aimed at evaluating state-of-the-art models for Multi-docu...
research
05/08/2023

The Current State of Summarization

With the explosive growth of textual information, summarization systems ...
research
07/09/2022

Few-shot training LLMs for project-specific code-summarization

Very large language models (LLMs), such as GPT-3 and Codex have achieved...
research
12/20/2022

Exploring the Challenges of Open Domain Multi-Document Summarization

Multi-document summarization (MDS) has traditionally been studied assumi...
research
05/15/2023

A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization

Pre-trained language models (PLMs) have accomplished impressive achievem...
