HydraSum – Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

10/08/2021
by Tanya Goyal, et al.

Existing abstractive summarization models lack explicit control mechanisms that would allow users to influence the stylistic features of the model outputs. As a result, they generate generic summaries that do not cater to users' needs or preferences. To address this issue, we introduce HydraSum, a new summarization architecture that extends the single-decoder framework of current models, e.g. BART, to a mixture-of-experts version consisting of multiple decoders. Our proposed model encourages each expert, i.e. decoder, to learn and generate stylistically distinct summaries along dimensions such as abstractiveness, length, and specificity. At each time step, HydraSum employs a gating mechanism that decides the contribution of each individual decoder to the next token's output probability distribution. Through experiments on three summarization datasets (CNN, Newsroom, XSum), we demonstrate that this gating mechanism automatically learns to assign contrasting summary styles to different HydraSum decoders under the standard training objective, without the need for additional supervision. We further show that a guided version of the training process can explicitly govern which summary style is partitioned between decoders, e.g. high abstractiveness vs. low abstractiveness or high specificity vs. low specificity, and can also increase the stylistic difference between individual decoders. Finally, our experiments demonstrate that our decoder framework is highly flexible: during inference, we can sample from individual decoders or from mixtures of different subsets of the decoders to yield a diverse set of summaries and enforce single- and multi-style control over summary generation.
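
The gating step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the gate produces one non-negative weight per decoder (summing to 1) and that the decoders are combined by a weighted sum of their next-token probability distributions. The names `softmax` and `mix_next_token` and the toy logits are hypothetical, chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mix_next_token(decoder_logits, gate_weights):
    """Combine per-decoder next-token distributions with gate weights.

    decoder_logits: K lists of vocabulary logits, one list per decoder.
    gate_weights:   K non-negative weights that sum to 1 (assumed form of the gate).
    Returns a single probability distribution over the vocabulary.
    """
    per_decoder = [softmax(logits) for logits in decoder_logits]
    vocab_size = len(per_decoder[0])
    return [
        sum(g * probs[v] for g, probs in zip(gate_weights, per_decoder))
        for v in range(vocab_size)
    ]

# Toy example: two decoders over a three-token vocabulary.
logits = [[2.0, 0.0, 0.0],   # decoder 0 prefers token 0
          [0.0, 0.0, 2.0]]   # decoder 1 prefers token 2

# A gate computed from (hypothetical) gate scores, here an even split.
learned_gate = softmax([0.0, 0.0])
mixed = mix_next_token(logits, learned_gate)

# At inference, the gate can be set by hand to use a single decoder.
only_first = mix_next_token(logits, [1.0, 0.0])
```

Setting the gate by hand, as in `only_first`, corresponds to the inference-time control mentioned in the abstract: sampling from one decoder or from a chosen mixture of a subset of decoders.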

research
04/05/2021

Inference Time Style Control for Summarization

How to generate summaries of different styles without requiring corpora ...
research
12/08/2020

CTRLsum: Towards Generic Controllable Text Summarization

Current summarization systems yield generic summaries that are disconnec...
research
10/15/2020

GSum: A General Framework for Guided Neural Abstractive Summarization

Neural abstractive summarization models are flexible and can produce coh...
research
11/14/2017

Controllable Abstractive Summarization

Current models for document summarization ignore user preferences such a...
research
11/03/2022

Latent Prompt Tuning for Text Summarization

Prompts with different control signals (e.g., length, keywords, etc.) ca...
research
04/05/2021

A New Approach to Overgenerating and Scoring Abstractive Summaries

We propose a new approach to generate multiple variants of the target su...
research
05/27/2022

Guided Exploration of Data Summaries

Data summarization is the process of producing interpretable and represe...
