Exploring Domain Shift in Extractive Text Summarization

08/30/2019
by   Danqing Wang, et al.

Although domain shift has been well explored in many NLP applications, it has received little attention in extractive text summarization. As a result, models under-utilize the training data by ignoring differences in distribution across training sets, and they generalize poorly to unseen domains. With this limitation in mind, we first extend the conventional definition of domain from categories to data sources for the text summarization task. We then re-purpose a multi-domain summarization dataset and verify how the gap between different domains influences the performance of neural summarization models. Furthermore, we investigate four learning strategies and examine their ability to cope with the domain shift problem. Experimental results under three different settings reveal their distinct characteristics in our new testbed. Our source code, including BERT-based and meta-learning methods for multi-domain summarization learning, and the re-purposed dataset Multi-SUM will be available at our project page: <http://pfliu.com/TransferSum/>.
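One way to measure how the gap between domains affects a summarizer, as the abstract describes, is a leave-one-domain-out evaluation: train on all but one data source and test on the held-out one. The sketch below is illustrative only (the function name, toy corpus, and source names are hypothetical, not from the paper), but it shows the split protocol such an experiment would use when domains are defined by data source.

```python
from itertools import chain

def leave_one_domain_out(corpus):
    """Yield (train_docs, held_out_domain, test_docs) splits for a
    multi-domain corpus, holding each domain out in turn.
    `corpus` maps a domain name (here, a data source) to its documents."""
    for held_out in corpus:
        train = list(chain.from_iterable(
            docs for domain, docs in corpus.items() if domain != held_out))
        yield train, held_out, corpus[held_out]

# Hypothetical toy corpus keyed by data source, mirroring the paper's
# source-based notion of domain.
corpus = {
    "newswire": ["doc_a", "doc_b"],
    "forums":   ["doc_c"],
    "science":  ["doc_d", "doc_e"],
}

splits = list(leave_one_domain_out(corpus))
# One split per held-out source; each trains on every other source.
```

Comparing a model's score on the held-out domain against its in-domain score quantifies the generalization gap the paper studies.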


