Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation

06/22/2023
by Ran Zhang, et al.

While summarization has been extensively researched in natural language processing (NLP), cross-lingual cross-temporal summarization (CLCTS) is a largely unexplored area that has the potential to improve cross-cultural accessibility, information sharing, and understanding. This paper comprehensively addresses the CLCTS task, including dataset creation, modeling, and evaluation. We build the first CLCTS corpus, leveraging historical fictive texts and Wikipedia summaries in English and German, and examine the effectiveness of popular end-to-end transformer models with different intermediate finetuning tasks. Additionally, we explore the potential of ChatGPT for CLCTS as both a summarizer and an evaluator. Overall, we report evaluations from humans, ChatGPT, and several recent automatic evaluation metrics. We find that our intermediate-task finetuned end-to-end models generate summaries of bad to moderate quality. ChatGPT as a summarizer (without any finetuning) provides outputs of moderate to good quality; as an evaluator, it correlates moderately with human evaluations, though it is prone to giving lower scores. ChatGPT also seems to be very adept at normalizing historical text. Finally, we test ChatGPT in a scenario with adversarially attacked and unseen source documents and find that it copes better with omission and entity swap than with negation that goes against its prior knowledge.

research
02/19/2022

Models and Datasets for Cross-Lingual Summarisation

We present a cross-lingual summarisation corpus with long documents in a...
research
04/04/2023

SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism

Cross-lingual science journalism generates popular science stories of sc...
research
10/01/2019

Global Voices: Crossing Borders in Automatic News Summarization

We construct Global Voices, a multilingual dataset for evaluating cross-...
research
12/08/2020

Cross-lingual Approach to Abstractive Summarization

Automatic text summarization extracts important information from texts a...
research
05/30/2022

X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents

The number of scientific publications nowadays is rapidly increasing, ca...
research
12/01/2022

Long-Document Cross-Lingual Summarization

Cross-Lingual Summarization (CLS) aims at generating summaries in one la...
research
02/11/2022

ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization

We present ClidSum, a benchmark dataset for building cross-lingual summa...
