VCSUM: A Versatile Chinese Meeting Summarization Dataset

05/09/2023
by Han Wu, et al.

Compared to news and chat summarization, progress in meeting summarization has been considerably slowed by limited data. To this end, we introduce a versatile Chinese meeting summarization dataset, dubbed VCSum, consisting of 239 real-life meetings with a total duration of over 230 hours. We call the dataset versatile because it provides annotations of topic segmentation, headlines, segmentation summaries, overall meeting summaries, and salient sentences for each meeting transcript. As such, the dataset can support various summarization tasks and methods, including segmentation-based summarization, multi-granularity summarization, and retrieval-then-generate summarization. Our analysis confirms the effectiveness and robustness of VCSum. We also provide a set of benchmark models for different downstream summarization tasks on VCSum to facilitate further research. The dataset and code will be released at https://github.com/hahahawu/VCSum.
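To make the annotation layers concrete, the following is a minimal sketch of how one meeting with these five annotation types might be held in memory, and how the salient sentences could feed a retrieval-then-generate pipeline. Every class, field, and function name here (`Segment`, `Meeting`, `salient`, `retrieval_input`) is a hypothetical illustration and is not taken from the released VCSum files, whose actual schema may differ. The sample sentences are invented toy data, written in English for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One topic segment of a transcript (hypothetical layout, not the VCSum schema)."""
    sentences: List[str]                               # transcript sentences in this segment
    headline: str                                      # short headline for the segment
    summary: str                                       # segment-level summary
    salient: List[int] = field(default_factory=list)   # indices into `sentences`


@dataclass
class Meeting:
    """A meeting carrying the annotation types named in the abstract."""
    segments: List[Segment]    # the topic segmentation is the segment boundaries
    overall_summary: str       # summary of the whole meeting

    def transcript(self) -> List[str]:
        """Flatten all segments back into the full sentence list."""
        return [s for seg in self.segments for s in seg.sentences]

    def salient_sentences(self) -> List[str]:
        """Collect the annotated salient sentences in transcript order."""
        return [seg.sentences[i] for seg in self.segments for i in seg.salient]


def retrieval_input(meeting: Meeting, sep: str = " ") -> str:
    """Build a shortened generator input from the salient sentences only.

    In a retrieval-then-generate setup a retriever would predict these
    sentences; this sketch simply reads the gold annotations instead.
    """
    return sep.join(meeting.salient_sentences())


# Toy example (invented data, for illustration only).
meeting = Meeting(
    segments=[
        Segment(
            sentences=["Let us start with the budget.",
                       "Costs rose last quarter.",
                       "Any questions?"],
            headline="Budget",
            summary="Costs rose last quarter.",
            salient=[1],
        ),
        Segment(
            sentences=["Next is hiring.", "We will open two positions."],
            headline="Hiring",
            summary="Two positions will open.",
            salient=[1],
        ),
    ],
    overall_summary="Costs rose last quarter and two positions will open.",
)

print(retrieval_input(meeting))
```

Under this layout, segmentation-based summarization would pair each segment's `sentences` with its `summary`, while multi-granularity summarization could target `headline`, `summary`, and `overall_summary` as progressively longer outputs.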


