BillSum: A Corpus for Automatic Summarization of US Legislation

10/01/2019
by Anastassia Kornilova, et al.

Automatic summarization methods have been studied in a variety of domains, including news and scientific articles. Yet legislation has not previously been considered for this task, even though the US Congress and state governments release tens of thousands of bills every year. In this paper, we introduce BillSum, the first dataset for summarization of US Congressional and California state bills (https://github.com/FiscalNote/BillSum). We explain the properties of the dataset that make it more challenging to process than text from other domains. Then, we benchmark extractive methods that consider neural sentence representations and traditional contextual features. Finally, we demonstrate that models built on Congressional bills can be used to summarize California bills, thus showing that methods developed on this dataset can transfer to states without human-written summaries.


