CUED_speech at TREC 2020 Podcast Summarisation Track

by Potsawee Manakul, et al.

In this paper, we describe our approach to the Podcast Summarisation challenge in TREC 2020. Given a podcast episode and its transcription, the goal is to generate a summary that captures the most important information in the content. Our approach consists of two steps: (1) filtering redundant or less informative sentences from the transcription using the attention of a hierarchical model; (2) applying a state-of-the-art text summarisation system (BART), fine-tuned on the Podcast data using a sequence-level reward function. Furthermore, we build ensembles of three and nine models for our submission runs. As our baseline, we also fine-tune the BART model on the Podcast data. The human evaluation by NIST shows that our best submission achieves 1.777 on the EGFB scale, while the creator-provided description scores 1.291. Our system won the Spotify Podcast Summarisation Challenge in the TREC 2020 Podcast Track in both human and automatic evaluation.
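The filtering step (1) can be illustrated with a minimal sketch. It assumes each sentence already has an attention score from some hierarchical model and that the downstream summariser accepts a limited input length. The function name `filter_sentences`, the `max_tokens` budget, the whitespace token count, and the greedy selection rule are all hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def filter_sentences(
    sentences: Sequence[str],
    scores: Sequence[float],
    max_tokens: int,
) -> List[str]:
    """Keep the highest-scoring sentences that fit a token budget.

    `scores` stands in for sentence-level attention weights (assumed to be
    given). Tokens are approximated by whitespace splitting, which is an
    assumption; a real system would use the summariser's own tokenizer.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")

    # Rank sentence indices by score, highest first; ties go to the earlier sentence.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))

    kept = []
    used = 0
    for i in ranked:
        n_tokens = len(sentences[i].split())
        if used + n_tokens > max_tokens:
            continue  # skip sentences that would exceed the budget
        kept.append(i)
        used += n_tokens

    # Restore the original transcript order so the filtered text stays readable.
    return [sentences[i] for i in sorted(kept)]


transcript = [
    "Welcome back to the show.",
    "Today we discuss podcast summarisation.",
    "Please like and subscribe.",
    "The guest explains attention models.",
]
attention = [0.10, 0.50, 0.05, 0.35]  # made-up scores for illustration
filtered = filter_sentences(transcript, attention, max_tokens=10)
```

In this toy example the two highest-scoring sentences use up the ten-token budget, so the greeting and the call to subscribe are dropped. The filtered text would then be passed to the summarisation model in step (2).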




