
Towards Abstractive Grounded Summarization of Podcast Transcripts

by Kaiqiang Song, et al.

Podcasts have recently shown a rapid rise in popularity. Summarization of podcast transcripts is of practical benefit to both content providers and consumers: it helps consumers quickly decide whether to listen to a podcast, and it reduces the cognitive load on content providers who would otherwise write the summaries themselves. Nevertheless, podcast summarization faces significant challenges, including factual inconsistencies between summaries and their inputs. The problem is exacerbated by speech disfluencies and recognition errors in transcripts of spoken language. In this paper, we explore a novel abstractive summarization method to alleviate these challenges. Specifically, our approach learns to produce an abstractive summary while grounding summary segments in specific portions of the transcript, so that summary details can be fully inspected. We conduct a series of analyses of the proposed approach on a large podcast dataset and show that it achieves promising results. Grounded summaries bring clear benefits in locating the summary and transcript segments that contain inconsistent information, and hence significantly improve summarization quality in both automatic and human evaluation.
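To make the idea of a grounded summary concrete, here is a minimal Python sketch. It is not the paper's model: the `GroundedSegment` type, the `chunk_index` field, the word-overlap `support_score`, and the 0.5 threshold are all hypothetical choices made for illustration. The sketch only shows why linking each summary segment to a transcript portion makes inconsistent segments easier to locate.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class GroundedSegment:
    """A summary segment plus the transcript chunk it points to (hypothetical type)."""
    text: str
    chunk_index: int  # assumed: index into a list of transcript chunks


def tokens(text: str) -> Set[str]:
    """Lowercased word types; a crude stand-in for real tokenization."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def support_score(segment: GroundedSegment, chunks: List[str]) -> float:
    """Fraction of the segment's word types that also occur in its grounding chunk."""
    seg = tokens(segment.text)
    if not seg:
        return 0.0
    return len(seg & tokens(chunks[segment.chunk_index])) / len(seg)


def flag_for_inspection(summary: List[GroundedSegment], chunks: List[str],
                        threshold: float = 0.5) -> List[GroundedSegment]:
    """Return segments whose grounding chunk supports them poorly (assumed heuristic)."""
    return [s for s in summary if support_score(s, chunks) < threshold]


# Toy transcript chunks and a two-segment summary (invented example data).
chunks = [
    "welcome back to the show today we talk about training for a first marathon",
    "our guest explains how she fixed her sleep schedule",
]
summary = [
    GroundedSegment("The hosts talk about training for a marathon.", 0),
    GroundedSegment("The guest wins an Olympic medal.", 1),
]

for seg in flag_for_inspection(summary, chunks):
    print(f"check chunk {seg.chunk_index}: {seg.text}")
```

Because every segment carries a pointer into the transcript, a reader or a checker only has to compare a segment against one chunk instead of the whole episode. A real system would replace the word-overlap heuristic with a learned consistency measure.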



Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

Practical applications of abstractive summarization models are limited b...

Automatic Summarization of Open-Domain Podcast Episodes

We present implementation details of our abstractive summarizers that ac...

Ontology-Aware Clinical Abstractive Summarization

Automatically generating accurate summaries from clinical reports could ...

Automatic Text Summarization Methods: A Comprehensive Review

One of the most pressing issues that have arisen due to the rapid growth...

A Divide-and-Conquer Approach to the Summarization of Academic Articles

We present a novel divide-and-conquer method for the summarization of lo...

Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary

Recently, there has been growing interest in using question-answering (Q...

Storyboard: Optimizing Precomputed Summaries for Aggregation

An emerging class of data systems partition their data and precompute ap...

Code Repositories


(ACL 2022) The source code for the paper "Towards Abstractive Grounded Summarization of Podcast Transcripts"
