StreamHover: Livestream Transcript Summarization and Annotation

09/11/2021
by   Sangwoo Cho, et al.

With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge. However, the problem is nontrivial due to the informal nature of spoken language. Further, there has been a shortage of the annotated datasets that are necessary for transcript summarization. In this paper, we present StreamHover, a framework for annotating and summarizing livestream transcripts. With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than existing annotated corpora. We explore a neural extractive summarization model that leverages a vector-quantized variational autoencoder to learn latent vector representations of spoken utterances and identify salient utterances from the transcripts to form summaries. We show that our model generalizes better and improves performance over strong baselines. The results of this study provide an avenue for future research to improve summarization solutions for efficient browsing of livestreams.
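To make the vector-quantization idea concrete, the following is a minimal sketch of the generic nearest-codebook lookup that a vector-quantized autoencoder relies on. It is not the authors' model: the function names, the two-dimensional toy vectors, and the hand-written codebook are all assumptions made for illustration, and a real system would learn both the utterance representations and the codebook.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """For each vector, return the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]


# Hypothetical toy data: a 3-entry codebook and four "utterance" vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
utterance_vectors = [[0.1, -0.2], [0.9, 1.2], [3.5, 4.1], [1.1, 0.8]]

codes = quantize(utterance_vectors, codebook)
print(codes)  # [0, 1, 2, 1]
```

Each utterance vector is replaced by a discrete code index, so utterances that share a code (here the second and fourth) are treated as similar in the latent space. How such codes are then used to pick salient utterances is left to the full paper.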

research
06/10/2021

VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization

Video transcript summarization is a fundamental task for video understan...
research
05/27/2023

MeetingBank: A Benchmark Dataset for Meeting Summarization

As the number of recorded meetings increases, it becomes increasingly im...
research
10/12/2018

IndoSum: A New Benchmark Dataset for Indonesian Text Summarization

Automatic text summarization is generally considered as a challenging ta...
research
11/02/2020

How Domain Terminology Affects Meeting Summarization Performance

Meetings are essential to modern organizations. Numerous meetings are he...
research
02/05/2019

Abstractive Summarization of Spoken and Written Conversation

Nowadays, lots of information is available in form of dialogues. We prop...
research
06/25/2016

Summarizing Decisions in Spoken Meetings

This paper addresses the problem of summarizing decisions in spoken meet...
research
08/21/2020

Abstractive Summarization of Spoken and Written Instructions with BERT

Summarization of speech is a difficult problem due to the spontaneity of...
