BASS: Block-wise Adaptation for Speech Summarization

07/17/2023
by Roshan Sharma, et al.

End-to-end speech summarization has been shown to outperform cascade baselines. However, such models are difficult to train on very long inputs (dozens of minutes or hours) owing to compute restrictions, and are hence trained on truncated inputs. Truncation leads to poorer models; one solution is block-wise modeling, i.e., processing a portion of the input frames at a time. In this paper, we develop a method that allows one to train summarization models on very long sequences incrementally. Speech summarization is realized as a streaming process, in which hypothesis summaries are updated after every block based on new acoustic information. We devise and test strategies to pass semantic context across blocks. Experiments on the How2 dataset demonstrate that the proposed block-wise training method improves ROUGE-L by 3 points absolute over a truncated-input baseline.
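The block-wise idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: `split_into_blocks`, `toy_block_encoder`, and `summarize_streaming` are hypothetical names, and the "encoder" is a simple mean-pooling stand-in for a neural network. It only shows the control flow, assuming non-overlapping blocks and a single context vector carried from one block to the next.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]    # one acoustic feature vector
Context = List[float]  # state carried across blocks


def split_into_blocks(frames: Sequence[Frame], block_size: int) -> List[Sequence[Frame]]:
    """Split a long frame sequence into consecutive, non-overlapping blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [frames[i:i + block_size] for i in range(0, len(frames), block_size)]


def toy_block_encoder(block: Sequence[Frame], context: Optional[Context]) -> Context:
    """Hypothetical stand-in for a neural encoder (an assumption, not the paper's).

    Mean-pools the frames of the block, then averages the result with the
    context carried over from earlier blocks.
    """
    dim = len(block[0])
    pooled = [sum(frame[d] for frame in block) / len(block) for d in range(dim)]
    if context is None:
        return pooled
    return [(c + p) / 2 for c, p in zip(context, pooled)]


def summarize_streaming(
    frames: Sequence[Frame],
    block_size: int,
    encoder: Callable[[Sequence[Frame], Optional[Context]], Context] = toy_block_encoder,
) -> List[Context]:
    """Process the input one block at a time, updating a running hypothesis.

    Only one block is held by the encoder at a time; the context is the
    only information passed between blocks.
    """
    context: Optional[Context] = None
    hypotheses: List[Context] = []
    for block in split_into_blocks(frames, block_size):
        context = encoder(block, context)
        hypotheses.append(list(context))  # hypothesis after this block
    return hypotheses


if __name__ == "__main__":
    frames = [[1.0], [3.0], [5.0], [7.0], [9.0]]
    print(summarize_streaming(frames, block_size=2))
```

In a real system the running hypothesis would be a text summary produced by a decoder rather than a pooled vector; the sketch keeps the state numeric so the per-block update is easy to follow.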


research
10/12/2021

Speech Summarization using Restricted Self-Attention

Speech summarization is typically performed by using a cascade of speech...
research
06/06/2023

Towards End-to-end Speech-to-text Summarization

Speech-to-text (S2T) summarization is a time-saving technique for filter...
research
08/08/2022

Investigating Efficiently Extending Transformers for Long Input Summarization

While large pretrained Transformer models have proven highly capable at ...
research
05/19/2019

Structured Summarization of Academic Publications

We propose SUSIE, a novel summarization method that can work with state-...
research
06/30/2023

SummQA at MEDIQA-Chat 2023: In-Context Learning with GPT-4 for Medical Summarization

Medical dialogue summarization is challenging due to the unstructured na...
research
04/05/2021

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

In the English speech-to-text (STT) machine learning task, acoustic mode...
research
11/14/2020

DebateSum: A large-scale argument mining and summarization dataset

Prior work in Argument Mining frequently alludes to its potential applic...
