An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features

06/01/2020
by   Shi-Yan Weng, et al.
0

Tremendous amounts of multimedia associated with speech information are driving an urgent need to develop efficient and effective automatic summarization methods. To this end, we have seen rapid progress in applying supervised deep neural network-based methods to extractive speech summarization. More recently, the Bidirectional Encoder Representations from Transformers (BERT) model was proposed and has achieved record-breaking success on many natural language processing (NLP) tasks such as question answering and language understanding. In view of this, we in this paper contextualize and enhance the state-of-the-art BERT-based model for speech summarization, while its contributions are at least three-fold. First, we explore the incorporation of confidence scores into sentence representations to see if such an attempt could help alleviate the negative effects caused by imperfect automatic speech recognition (ASR). Secondly, we also augment the sentence embeddings obtained from BERT with extra structural and linguistic features, such as sentence position and inverse document frequency (IDF) statistics. Finally, we validate the effectiveness of our proposed method on a benchmark dataset, in comparison to several classic and celebrated speech summarization methods.

READ FULL TEXT
research
04/11/2021

Innovative Bert-based Reranking Language Models for Speech Recognition

More recently, Bidirectional Encoder Representations from Transformers (...
research
10/25/2019

L2RS: A Learning-to-Rescore Mechanism for Automatic Speech Recognition

Modern Automatic Speech Recognition (ASR) systems primarily rely on scor...
research
11/16/2021

Attention-based Multi-hypothesis Fusion for Speech Summarization

Speech summarization, which generates a text summary from speech, can be...
research
07/29/2021

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition

Language models (LMs) pre-trained on massive amounts of text, in particu...
research
07/16/2020

Translate Reverberated Speech to Anechoic Ones: Speech Dereverberation with BERT

Single channel speech dereverberation is considered in this work. Inspir...
research
05/25/2020

An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering

In a spoken multiple-choice question answering (SMCQA) task, given a pas...
research
11/22/2016

Learning to Distill: The Essence Vector Modeling Framework

In the context of natural language processing, representation learning h...

Please sign up or login with your details

Forgot password? Click here to reset