Large scale evaluation of importance maps in automatic speech recognition

05/21/2020
by Viet Anh Trinh, et al.

In this paper, we propose a metric that we call the structured saliency benchmark (SSBM) to evaluate importance maps computed for automatic speech recognizers on individual utterances. These maps indicate the time-frequency points of an utterance that are most important for correct recognition of a target word. Our evaluation technique is suitable not only for standard classification tasks but also for structured prediction tasks such as those performed by sequence-to-sequence models. Additionally, we use this approach to perform a large-scale comparison of two kinds of importance maps: those created by our previously introduced technique, which uses "bubble noise" to identify important points through correlation, and those created by a baseline approach based on smoothed speech energy and forced alignment. Our results show that, on 100 sentences from the AMI corpus, the bubble analysis approach is better than this baseline at identifying important speech regions.
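
The abstract says the bubble-noise technique identifies important points "through correlation" but gives no implementation details. As a rough illustration only, and not the authors' code, the sketch below assumes each noisy mixture comes with a time-frequency mask marking where speech is audible and a 0/1 label for whether the target word was recognized. It correlates the two at every time-frequency point. The names `correlation_importance_map`, `masks`, and `correct` are hypothetical.

```python
import numpy as np


def correlation_importance_map(masks, correct):
    """Per-point Pearson correlation between audibility and correctness.

    Assumed (hypothetical) inputs, not taken from the paper:
      masks   -- array of shape (n_mixtures, n_freq, n_time); larger values
                 mean the speech is more audible at that time-frequency point
      correct -- array of shape (n_mixtures,), 1 if the target word was
                 recognized in that mixture, 0 otherwise
    Returns an (n_freq, n_time) map with values in [-1, 1].
    """
    masks = np.asarray(masks, dtype=float)
    correct = np.asarray(correct, dtype=float)
    if masks.ndim != 3 or correct.shape != (masks.shape[0],):
        raise ValueError("expected masks (n, F, T) and correct (n,)")

    # Center both variables across the mixtures.
    m_centered = masks - masks.mean(axis=0)
    c_centered = correct - correct.mean()

    # Population covariance at each time-frequency point.
    cov = np.tensordot(c_centered, m_centered, axes=(0, 0)) / correct.size

    # Normalize by the standard deviations; leave points with no
    # variation at zero instead of dividing by zero.
    denom = masks.std(axis=0) * correct.std()
    importance = np.zeros_like(cov)
    np.divide(cov, denom, out=importance, where=denom > 0)
    return importance


if __name__ == "__main__":
    # Synthetic check: recognition succeeds exactly when point (2, 3)
    # is audible, so that point should receive the highest importance.
    rng = np.random.default_rng(0)
    masks = rng.integers(0, 2, size=(400, 4, 5))
    correct = masks[:, 2, 3]
    importance = correlation_importance_map(masks, correct)
    print(np.unravel_index(np.argmax(importance), importance.shape))
```

In this toy setup, points whose audibility does not affect recognition get correlations near zero. A real analysis would also need the spectrogram, mask generation, and recognizer scoring, which the abstract does not specify.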

research
11/08/2015

Towards Structured Deep Neural Network for Automatic Speech Recognition

In this paper we propose the Structured Deep Neural Network (structured ...
research
11/13/2018

Corpus Phonetics Tutorial

Corpus phonetics has become an increasingly popular method of research i...
research
10/14/2022

Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

The NURC Project that started in 1969 to study the cultured linguistic u...
research
06/02/2023

Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation

Punctuated text prediction is crucial for automatic speech recognition a...
research
04/20/2020

ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

Automatic speech recognition (ASR) via call is essential for various app...
research
02/12/2020

Attentional Speech Recognition Models Misbehave on Out-of-domain Utterances

We discuss the problem of echographic transcription in autoregressive se...
research
07/29/2018

Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI

Vocal tract configurations play a vital role in generating distinguishab...
