Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews

08/19/2019
by Michael Gref, et al.

In automatic speech recognition, often little training data is available for specific challenging tasks, but training of state-of-the-art automatic speech recognition systems requires large amounts of annotated speech. To address this issue, we propose a two-staged approach to acoustic modeling that combines noise and reverberation data augmentation with transfer learning to robustly address challenges such as difficult acoustic recording conditions, spontaneous speech, and speech of elderly people. We evaluate our approach using the example of German oral history interviews, where a relative average reduction of the word error rate by 19.3% is achieved.
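The noise and reverberation augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `add_reverb` and `mix_at_snr`, the synthetic sine "utterance", the random noise, the toy impulse response, and the 10 dB target are all assumptions chosen only to show the general idea of convolving clean speech with a room impulse response and then adding noise at a chosen signal-to-noise ratio.

```python
import numpy as np


def add_reverb(speech, impulse_response):
    """Convolve speech with a room impulse response, keeping the original length."""
    return np.convolve(speech, impulse_response, mode="full")[: len(speech)]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB."""
    # Tile or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose gain so that speech_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


# Hypothetical stand-in signals; real use would load recorded audio instead.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = np.sin(2 * np.pi * 220 * t)         # placeholder for a clean utterance
noise = rng.normal(size=sample_rate // 2)    # placeholder for a noise recording
# Toy impulse response: direct path followed by a decaying random tail.
tail = 0.3 * rng.normal(size=799) * np.exp(-np.arange(799) / 200)
impulse_response = np.concatenate(([1.0], tail))

reverberant = add_reverb(speech, impulse_response)
augmented = mix_at_snr(reverberant, noise, snr_db=10.0)
```

In a setup like the one the abstract describes, augmented copies of this kind would enlarge the training data for a first model, which is then adapted to the small target-domain set; how the two stages are actually configured is described in the full paper, not here.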

research
01/18/2022

Human and Automatic Speech Recognition Performance on German Oral History Interviews

Automatic speech recognition systems have accomplished remarkable improv...
research
06/06/2023

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain

Despite recent advancements in speech recognition, there are still diffi...
research
10/04/2021

Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems

Automatic speech recognition systems are part of people's daily lives, e...
research
06/20/2016

A Nonparametric Bayesian Approach for Spoken Term detection by Example Query

State of the art speech recognition systems use data-intensive context-d...
research
10/13/2020

Towards Data-efficient Modeling for Wake Word Spotting

Wake word (WW) spotting is challenging in far-field not only because of ...
research
05/08/2021

Robustness of end-to-end Automatic Speech Recognition Models – A Case Study using Mozilla DeepSpeech

When evaluating the performance of automatic speech recognition models, ...
research
10/22/2020

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

Is pushing numbers on a single benchmark valuable in automatic speech re...
