Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition

08/17/2022
by   Goutham Rajendran, et al.

We investigate the robustness properties of pre-trained neural models for automatic speech recognition. Real-life data in machine learning is usually noisy and almost never clean, owing to various domain-dependent factors such as outliers, random noise, and adversarial noise. The models we develop should therefore be robust to such noisy data, a need that gave rise to the thriving field of robust machine learning. We study this issue in the setting of automatic speech recognition. With the increasing popularity of pre-trained models, analyzing and understanding their robustness to noise is an important question. In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT, and DistilHuBERT on the LibriSpeech and TIMIT datasets. We apply different kinds of noising mechanisms and measure model performance in terms of inference time and the standard Word Error Rate (WER) metric. We also carry out an in-depth layer-wise analysis of wav2vec2 by injecting noise between layers, which lets us predict at a high level what each layer learns. Finally, for this model, we visualize how errors propagate across the layers and compare its behavior on clean versus noisy data. Our experiments confirm the predictions of Pasad et al. [2021] and raise interesting directions for future work.
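To make the evaluation setup concrete, here is a minimal, self-contained sketch of two ingredients the abstract mentions: an additive Gaussian noising mechanism at a target signal-to-noise ratio, and the standard Word Error Rate metric. This is an illustration, not the paper's actual code; the function names, the fixed seed, and the 440 Hz test tone in the usage example are assumptions for demonstration.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Add white Gaussian noise to a waveform at a target SNR in dB.

    This is one simple noising mechanism; the paper may use others
    (e.g. real environmental noise or adversarial perturbations).
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility (illustrative)
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = p_signal / (10 ** (snr_db / 10.0))
    sigma = math.sqrt(p_noise)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def wer(reference, hypothesis):
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


# Usage: corrupt a pure tone at 10 dB SNR, then score a hypothetical transcript.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_noise_at_snr(tone, snr_db=10.0)
error_rate = wer("the cat sat on the mat", "the cat sat mat")  # 2 deletions / 6 words
```

In practice one would decode the noisy audio with the pre-trained model (wav2vec2, HuBERT, or DistilHuBERT) and compare the WER on clean versus noised inputs across a range of SNR levels.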

