MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition

07/25/2020
by   Iustina Andronic, et al.
0

Audio Adversarial Examples (AAE) represent specially created inputs meant to trick Automatic Speech Recognition (ASR) systems into misclassification. The present work proposes MP3 compression as a means to decrease the impact of Adversarial Noise (AN) in audio samples transcribed by ASR systems. To this end, we generated AAEs with the Fast Gradient Sign Method for an end-to-end, hybrid CTC-attention ASR system. Our method is then validated by two objective indicators: (1) Character Error Rates (CER) that measure the speech decoding performance of four ASR models trained on uncompressed, as well as MP3-compressed data sets and (2) Signal-to-Noise Ratio (SNR) estimated for both uncompressed and MP3-compressed AAEs that are reconstructed in the time domain by feature inversion. We found that MP3 compression applied to AAEs indeed reduces the CER when compared to uncompressed AAEs. Moreover, feature-inverted (reconstructed) AAEs had significantly higher SNRs after MP3 compression, indicating that AN was reduced. In contrast to AN, MP3 compression applied to utterances augmented with regular noise resulted in more transcription errors, giving further evidence that MP3 encoding is effective in diminishing only AN.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/21/2020

Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition

Recent advances in Automatic Speech Recognition (ASR) demonstrated how e...
research
08/06/2020

Iterative Compression of End-to-End ASR Model using AutoML

Increasing demand for on-device Automatic Speech Recognition (ASR) syste...
research
02/17/2022

Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition

In this work, we aim to enhance the system robustness of end-to-end auto...
research
07/08/2019

ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

End-to-end automatic speech recognition (ASR) models are increasingly la...
research
10/26/2022

There is more than one kind of robustness: Fooling Whisper with adversarial examples

Whisper is a recent Automatic Speech Recognition (ASR) model displaying ...
research
05/12/2020

Automatic Estimation of Inteligibility Measure for Consonants in Speech

In this article, we provide a model to estimate a real-valued measure of...
research
07/22/2022

ASR Error Detection via Audio-Transcript entailment

Despite improved performances of the latest Automatic Speech Recognition...

Please sign up or login with your details

Forgot password? Click here to reset