Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding

08/16/2018
by Lea Schönherr, et al.

Voice interfaces are becoming widely accepted as input methods for a diverse set of devices. This development is driven by rapid improvements in automatic speech recognition (ASR), which now performs on par with human listening in many tasks. These improvements are based on an ongoing evolution of deep neural networks (DNNs) as the computational core of ASR. However, recent research results show that DNNs are vulnerable to adversarial perturbations, which allow attackers to force the transcription into a malicious output. In this paper, we introduce a new type of adversarial example based on psychoacoustic hiding. Our attack exploits the characteristics of DNN-based ASR systems, where we extend the original analysis procedure with an additional backpropagation step. We use this backpropagation to learn the degrees of freedom for the adversarial perturbation of the input signal, i.e., we apply a psychoacoustic model and manipulate the acoustic signal below the thresholds of human perception. To further minimize the perceptibility of the perturbations, we use forced alignment to find the best-fitting temporal alignment between the original audio sample and the malicious target transcription. These extensions allow us to embed an arbitrary audio input with a malicious voice command that is then transcribed by the ASR system, while the audio signal remains barely distinguishable from the original. In an experimental evaluation, we attack the state-of-the-art speech recognition system Kaldi and determine the best-performing parameter and analysis setup for different types of input. Our results show that we are successful in up to 98% of cases, with a computational effort of fewer than two minutes for a ten-second audio file. Based on user studies, we found that none of our target transcriptions were audible to human listeners, who could still understand the original speech content with unchanged accuracy.
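
To make the masking constraint concrete, here is a minimal sketch of how a perturbation can be projected under per-bin magnitude limits derived from the original signal. Everything in it is an illustrative assumption, not the paper's setup: the STFT parameters, the flat 20 dB margin standing in for a real masking threshold, and the random vector standing in for a gradient step are placeholders. The paper instead derives genuine thresholds from a psychoacoustic hearing model and obtains the perturbation direction by backpropagating the target transcription's loss through Kaldi's DNN and feature extraction.

```python
# Sketch: project an adversarial perturbation below a crude "masking" limit.
# Not the authors' code; thresholds and parameters are illustrative only.
import numpy as np

N_FFT, HOP = 512, 256  # illustrative analysis parameters, not the paper's

def frames_fft(x):
    """Hann-windowed STFT of a 1-D signal (samples past the last full frame are dropped)."""
    win = np.hanning(N_FFT)
    n = 1 + (len(x) - N_FFT) // HOP
    f = np.stack([x[i * HOP:i * HOP + N_FFT] * win for i in range(n)])
    return np.fft.rfft(f, axis=1)

def overlap_add(spec, length):
    """Inverse of frames_fft via windowed overlap-add."""
    win = np.hanning(N_FFT)
    out, norm = np.zeros(length), np.zeros(length)
    for i, fr in enumerate(np.fft.irfft(spec, n=N_FFT, axis=1)):
        out[i * HOP:i * HOP + N_FFT] += fr * win
        norm[i * HOP:i * HOP + N_FFT] += win ** 2
    return out / np.maximum(norm, 1e-8)

def project_below_threshold(delta, original):
    """Scale each time-frequency bin of the perturbation so its magnitude stays
    below a stand-in threshold: 20 dB under the original signal's magnitude.
    (The paper computes real thresholds with a psychoacoustic hearing model.)"""
    d_spec = frames_fft(delta)
    limit = np.abs(frames_fft(original)) * 10.0 ** (-20.0 / 20.0)
    scale = np.minimum(1.0, limit / np.maximum(np.abs(d_spec), 1e-12))
    return overlap_add(d_spec * scale, len(delta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16000)            # stand-in for a 1 s audio clip
    step = rng.standard_normal(16000) * 0.1   # stand-in for a loss-gradient step
    adv = x + project_below_threshold(step, x)
    print("max perturbation:", np.max(np.abs(adv - x)))
```

Iterating such a gradient step followed by this projection is the general pattern the attack describes: the gradient steers the transcription toward the attacker's target while the projection keeps the accumulated perturbation below the limits of human perception.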

