SirenAttack: Generating Adversarial Audio for End-to-End Acoustic Systems

01/23/2019
by   Tianyu Du, et al.
0

Despite their immense popularity, deep learning-based acoustic systems are inherently vulnerable to adversarial attacks, wherein maliciously crafted audios trigger target systems to misbehave. In this paper, we present SirenAttack, a new class of attacks to generate adversarial audios. Compared with existing attacks, SirenAttack highlights with a set of significant features: (i) versatile -- it is able to deceive a range of end-to-end acoustic systems under both white-box and black-box settings; (ii) effective -- it is able to generate adversarial audios that can be recognized as specific phrases by target acoustic systems; and (iii) stealthy -- it is able to generate adversarial audios indistinguishable from their benign counterparts to human perception. We empirically evaluate SirenAttack on a set of state-of-the-art deep learning-based acoustic systems (including speech command recognition, speaker recognition and sound event classification), with results showing the versatility, effectiveness, and stealthiness of SirenAttack. For instance, it achieves 99.45 model, while the generated adversarial audios are also misinterpreted by multiple popular ASR platforms, including Google Cloud Speech, Microsoft Bing Voice, and IBM Speech-to-Text. We further evaluate three potential defense methods to mitigate such attacks, including adversarial training, audio downsampling, and moving average filtering, which leads to promising directions for further research.

READ FULL TEXT

page 1

page 8

page 9

page 13

research
03/10/2023

MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems

Automatic speech recognition (ASR) systems based on deep neural networks...
research
03/18/2019

Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems

Voice Processing Systems (VPSes), now widely deployed, have been made si...
research
03/02/2023

Defending against Adversarial Audio via Diffusion Model

Deep learning models have been widely used in commercial acoustic system...
research
12/13/2018

TextBugger: Generating Adversarial Text Against Real-world Applications

Deep Learning-based Text Understanding (DLTU) is the backbone technique ...
research
11/09/2017

Crafting Adversarial Examples For Speech Paralinguistics Applications

Computational paralinguistic analysis is increasingly being used in a wi...
research
10/19/2021

Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information

Adversarial attacks against commercial black-box speech platforms, inclu...
research
06/07/2022

Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

Speaker recognition systems (SRSs) have recently been shown to be vulner...

Please sign up or login with your details

Forgot password? Click here to reset