Defending Adversarial Attacks on Cloud-aided Automatic Speech Recognition Systems
With the advancement of deep learning based speech recognition technology, an increasing number of cloud-aided automatic voice assistant applications, such as Google Home and Amazon Echo, and cloud AI services, such as IBM Watson, are emerging in our daily life. In a typical usage scenario, after keyword activation, the user's voice is recorded and submitted to the cloud for automatic speech recognition (ASR), and further actions may then be triggered depending on the user's commands. However, recent research shows that deep learning based systems can be easily attacked by adversarial examples, and ASR systems in particular have been found vulnerable to audio adversarial examples. Unfortunately, very few defenses against audio adversarial attacks are known in the literature, and constructing a generic and robust defense mechanism remains an open problem. In this work, we propose several proactive defense mechanisms against targeted audio adversarial examples in ASR systems via code modulation and audio compression. We then show the effectiveness of the proposed strategies through extensive evaluation on natural datasets.
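To give intuition for the compression-based defense family the abstract mentions, the sketch below shows a simple lossy preprocessing step: re-quantizing audio samples to a coarser bit depth before they reach the ASR model, so that small adversarial perturbations (below half the quantization step) are erased. This is a minimal illustrative assumption about how such a defense can work, not the paper's exact method; the function name, bit depth, and sample values are hypothetical.

```python
def quantize_defense(samples, bits=8, full_scale=1.0):
    """Re-quantize float samples in [-full_scale, full_scale] to `bits` bits.

    Perturbations smaller than half the quantization step are destroyed,
    so a carefully crafted adversarial delta may no longer survive.
    """
    levels = 2 ** (bits - 1)          # e.g. 128 levels per half-range at 8 bits
    step = full_scale / levels        # quantization step size
    return [round(s / step) * step for s in samples]

# Hypothetical clean waveform samples and a small adversarial perturbation.
clean = [0.1024, -0.5000, 0.7331]
perturbed = [s + 0.001 for s in clean]  # delta well below the ~0.0078 step

# After the defense, both signals collapse to the same quantized waveform,
# so the perturbation no longer reaches the ASR model.
print(quantize_defense(clean) == quantize_defense(perturbed))  # → True
```

In practice such defenses trade off: coarser quantization (or stronger audio compression, e.g. MP3) removes larger perturbations but also degrades recognition accuracy on benign inputs.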