From Sound Representation to Model Robustness

07/27/2020
by Mohamad Esmaeilpour, et al.

In this paper, we demonstrate the extreme vulnerability of a residual deep neural network architecture (ResNet-18) to adversarial attacks on time-frequency representations of audio signals. We evaluate three 2D representations of environmental sound signals: Mel-frequency cepstral coefficients (MFCC), the short-time Fourier transform (STFT), and the discrete wavelet transform (DWT). ResNet-18 not only outperforms other dense deep learning classifiers (i.e., GoogLeNet and AlexNet) in recognition accuracy, but adversarial examples crafted against it also transfer considerably to other victim classifiers. Weighing the average budget allocated by the adversary against the cost of the attack, we observe an inverse relationship between high recognition accuracy and model robustness against six strong adversarial attacks. We investigate this relationship across the three 2D representation domains, which are commonly used to represent audio signals, on three benchmark environmental sound datasets. The experimental results show that while the ResNet-18 classifier trained on DWT spectrograms achieves the highest recognition accuracy, attacking this model is relatively more costly for the adversary than attacking models trained on the MFCC and STFT representations.
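As a rough illustration of what a 2D time-frequency representation of an audio signal looks like, the sketch below computes a log-magnitude STFT spectrogram with NumPy. It is not the authors' pipeline: the function name `stft_log_spectrogram`, the frame length, hop size, Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np

def stft_log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude STFT spectrogram of shape (freq_bins, frames).

    Hypothetical helper for illustration; parameters are assumed defaults.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short signals so at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT per frame; transpose so rows are frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # Convert to decibels; eps avoids log(0).
    return 20.0 * np.log10(magnitude + eps)

# Synthetic input (assumption): one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_log_spectrogram(tone)
# Index of the frequency bin with the highest average energy.
peak_bin = int(np.argmax(spec.mean(axis=1)))
```

The resulting array can be treated like a single-channel image, which is what allows image classifiers such as ResNet-18 to be applied to audio. With a 256-sample frame at 8 kHz, each bin spans 31.25 Hz, so the 1 kHz tone falls in bin 32.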

research
04/24/2019

A Robust Approach for Securing Audio Classification Against Adversarial Attacks

Adversarial audio attacks can be considered as a small perturbation unpe...
research
10/22/2019

Cross-Representation Transferability of Adversarial Perturbations: From Spectrograms to Audio Waveforms

This paper shows the susceptibility of spectrogram-based audio classifie...
research
06/22/2017

Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks

Recent successful applications of convolutional neural networks (CNNs) t...
research
10/26/2019

Detection of Adversarial Attacks and Characterization of Adversarial Subspace

Adversarial attacks have always been a serious threat for any data-drive...
research
04/15/2020

ESResNet: Environmental Sound Classification Based on Visual Domain Models

Environmental Sound Classification (ESC) is an active research area in t...
research
04/08/2019

Unsupervised Feature Learning for Environmental Sound Classification Using Cycle Consistent Generative Adversarial Network

In this paper we propose a novel environmental sound classification appr...
research
04/09/2020

Fast frequency discrimination and phoneme recognition using a biomimetic membrane coupled to a neural network

In the human ear, the basilar membrane plays a central role in sound rec...