ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition

06/03/2021
by Sergey Verbitskiy, et al.

We present a new convolutional neural network (CNN) architecture, based on ResNet, for audio pattern recognition tasks. The main modification is a new hyper-parameter that reduces the temporal size of tensors by increasing stride sizes; we call it "the decreasing temporal size parameter". Well-chosen values of this parameter reduce the number of multiply-add operations (multi-adds), which makes the system faster. This approach not only lowers computational complexity but can also preserve, and for the AudioSet dataset even improve, performance on audio pattern recognition tasks. Experiments on three datasets confirm this observation: the AudioSet dataset, the ESC-50 dataset, and RAVDESS. Our best system achieves state-of-the-art performance on the AudioSet dataset with an mAP of 0.450. We also transfer a model pre-trained on the AudioSet dataset to the ESC-50 dataset and RAVDESS and obtain state-of-the-art results with accuracies of 0.961 and 0.748, respectively. We call our system "ERANN" (Efficient Residual Audio Neural Network).
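To illustrate why a larger temporal stride lowers the multi-add count, the following minimal sketch estimates the cost of a single 2D convolution over a time-frequency input. It is not the ERANN layer configuration: the input shape (1000 frames by 128 frequency bins), the channel counts, the 3x3 kernel, and the function names `conv_output_size` and `conv2d_multi_adds` are all assumptions chosen for illustration.

```python
def conv_output_size(size, kernel, stride, padding):
    """Output length of a convolution along one axis (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_multi_adds(time_frames, freq_bins, c_in, c_out,
                      kernel=3, stride_t=1, stride_f=1, padding=1):
    """Return (out_t, out_f, multi_adds) for one 2D convolution layer.

    Each output element needs kernel * kernel * c_in multiply-adds,
    and there are out_t * out_f * c_out output elements.
    """
    out_t = conv_output_size(time_frames, kernel, stride_t, padding)
    out_f = conv_output_size(freq_bins, kernel, stride_f, padding)
    multi_adds = out_t * out_f * kernel * kernel * c_in * c_out
    return out_t, out_f, multi_adds


if __name__ == "__main__":
    # Hypothetical input: 1000 time frames, 128 frequency bins, 1 -> 64 channels.
    for stride_t in (1, 2, 4):
        out_t, out_f, madds = conv2d_multi_adds(1000, 128, 1, 64,
                                                stride_t=stride_t)
        print(f"temporal stride {stride_t}: output {out_t}x{out_f}, "
              f"{madds:,} multi-adds")
```

Under these assumptions, doubling the temporal stride halves the number of output frames and therefore the multi-adds of this layer. Every later layer also operates on the shorter tensor, so the savings compound through the network.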
