Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block

11/05/2022
by Yunhao Chen, et al.

Recently, massive architectures based on Convolutional Neural Networks (CNNs) and self-attention mechanisms have become necessary for audio classification. While these techniques are state-of-the-art, their effectiveness can only be guaranteed with huge computational costs and parameter counts, large amounts of data augmentation, transfer from large datasets, and other tricks. By exploiting the lightweight nature of audio, we propose an efficient network structure called the Paired Inverse Pyramid Structure (PIP) and a network called the Paired Inverse Pyramid Structure MLP Network (PIPMN). PIPMN reaches 96% accuracy for Environmental Sound Classification (ESC) on the UrbanSound8K dataset and 93.2% accuracy for Music Genre Classification (MGC) on the GTZAN dataset, with only 1 million parameters. Both results are achieved without data augmentation or model transfer. Public code is available at: https://github.com/JNAIC/PIPMN

research
03/27/2023

Data Augmentation for Environmental Sound Classification Using Diffusion Probabilistic Model with Top-k Selection Discriminator

Despite consistent advancement in powerful deep learning techniques in r...
research
11/16/2018

AclNet: efficient end-to-end audio classification CNN

We propose an efficient end-to-end convolutional neural network architec...
research
03/30/2018

Parallel Grid Pooling for Data Augmentation

Convolutional neural network (CNN) architectures utilize downsampling la...
research
04/25/2022

End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network

While efficient architectures and a plethora of augmentations for end-to...
research
06/01/2023

Adapting a ConvNeXt model to audio classification on AudioSet

In computer vision, convolutional neural networks (CNN) such as ConvNeXt...
research
05/23/2023

A Laplacian Pyramid Based Generative H&E Stain Augmentation Network

Hematoxylin and Eosin (H&E) staining is a widely used sample preparati...
