Investigating Multi-Feature Selection and Ensembling for Audio Classification

by Muhammad Turab, et al.
NUI Galway
Dublin City University

Deep Learning (DL) algorithms have shown impressive performance in diverse domains. Among them, audio has attracted many researchers over the last couple of decades due to some interesting patterns, particularly in the classification of audio data. For audio classification, feature selection and combination play a key role, as they have the potential to make or break the performance of any DL model. To investigate this role, we conduct an extensive evaluation of the performance of several cutting-edge DL models (i.e., Convolutional Neural Network, EfficientNet, MobileNet, Support Vector Machine, and Multi-Layer Perceptron) with various state-of-the-art audio features (i.e., Mel Spectrogram, Mel Frequency Cepstral Coefficients, and Zero Crossing Rate), used either independently or in combination (i.e., through ensembling), on three different datasets (i.e., Free Spoken Digits Dataset, Audio Urdu Digits Dataset, and Audio Gujarati Digits Dataset). Overall, the results suggest that feature selection depends on both the dataset and the model. However, feature combinations should be restricted to only those features that already achieve good performance when used individually (i.e., mostly Mel Spectrogram and Mel Frequency Cepstral Coefficients). Such feature combination/ensembling enabled us to outperform the previous state-of-the-art results irrespective of our choice of DL model.
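
The abstract does not state which ensembling rule was used, so the following is only a minimal sketch of one common option: soft voting, i.e., averaging the class probabilities produced by one model per feature. It also illustrates the recommendation to combine only features that already perform well on their own. The function names, the accuracy threshold, and all numbers are hypothetical and chosen purely for illustration; they are not results from the paper.

```python
from typing import Dict, List, Sequence


def select_features(val_accuracy: Dict[str, float], min_accuracy: float) -> List[str]:
    """Keep only features whose standalone model reaches min_accuracy (assumed criterion)."""
    return sorted(name for name, acc in val_accuracy.items() if acc >= min_accuracy)


def soft_vote(
    probs_by_feature: Dict[str, Sequence[Sequence[float]]],
    selected: Sequence[str],
) -> List[int]:
    """Average the class probabilities of the selected per-feature models,
    then return the index of the most probable class for each sample."""
    if not selected:
        raise ValueError("no feature passed the accuracy threshold")
    n_samples = len(probs_by_feature[selected[0]])
    predictions = []
    for i in range(n_samples):
        rows = [probs_by_feature[name][i] for name in selected]
        # Column-wise mean over the selected models for sample i.
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        predictions.append(max(range(len(mean)), key=mean.__getitem__))
    return predictions


# Hypothetical standalone validation accuracies, one model per feature.
val_accuracy = {"mel_spectrogram": 0.97, "mfcc": 0.95, "zcr": 0.40}

# Hypothetical class probabilities for two samples and three classes.
probs = {
    "mel_spectrogram": [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    "mfcc": [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    "zcr": [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]],
}

selected = select_features(val_accuracy, min_accuracy=0.90)
predictions = soft_vote(probs, selected)
print(selected, predictions)
```

In this toy setup the weak Zero Crossing Rate model is excluded before voting. If it were included, its confident but unreliable probabilities would change the prediction for the second sample, which is the kind of degradation the restriction is meant to avoid.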

