Transformer-based Sequence Labeling for Audio Classification based on MFCCs

04/30/2023
by C. S. Sonali, et al.

Audio classification is vital in areas such as speech and music recognition. Extracting features from the audio signal, such as Mel-spectrograms and MFCCs, is a critical step in audio classification. These features are transformed into spectrograms for classification. Researchers have explored various techniques to classify spectrograms, including traditional machine learning and deep learning methods, but these can be computationally expensive. To simplify this process, a more straightforward approach inspired by sequence classification in NLP can be used. This paper proposes a Transformer-encoder-based model for audio classification using MFCCs. The model was benchmarked on the ESC-50, Speech Commands v0.02 and UrbanSound8k datasets and showed strong performance, with its highest accuracy of 95.2% on the UrbanSound8k dataset. The model has only 127,544 parameters in total, making it lightweight yet highly efficient at the audio classification task.
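To give a rough sense of how a Transformer encoder over MFCC frames can stay in the range of roughly 100,000 parameters, the sketch below counts the trainable parameters of a standard encoder stack. The configuration (40 MFCC coefficients per frame, model width 64, feed-forward width 128, 3 layers, parameter-free sinusoidal positions, a single linear classification head) is a hypothetical choice for illustration only; the abstract does not state the paper's architecture, so the result is not expected to match 127,544.

```python
def encoder_layer_params(d_model: int, d_ff: int) -> int:
    """Trainable parameters in one standard Transformer encoder layer."""
    # Query, key, value and output projections, each d_model x d_model plus a bias.
    attention = 4 * (d_model * d_model + d_model)
    # Two linear layers: d_model -> d_ff and d_ff -> d_model, each with a bias.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms, each with a scale and a shift vector of size d_model.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def classifier_params(n_mfcc: int, d_model: int, d_ff: int,
                      n_layers: int, n_classes: int) -> int:
    """Parameters of a hypothetical MFCC sequence classifier.

    Assumes a linear projection of each MFCC frame to d_model,
    parameter-free sinusoidal positional encodings, and a single
    linear head applied to a pooled encoder output.
    """
    input_projection = n_mfcc * d_model + d_model
    encoder = n_layers * encoder_layer_params(d_model, d_ff)
    head = d_model * n_classes + n_classes
    return input_projection + encoder + head


# Hypothetical configuration, NOT the one used in the paper.
# UrbanSound8k has 10 classes.
total = classifier_params(n_mfcc=40, d_model=64, d_ff=128,
                          n_layers=3, n_classes=10)
print(total)  # 103690
```

Note that the number of attention heads does not appear in the count: splitting the projections into heads changes how the weights are used, not how many there are.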

research
05/30/2023

Audio classification using ML methods

Machine Learning systems have achieved outstanding performance in differ...
research
11/09/2018

Audio Spectrogram Factorization for Classification of Telephony Signals below the Auditory Threshold

Traffic Pumping attacks are a form of high-volume SPAM that target telep...
research
07/19/2022

GAFX: A General Audio Feature eXtractor

Most machine learning models for audio tasks are dealing with a handcraf...
research
07/12/2022

EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

In audio classification, differentiable auditory filterbanks with few pa...
research
08/31/2018

Speaker Fluency Level Classification Using Machine Learning Techniques

Level assessment for foreign language students is necessary for putting ...
research
10/07/2020

Improving the efficiency of spectral features extraction by structuring the audio files

The extraction of spectral features from a music clip is a computational...
research
05/22/2023

LEAN: Light and Efficient Audio Classification Network

Over the past few years, audio classification task on large-scale datase...
