Music Auto-tagging Using CNNs and Mel-spectrograms With Reduced Frequency and Time Resolution

11/12/2019
by Andres Ferraro, et al.

Automatic tagging of music is an important research topic in Music Information Retrieval, and it has improved with advances in deep learning. In particular, many state-of-the-art systems use Convolutional Neural Networks and operate on mel-spectrogram representations of the audio. In this paper, we compare commonly used mel-spectrogram representations and evaluate the model performance that can be achieved when the input size is reduced by using fewer frequency bands and a coarser time resolution. We use the MagnaTagATune dataset for comprehensive performance comparisons and then compare selected configurations on the larger Million Song Dataset. The results of this study can help researchers and practitioners decide on the trade-off between model accuracy, data storage size, and training and inference times.
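To make the input-size trade-off concrete, the following minimal sketch estimates how the shape and storage footprint of a mel-spectrogram change with the number of mel bands and the hop size. It is not taken from the paper: the clip duration, sample rate, hop sizes, band counts, the float32 storage assumption, and the frame-count formula (a centered STFT with one extra frame) are all illustrative assumptions, and the function names are hypothetical.

```python
def mel_input_shape(duration_s, sample_rate, n_mels, hop_length):
    """Return (n_mels, n_frames) for a clip, assuming a centered STFT
    that yields 1 + n_samples // hop_length frames (an assumption)."""
    n_samples = int(duration_s * sample_rate)
    n_frames = 1 + n_samples // hop_length
    return n_mels, n_frames


def storage_bytes(shape, bytes_per_value=4):
    """Bytes needed to store one spectrogram, assuming float32 values."""
    n_mels, n_frames = shape
    return n_mels * n_frames * bytes_per_value


if __name__ == "__main__":
    # Hypothetical settings chosen only to illustrate the trade-off.
    duration_s = 29
    sample_rate = 16000
    configs = [(128, 256), (96, 256), (48, 256), (128, 512), (48, 512)]

    baseline = storage_bytes(mel_input_shape(duration_s, sample_rate, 128, 256))
    for n_mels, hop_length in configs:
        shape = mel_input_shape(duration_s, sample_rate, n_mels, hop_length)
        size = storage_bytes(shape)
        print(f"n_mels={n_mels:3d} hop={hop_length:3d} "
              f"shape={shape} bytes={size} ratio={size / baseline:.2f}")
```

Under these assumptions, halving the number of bands or doubling the hop size each roughly halves the stored input, which is the kind of saving that has to be weighed against tagging accuracy.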


research
03/06/2017

Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms

Recently, the end-to-end approach that learns hierarchical representatio...
research
09/06/2017

A Comparison on Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging

Deep neural networks (DNN) have been successfully applied for music clas...
research
06/01/2020

Evaluation of CNN-based Automatic Music Tagging Models

Recent advances in deep learning accelerated the development of content-...
research
06/20/2019

Adversarial Learning for Improved Onsets and Frames Music Transcription

Automatic music transcription is considered to be one of the hardest pro...
research
06/07/2017

The Effects of Noisy Labels on Deep Convolutional Neural Networks for Music Tagging

Deep neural networks (DNN) have been successfully applied to music class...
research
04/09/2021

Larger-Context Tagging: When and Why Does It Work?

The development of neural networks and pretraining techniques has spawne...
research
11/13/2017

Invariances and Data Augmentation for Supervised Music Transcription

This paper explores a variety of models for frame-based music transcript...
