To catch a chorus, verse, intro, or anything else: Analyzing a song with structural functions

05/29/2022
by Ju-Chiang Wang, et al.

Conventional music structure analysis algorithms aim to divide a song into segments and to group them with abstract labels (e.g., 'A', 'B', and 'C'). Explicitly identifying the function of each segment (e.g., 'verse' or 'chorus') is rarely attempted, despite its many applications. We introduce a multi-task deep learning framework to model these structural semantic labels directly from audio by estimating "verseness," "chorusness," and so forth, as a function of time. We propose a 7-class taxonomy (i.e., intro, verse, chorus, bridge, outro, instrumental, and silence) and provide rules to consolidate annotations from four disparate datasets. We also propose to use a spectral-temporal Transformer-based model, called SpecTNT, which can be trained with an additional connectionist temporal localization (CTL) loss. In cross-dataset evaluations using four public datasets, we demonstrate the effectiveness of the SpecTNT model and CTL loss, and obtain strong results overall: the proposed system outperforms state-of-the-art chorus-detection and boundary-detection methods at detecting choruses and boundaries, respectively.
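To make the idea of "verseness" and "chorusness" as functions of time more concrete, the following is a minimal sketch of how per-frame class scores over the 7-class taxonomy could be collapsed into labeled segments. It is an illustration only: the per-frame argmax decoding, the function name activations_to_segments, the helper one_hot, and the frame_seconds hop size are all assumptions made for this example and are not taken from the paper.

```python
from itertools import groupby

# The 7-class taxonomy named in the abstract.
LABELS = ["intro", "verse", "chorus", "bridge", "outro", "instrumental", "silence"]


def activations_to_segments(activations, frame_seconds):
    """Collapse per-frame class scores into (start, end, label) segments.

    activations: list of frames, each a list of 7 scores ordered as LABELS.
    frame_seconds: hypothetical time step between frames, in seconds.
    """
    # Pick the highest-scoring label in each frame. This plain argmax is an
    # assumption for illustration, not the decoding used by the authors.
    frame_labels = [
        LABELS[max(range(len(LABELS)), key=frame.__getitem__)]
        for frame in activations
    ]

    # Merge runs of identical consecutive labels into one segment each.
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((start * frame_seconds, (start + length) * frame_seconds, label))
        start += length
    return segments


def one_hot(label):
    """Hypothetical helper: a toy score vector that fully favors one label."""
    return [1.0 if name == label else 0.0 for name in LABELS]


# Toy input: 2 frames of intro, 3 of verse, 2 of chorus, at 0.5 s per frame.
toy = [one_hot("intro")] * 2 + [one_hot("verse")] * 3 + [one_hot("chorus")] * 2
segments = activations_to_segments(toy, frame_seconds=0.5)
print(segments)
```

In this sketch a segment boundary falls wherever the winning label changes between adjacent frames, so the same curves yield both the function labels and the boundaries. A real system would likely smooth the curves or decode them more carefully before segmenting.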


