Transfer Learning for Improving Singing-voice Detection in Polyphonic Instrumental Music

08/11/2020
by   Yuanbo Hou, et al.

Detecting singing voice in polyphonic instrumental music is critical to music information retrieval. Training a robust vocal detector requires a large dataset labeled as vocal or non-vocal at the frame level. However, frame-level labeling is time-consuming and labor-intensive, so few well-labeled datasets are available for singing-voice detection (S-VD). Hence, we propose a data augmentation method for S-VD based on transfer learning. In this study, clean speech clips with voice activity endpoints and separate instrumental music clips are artificially added together to simulate polyphonic vocals, and the resulting mixtures are used to train a vocal/non-vocal detector. Because articulation and phonation differ between speaking and singing, the vocal detector trained on the artificial dataset does not match real polyphonic music well, in which singing vocals are mixed with instrumental accompaniment. To reduce this mismatch, transfer learning is used to transfer the knowledge learned from the artificial speech-plus-music training set to a small but matched polyphonic dataset, i.e., singing vocals with accompaniment. By transferring the related knowledge to compensate for the lack of well-labeled training data in S-VD, the proposed data augmentation method can improve S-VD performance, with an F-score improvement from 89.5
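The data simulation step described in the abstract (adding clean speech with known voice activity endpoints to instrumental music, then using the endpoints as frame-level vocal/non-vocal labels) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `frame_labels` and `mix_at_snr`, the fixed mixing ratio in dB, the frame and hop sizes, and the "at least half the frame is voiced" labeling rule are all assumptions made for the example. The transfer learning stage itself is not shown.

```python
import math


def frame_labels(num_samples, endpoints, frame_len, hop):
    """Return one 0/1 label per frame.

    `endpoints` is a list of (start, end) sample indices of voiced segments.
    A frame is labeled vocal (1) when at least half of it overlaps voiced
    segments. This threshold is an assumption, not taken from the paper.
    """
    labels = []
    for start in range(0, num_samples - frame_len + 1, hop):
        end = start + frame_len
        voiced = sum(
            max(0, min(end, seg_end) - max(start, seg_start))
            for seg_start, seg_end in endpoints
        )
        labels.append(1 if 2 * voiced >= frame_len else 0)
    return labels


def mix_at_snr(speech, music, snr_db):
    """Add music to speech so that the speech-to-music power ratio is snr_db.

    Both inputs are lists of float samples at the same sampling rate.
    The longer signal is truncated to the length of the shorter one.
    """
    n = min(len(speech), len(music))
    speech, music = speech[:n], music[:n]
    p_speech = sum(x * x for x in speech) / n
    p_music = sum(x * x for x in music) / n
    if p_speech == 0.0 or p_music == 0.0:
        # Nothing to scale against; fall back to a plain sum.
        return [s + m for s, m in zip(speech, music)]
    gain = math.sqrt(p_speech / (p_music * 10 ** (snr_db / 10)))
    return [s + gain * m for s, m in zip(speech, music)]


if __name__ == "__main__":
    sr = 16000  # hypothetical sampling rate
    # Synthetic stand-ins for a speech clip and a music clip (1 second each).
    endpoints = [(4000, 12000)]  # voiced region, in samples
    speech = [
        math.sin(2 * math.pi * 220 * i / sr) if 4000 <= i < 12000 else 0.0
        for i in range(sr)
    ]
    music = [0.5 * math.sin(2 * math.pi * 440 * i / sr) for i in range(sr)]

    mixture = mix_at_snr(speech, music, snr_db=5.0)
    labels = frame_labels(len(mixture), endpoints, frame_len=400, hop=160)
    print(len(mixture), len(labels), sum(labels))
```

Each mixture and its label sequence would then form one training example for the vocal/non-vocal detector. In practice the clips would be loaded from audio files, and the mixing ratio would likely be varied across examples.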

research
06/01/2023

Transfer Learning for Underrepresented Music Generation

This paper investigates a combinational creativity approach to transfer ...
research
02/17/2021

End-to-end lyrics Recognition with Voice to Singing Style Transfer

Automatic transcription of monophonic/polyphonic music is a challenging ...
research
04/14/2023

Adapting Meter Tracking Models to Latin American Music

Beat and downbeat tracking models have improved significantly in recent ...
research
04/13/2021

Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Lexical Information Fusion

Textual escalation detection has been widely applied to e-commerce compa...
research
03/24/2021

Transfer Learning for Piano Sustain-Pedal Detection

Detecting piano pedalling techniques in polyphonic music remains a chall...
research
04/12/2023

A Phoneme-Informed Neural Network Model for Note-Level Singing Transcription

Note-level automatic music transcription is one of the most representati...
research
03/04/2019

Improving singing voice separation using Deep U-Net and Wave-U-Net with data augmentation

State-of-the-art singing voice separation is based on deep learning maki...
