Semi-supervised learning using teacher-student models for vocal melody extraction

08/14/2020
by Sangeun Kum, et al.

The lack of labeled data is a major obstacle in many music information retrieval tasks such as melody extraction, where labeling is extremely laborious or costly. Semi-supervised learning (SSL) alleviates this problem by leveraging a large amount of unlabeled data. In this paper, we propose an SSL method using teacher-student models for vocal melody extraction. The teacher model is pre-trained with labeled data and guides the student model to make identical predictions given unlabeled input in a self-training setting. We examine three setups of teacher-student models with different data augmentation schemes and loss functions. Also, because labeled data is scarce in the test phase as well, we artificially generate large-scale test data with pitch labels from unlabeled data using an analysis-synthesis method. The results show that the SSL method significantly improves performance over supervised learning alone, and that the improvement depends on the teacher-student models, the size of the unlabeled data, the number of self-training iterations, and other training details. We also find that it is essential to ensure that the unlabeled audio contains vocal parts. Finally, we show that the proposed SSL method enables a baseline convolutional recurrent neural network model to achieve performance comparable to state-of-the-art methods.
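
The self-training loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for the convolutional recurrent network, Gaussian noise stands in for audio data augmentation, and the names `fit_centroids`, `predict`, `self_train`, `min_margin`, and `noise` are hypothetical choices made for illustration. The confidence filter (`min_margin`) is likewise an assumption, loosely standing in for the observation that unlabeled audio should actually contain vocals.

```python
import random


def fit_centroids(xs, ys):
    """Fit a toy 'model': the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(model, x):
    """Return (label, margin): nearest class and its lead over the runner-up."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), y) for y, c in model.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(labeled_x, labeled_y, unlabeled_x,
               iterations=3, min_margin=0.5, noise=0.1, seed=0):
    """Teacher-student self-training (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    # 1. Pre-train the teacher on labeled data only.
    teacher = fit_centroids(labeled_x, labeled_y)
    for _ in range(iterations):
        pseudo_x, pseudo_y = [], []
        for x in unlabeled_x:
            # 2. The teacher labels the clean unlabeled input.
            label, margin = predict(teacher, x)
            if margin >= min_margin:  # assumed confidence filter
                # 3. The student sees a noisy (augmented) copy with that label.
                pseudo_x.append([v + rng.gauss(0.0, noise) for v in x])
                pseudo_y.append(label)
        # 4. Train the student on labeled plus pseudo-labeled data.
        student = fit_centroids(labeled_x + pseudo_x, labeled_y + pseudo_y)
        # 5. The student becomes the teacher for the next iteration.
        teacher = student
    return teacher


# Toy usage: two well-separated classes, few labels, more unlabeled points.
labeled_x = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.8, 3.1]]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [[0.1, -0.2], [0.3, 0.2], [2.9, 2.7], [3.2, 3.1], [1.5, 1.5]]
model = self_train(labeled_x, labeled_y, unlabeled_x)
print(predict(model, [0.1, 0.1])[0], predict(model, [3.0, 2.9])[0])
```

In this sketch the number of iterations, the amount of unlabeled data, and the augmentation strength are explicit parameters, mirroring the factors the abstract reports as influencing the improvement.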

research
11/18/2022

Self-Transriber: Few-shot Lyrics Transcription with Self-training

The current lyrics transcription approaches heavily rely on supervised l...
research
06/03/2021

Noisy student-teacher training for robust keyword spotting

We propose self-training with noisy student-teacher approach for streami...
research
02/22/2022

A Semi-Supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues

Identifying breakdowns in ongoing dialogues helps to improve communicati...
research
03/25/2022

Pseudo-Label Transfer from Frame-Level to Note-Level in a Teacher-Student Framework for Singing Transcription from Polyphonic Music

Lack of large-scale note-level labeled data is the major obstacle to sin...
research
02/10/2023

Q-Match: Self-supervised Learning by Matching Distributions Induced by a Queue

In semi-supervised learning, student-teacher distribution matching has b...
research
12/02/2021

From Consensus to Disagreement: Multi-Teacher Distillation for Semi-Supervised Relation Extraction

Lack of labeled data is a main obstacle in relation extraction. Semi-sup...
research
11/24/2019

DeepMimic: Mentor-Student Unlabeled Data Based Training

In this paper, we present a deep neural network (DNN) training approach ...