Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions

06/05/2018
by Rong Gong, et al.

In this paper, we tackle singing voice phoneme segmentation in the singing training scenario using only language-independent information: onsets and coarse prior durations. We propose a two-step method. In the first step, we jointly calculate the syllable and phoneme onset detection functions (ODFs) using a convolutional neural network (CNN). In the second step, the syllable and phoneme boundaries and labels are inferred hierarchically using a duration-informed hidden Markov model (HMM). For this inference, the a priori duration model supplies the transition probabilities of the HMM and the ODFs supply its emission probabilities. The method is language-independent by design: no phoneme class labels are used. For model training and algorithm evaluation, we collect a new jingju (also known as Beijing or Peking opera) solo singing voice dataset and manually annotate the boundaries and labels at the phrase, syllable and phoneme levels. The dataset is publicly available. The proposed method is compared with a baseline based on hidden semi-Markov model (HSMM) forced alignment. The evaluation results show that the proposed method outperforms the baseline by a large margin on both the segmentation and the onset detection tasks.
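The second step can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Gaussian duration prior whose standard deviation is a fixed fraction of the expected duration, a known number of segments, and boundaries pinned to the first and last frame. The function name `infer_boundaries` and the parameters `sigma_ratio` and `eps` are hypothetical. Dynamic programming picks the boundary frames that maximize the log ODF values (acting as emissions) plus the log duration prior (acting as transitions).

```python
import math


def infer_boundaries(odf, prior_durations, sigma_ratio=0.35, eps=1e-6):
    """Return N + 1 boundary frames for N consecutive segments.

    odf: per-frame onset strength in [0, 1] (e.g. a CNN output), length T.
    prior_durations: coarse expected duration of each segment, in frames.
    Assumption: Gaussian duration prior with sigma = sigma_ratio * duration.
    """
    T, N = len(odf), len(prior_durations)
    if N > T:
        raise ValueError("more segments than frames")
    log_odf = [math.log(min(max(p, eps), 1.0)) for p in odf]
    neg_inf = float("-inf")

    # score[k][t]: best log-score with the k-th boundary placed at frame t
    score = [[neg_inf] * (T + 1) for _ in range(N + 1)]
    back = [[0] * (T + 1) for _ in range(N + 1)]
    score[0][0] = 0.0  # the first onset is fixed at frame 0

    for k in range(1, N + 1):
        mu = prior_durations[k - 1]
        sigma = sigma_ratio * mu
        for t in range(k, T + 1):
            # the last boundary is the end of the audio, not an onset
            log_emit = log_odf[t] if t < T else 0.0
            for prev in range(k - 1, t):
                if score[k - 1][prev] == neg_inf:
                    continue
                log_trans = -0.5 * ((t - prev - mu) / sigma) ** 2
                cand = score[k - 1][prev] + log_trans + log_emit
                if cand > score[k][t]:
                    score[k][t] = cand
                    back[k][t] = prev

    # backtrack from the final boundary, fixed at frame T
    bounds = [T]
    for k in range(N, 0, -1):
        bounds.append(back[k][bounds[-1]])
    return bounds[::-1]


# Toy ODF with onset peaks at frames 0, 10 and 22; the priors are deliberately coarse.
odf = [0.01] * 30
for peak in (0, 10, 22):
    odf[peak] = 0.9
print(infer_boundaries(odf, [9, 13, 8]))  # [0, 10, 22, 30]
```

Under the same assumptions, the hierarchical inference could be approximated by running this once on the syllable ODF over a phrase, then once per inferred syllable on the phoneme ODF restricted to that syllable's frames.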


research
07/12/2017

Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks

This paper introduces a new score-informed method for the segmentation o...
research
07/12/2017

Audio to score matching by combining phonetic and duration information

We approach the singing phrase audio to score matching problem by using ...
research
12/27/2019

Synthesising Expressiveness in Peking Opera via Duration Informed Attention Network

This paper presents a method that generates expressive singing voice of ...
research
11/11/2017

Parkinson's Disease Digital Biomarker Discovery with Optimized Transitions and Inferred Markov Emissions

We search for digital biomarkers from Parkinson's Disease by observing a...
research
01/29/2017

Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices

In a recent conference paper, we have reported a rhythm transcription me...
research
08/17/2021

Neonatal Bowel Sound Detection Using Convolutional Neural Network and Laplace Hidden Semi-Markov Model

Abdominal auscultation is a convenient, safe and inexpensive method to a...
research
10/30/2022

Reward Shaping Using Convolutional Neural Network

In this paper, we propose Value Iteration Network for Reward Shaping (VI...
