Improving Lyrics Alignment through Joint Pitch Detection

02/03/2022
by Jiawen Huang, et al.

In recent years, the accuracy of automatic lyrics alignment methods has increased considerably. Yet many current approaches employ frameworks designed for automatic speech recognition (ASR) and do not exploit properties specific to music. Pitch is an important musical attribute of the singing voice, but current systems often ignore it because the lyrics content is considered independent of the pitch. In practice, however, the two are temporally correlated, as note starts often coincide with phoneme starts. At the same time, pitch is usually annotated with high temporal accuracy in ground-truth data, while the timing of lyrics is often available only at the line (or word) level. In this paper, we propose a multi-task learning approach for lyrics alignment that incorporates pitch and can thus make use of a new source of highly accurate temporal information. Our results show that this approach improves alignment accuracy. As an additional contribution, we show that integrating boundary detection into the forced-alignment algorithm reduces cross-line errors, which improves accuracy even further.
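
The abstract does not spell out how boundary detection enters the forced-alignment step, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It runs a monotonic Viterbi alignment of frames to a known token sequence and, as an assumption, adds a weighted boundary log-probability whenever the path enters a token that starts a new lyrics line. The names `forced_align`, `line_starts`, `boundary_log_probs`, and `boundary_weight` are hypothetical.

```python
import math


def forced_align(frame_log_probs, tokens, line_starts=frozenset(),
                 boundary_log_probs=None, boundary_weight=1.0):
    """Monotonic Viterbi alignment of frames to a known token sequence.

    frame_log_probs[t][k]: log-probability of token class k at frame t.
    tokens: target class indices in sung order.
    line_starts: positions in `tokens` that begin a lyrics line (assumed input).
    boundary_log_probs[t]: log-probability of a line boundary at frame t
        (assumed to come from a separate boundary detector).
    Returns the position in `tokens` assigned to each frame.
    """
    T, S = len(frame_log_probs), len(tokens)
    if S == 0 or T < S:
        raise ValueError("need at least one frame per token")
    NEG = -math.inf
    score = [[NEG] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = frame_log_probs[0][tokens[0]]
    for t in range(1, T):
        for s in range(min(t + 1, S)):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else NEG
            # Assumption: entering a line-initial token is rewarded at frames
            # where the boundary detector considers a boundary likely.
            if move > NEG and boundary_log_probs is not None and s in line_starts:
                move += boundary_weight * boundary_log_probs[t]
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best > NEG:
                score[t][s] = best + frame_log_probs[t][tokens[s]]
    # Backtrace from the last token at the last frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

When the frame-level token probabilities are ambiguous, the boundary term is what decides where the path crosses into the next line, which is the kind of cross-line error the abstract refers to. In a real system the frame and boundary scores would come from the trained network rather than hand-written lists.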


research
03/02/2021

Long-Running Speech Recognizer: An End-to-End Multi-Task Learning Framework for Online ASR and VAD

When we use End-to-end automatic speech recognition (E2E-ASR) system for...
research
04/13/2020

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?

Automatic Speech Recognition (ASR) systems introduce word errors, which ...
research
10/18/2022

HMM vs. CTC for Automatic Speech Recognition: Comparison Based on Full-Sum Training from Scratch

In this work, we compare from-scratch sequence-level cross-entropy (full...
research
06/13/2023

Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages

Lyrics alignment gained considerable attention in recent years. State-of...
research
06/25/2019

Acoustic Modeling for Automatic Lyrics-to-Audio Alignment

Automatic lyrics to polyphonic audio alignment is a challenging task not...
research
11/04/2021

MT3: Multi-Task Multitrack Music Transcription

Automatic Music Transcription (AMT), inferring musical notes from raw au...
research
08/05/2021

MSTRE-Net: Multistreaming Acoustic Modeling for Automatic Lyrics Transcription

This paper makes several contributions to automatic lyrics transcription...
