Low Resource Audio-to-Lyrics Alignment From Polyphonic Music Recordings

02/18/2021
by Emir Demirel, et al.

Aligning lyrics to long music recordings in a single pass can be memory-intensive. In this study, we present a novel method that performs audio-to-lyrics alignment with a low memory footprint regardless of the duration of the music recording. The proposed system first spots anchoring words within the audio signal. The recording is then segmented with respect to these anchors, and a second-pass alignment is performed to obtain the word timings. We show that our audio-to-lyrics alignment system performs competitively with the state of the art while requiring far fewer computational resources. In addition, we use our lyrics alignment system to segment the music recordings into sentence-level chunks, and we report lyrics transcription scores on the segmented recordings for a number of benchmark test sets. Finally, our experiments highlight the importance of the source separation step for good performance on the transcription and alignment tasks. For reproducibility, we publicly share our code with the research community.
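
The segmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Anchor` and `Segment` types, the `segment_by_anchors` function, and the greedy filtering of inconsistent anchors are all assumptions made for illustration. The sketch assumes that a word-spotting pass has already produced anchors as (lyrics word index, time in seconds) pairs, and it only shows how such anchors could cut a long recording into shorter pieces, each paired with its own span of lyrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Anchor:
    word_index: int  # position of the spotted word in the lyrics (hypothetical field)
    time: float      # spotted onset time in seconds (hypothetical field)


@dataclass
class Segment:
    start: float      # segment start time in seconds
    end: float        # segment end time in seconds
    words: List[str]  # lyrics expected within this segment


def segment_by_anchors(lyrics: List[str], anchors: List[Anchor],
                       duration: float) -> List[Segment]:
    """Cut a recording into segments bounded by anchoring words."""
    # Greedy filter (an assumption): walk the anchors in time order and keep
    # only those whose word index also increases, so that the cut points are
    # monotonic in both time and lyrics position.
    kept: List[Anchor] = []
    for a in sorted(anchors, key=lambda a: a.time):
        inside = 0 < a.word_index < len(lyrics) and 0.0 < a.time < duration
        if not inside:
            continue
        if not kept or (a.word_index > kept[-1].word_index
                        and a.time > kept[-1].time):
            kept.append(a)

    # Add the start and end of the recording as outer boundaries.
    bounds = [(0, 0.0)] + [(a.word_index, a.time) for a in kept]
    bounds.append((len(lyrics), duration))

    # Each pair of consecutive boundaries defines one segment.
    return [Segment(t0, t1, lyrics[i0:i1])
            for (i0, t0), (i1, t1) in zip(bounds, bounds[1:])]


lyrics = "a b c d e f".split()
anchors = [Anchor(word_index=2, time=10.0), Anchor(word_index=4, time=20.0)]
segments = segment_by_anchors(lyrics, anchors, duration=30.0)
```

In a pipeline of this shape, each segment would then be passed to a second-pass aligner on its own, so the memory needed for alignment would depend on the longest segment rather than on the full recording.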


research
01/31/2021

Structure-Aware Audio-to-Score Alignment using Progressively Dilated Convolutional Neural Networks

The identification of structural differences between a music performance...
research
07/29/2020

Improved Handling of Repeats and Jumps in Audio-Sheet Image Synchronization

This paper studies the problem of automatically generating piano score f...
research
07/22/2016

Similarity graphs for the concealment of long duration data loss in music

We present a novel method for the compensation of long duration data gap...
research
06/29/2018

Exploratory Analysis of a Large Flamenco Corpus using an Ensemble of Convolutional Neural Networks as a Structural Annotation Backend

We present computational tools that we developed for the analysis of a l...
research
10/23/2020

A Cross-Verification Approach for Protecting World Leaders from Fake and Tampered Audio

This paper tackles the problem of verifying the authenticity of speech r...
research
08/05/2021

MSTRE-Net: Multistreaming Acoustic Modeling for Automatic Lyrics Transcription

This paper makes several contributions to automatic lyrics transcription...
research
08/24/2023

Exploiting Time-Frequency Conformers for Music Audio Enhancement

With the proliferation of video platforms on the internet, recording mus...
