Transcribing Lyrics From Commercial Song Audio: The First Step Towards Singing Content Processing

04/15/2018
by Che-Ping Tsai, et al.

Spoken content processing (such as retrieval and browsing) is maturing, but singing content is still almost completely left out. Songs are human voice carrying plenty of semantic information, just as speech does, and may be considered a special type of speech with highly flexible prosody. The various problems in song audio, for example phone durations that change significantly over highly flexible pitch contours, make recognizing lyrics from song audio much more difficult than recognizing speech. This paper reports an initial attempt towards this goal. We collected music-removed versions of English songs directly from commercial singing content. The best results were obtained by a TDNN-LSTM with data augmentation by 3-fold speed perturbation plus some special approaches. The WER achieved (73.90%) is still relatively high.
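
For readers unfamiliar with the metric, the word error rate (WER) quoted above is conventionally the word-level edit distance between the reference lyrics and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring script; the function name `word_error_rate` and the lowercase, whitespace-split tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; a real scoring pipeline may normalize text differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Three of six reference words are missing from the hypothesis.
    print(word_error_rate("let it be let it be", "let it be"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, so a figure such as 73.90% means roughly three word-level errors for every four reference words.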
