NHSS: A Speech and Singing Parallel Database

12/01/2020
by Bidisha Sharma, et al.

We present a database of parallel recordings of speech and singing, collected and released by the Human Language Technology (HLT) laboratory at the National University of Singapore (NUS), called the NUS-HLT Speak-Sing (NHSS) database. We release this database to the public to support research activities that include, but are not limited to, comparative studies of the acoustic attributes of speech and singing signals, cooperative synthesis of speech and singing voices, and speech-to-singing conversion. The database consists of recordings of sung vocals of English pop songs, the spoken counterparts of the lyrics read by the singers in their natural reading manner, and manually prepared utterance-level and word-level annotations. The audio recordings in the NHSS database correspond to 100 songs sung and spoken by 10 singers, resulting in a total of 7 hours of audio data. There are 5 male and 5 female singers, each singing and reading the lyrics of 10 songs. In this paper, we discuss the design methodology of the database, analyze the similarities and dissimilarities in the characteristics of speech and singing voices, and provide some strategies to address the relationships between these characteristics for converting one to the other. We develop benchmark systems for speech-to-singing alignment, spectral mapping, and conversion using the NHSS database.


