Analysis and transformations of intensity in singing voice

04/08/2022
by Frederik Bous, et al.

In this paper we introduce a neural auto-encoder that transforms the voice intensity in recordings of singing voice. Since most recordings of singing voice are not annotated with voice intensity, we propose a means to estimate the relative voice intensity from the signal's timbre using a neural intensity estimator. Voice intensity and recorded signal power are related by a recording factor that is unknown for each recording, and we give two methods to overcome this: the recording factor can be learned alongside the weights of the intensity estimator, or a special loss function based on the scalar product can be used so that the estimate only has to match the contour of the recorded signal's power. The intensity models are used to condition a previously introduced bottleneck auto-encoder that disentangles its input, the mel-spectrogram, from the intensity. We evaluate the intensity models by their consistency and by their fitness to provide useful information to the auto-encoder. A perceptive test evaluates the perceived intensity change in transformed recordings and the synthesis quality. The test confirms that changing the conditional input changes the perceived intensity accordingly, which suggests that the proposed intensity models encode information about the voice intensity.
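The two ways of handling the unknown recording factor can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract gives no formulas, so the function names, the framing parameters, the dB representation, and the exact form of both losses (a per-recording additive offset in dB, and one minus the cosine similarity of mean-centred contours) are assumptions chosen only to show the idea.

```python
import numpy as np


def frame_power_db(signal, frame_len=1024, hop=256, eps=1e-10):
    """Frame-wise signal power in dB (framing parameters are assumptions)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + eps)


def offset_loss(pred_intensity, power_db, offset_db):
    """Assumed variant 1: mean squared error after adding a per-recording
    offset (in dB) that would be trained together with the estimator."""
    return np.mean((pred_intensity + offset_db - power_db) ** 2)


def contour_loss(pred_intensity, power_db, eps=1e-10):
    """Assumed variant 2: scalar-product loss on mean-centred contours.

    Returns 1 - cosine similarity, which is about 0 when the two contours
    have the same shape and ignores any constant offset or positive scale.
    """
    p = pred_intensity - pred_intensity.mean()
    q = power_db - power_db.mean()
    cos = np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q) + eps)
    return 1.0 - cos
```

In this sketch, recording the same performance at a lower gain shifts the power contour by a constant number of dB. `contour_loss` is unaffected by that shift, while `offset_loss` needs the matching `offset_db` to reach zero.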

