Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet

10/05/2021
by Max Morrison, et al.

Pitch-shifting and time-stretching, which modify the pitch and timing of an audio signal, are fundamental audio editing operations with applications in speech manipulation, audio-visual synchronization, and singing voice editing and synthesis. Thus far, methods for pitch-shifting and time-stretching that use digital signal processing (DSP) have been favored over deep learning approaches due to their speed and higher quality. However, even existing DSP-based methods for pitch-shifting and time-stretching induce artifacts that degrade audio quality. In this paper, we propose Controllable LPCNet (CLPCNet), an improved LPCNet vocoder capable of pitch-shifting and time-stretching of speech. For objective evaluation, we show that CLPCNet performs pitch-shifting of speech on unseen datasets with high accuracy relative to prior neural methods. For subjective evaluation, we demonstrate that the quality and naturalness of pitch-shifting and time-stretching with CLPCNet on unseen datasets meet or exceed those of competitive neural and DSP-based approaches.

research
10/06/2021

EdiTTS: Score-based Editing for Controllable Text-to-Speech

We present EdiTTS, an off-the-shelf speech editing methodology based on ...
research
04/22/2019

hf0: A hybrid pitch extraction method for multimodal voice

Pitch or fundamental frequency (f0) extraction is a fundamental problem ...
research
02/22/2022

Wavebender GAN: An architecture for phonetically meaningful speech manipulation

Deep learning has revolutionised synthetic speech quality. However, it h...
research
10/27/2020

Upsampling artifacts in neural audio synthesis

A number of recent advances in audio synthesis rely on neural upsamplers...
research
07/11/2023

Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets

Verifying the integrity of voice recording evidence for criminal investi...
research
04/14/2022

Streamable Neural Audio Synthesis With Non-Causal Convolutions

Deep learning models are mostly used in an offline inference fashion. Ho...
research
03/09/2023

CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing

Edit fidelity is a significant issue in open-world controllable generati...
