Extreme Audio Time Stretching Using Neural Synthesis

11/30/2022
by Leonardo Fierro, et al.

A deep neural network solution for time-scale modification (TSM) focused on large stretching factors is proposed, targeting environmental sounds. Traditional TSM artifacts such as transient smearing, loss of presence, and phasiness are heavily accentuated and cause poor audio quality when the TSM factor is four or larger. The weakness of established TSM methods, often based on a phase vocoder structure, lies in the poor description and scaling of the transient and noise components, or nuances, of a sound. Our novel solution combines a sines-transients-noise decomposition with an independent WaveNet synthesizer to provide a better description of the noise component and improved sound quality for large stretching factors. Results of a subjective listening test against four other TSM algorithms are reported, showing the proposed method to be often superior. The proposed method is stereo compatible and has a wide range of applications related to the slow motion of media content.
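For context, the conventional phase-vocoder structure that the abstract contrasts against can be sketched in a few lines of NumPy. This is not the proposed method (it has no sines-transients-noise decomposition and no WaveNet synthesizer); it is a minimal textbook-style baseline, and the function name `phase_vocoder_stretch`, the Hann window, and the default frame and hop sizes are all assumptions chosen for illustration.

```python
import numpy as np

def phase_vocoder_stretch(x, alpha, n_fft=1024, hop_a=64):
    """Stretch a mono signal x by factor alpha (>1 slows it down).

    Hypothetical helper: a basic phase vocoder, shown only as an example
    of the established TSM family mentioned in the abstract.
    """
    if len(x) < n_fft:
        raise ValueError("signal must be at least n_fft samples long")
    hop_s = int(round(alpha * hop_a))            # synthesis hop = alpha * analysis hop
    n = np.arange(n_fft)
    win = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)   # periodic Hann window
    # Nominal phase advance per sample for each FFT bin (radians/sample).
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft

    n_frames = 1 + (len(x) - n_fft) // hop_a
    y = np.zeros((n_frames - 1) * hop_s + n_fft)
    norm = np.zeros_like(y)
    prev_phase = None
    synth_phase = None

    for k in range(n_frames):
        frame = x[k * hop_a:k * hop_a + n_fft] * win
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)
        if k == 0:
            synth_phase = phase.copy()
        else:
            # Deviation from the nominal advance, wrapped to [-pi, pi).
            dev = phase - prev_phase - omega * hop_a
            dev = (dev + np.pi) % (2 * np.pi) - np.pi
            # Instantaneous frequency, re-integrated over the longer synthesis hop.
            synth_phase = synth_phase + (omega + dev / hop_a) * hop_s
        prev_phase = phase

        out = np.fft.irfft(mag * np.exp(1j * synth_phase), n=n_fft) * win
        start = k * hop_s
        y[start:start + n_fft] += out            # overlap-add
        norm[start:start + n_fft] += win ** 2    # window-energy normalization

    return y / np.maximum(norm, 1e-3)
```

A steady tone survives this kind of processing well, since each bin's phase is propagated as if it held a single stable sinusoid. Noise and transients do not fit that assumption, which is consistent with the abstract's point that such components are poorly described and scaled, especially at factors of four or more.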

research
10/25/2022

Enhanced Fuzzy Decomposition of Sound Into Sines, Transients, and Noise

The decomposition of sounds into sines, transients, and noise is a long-...
research
09/14/2023

DDSP-SFX: Acoustically-guided sound effects generation with differentiable digital signal processing

Controlling the variations of sound effects using neural audio synthesis...
research
07/09/2020

RWCP-SSD-Onomatopoeia: Onomatopoeic Word Dataset for Environmental Sound Synthesis

Environmental sound synthesis is a technique for generating a natural en...
research
03/04/2023

A General Framework for Learning Procedural Audio Models of Environmental Sounds

This paper introduces the Procedural (audio) Variational autoEncoder (Pr...
research
09/13/2023

Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis

Differentiable digital signal processing (DDSP) techniques, including me...
research
05/26/2023

Neural modeling of magnetic tape recorders

The sound of magnetic recording media, such as open-reel and cassette ta...
research
06/05/2022

Geometrically-Motivated Primary-Ambient Decomposition With Center-Channel Extraction

A geometrically-motivated method for primary-ambient decomposition is pr...
