Sound texture synthesis using RI spectrograms

10/21/2019
by Hugo Caracalla, et al.

This article introduces a new parametric synthesis method for sound textures based on existing works in visual and sound texture synthesis. Starting from a base sound signal, an optimization process is run until the cross-correlations between the feature maps of several untrained 2D Convolutional Neural Networks (CNNs) resemble those of an original sound texture. We use compressed RI spectrograms as input to the CNNs: this time-frequency representation stacks the real and imaginary parts of the Short-Time Fourier Transform (STFT) and thus implicitly contains both magnitude and phase information, allowing for convincing syntheses of various audio events. The optimization is, however, performed directly on the time signal to avoid STFT consistency issues. The results of an online perceptual evaluation are also detailed, and show that this method produces more realistic-sounding results than existing parametric methods on a wide array of textures.
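As a rough illustration of the two ingredients named in the abstract, the sketch below builds an RI spectrogram (real and imaginary STFT parts stacked as two channels) and computes a cross-correlation (Gram) matrix between channels. It is a minimal sketch and not the authors' implementation: the function names, the frame and hop sizes, and the sign-preserving log compression are all assumptions made for illustration, since the abstract does not specify them. For brevity, the Gram matrix is computed on the RI channels themselves; in the method described above it would be computed on the feature maps of the untrained CNNs.

```python
import numpy as np

def ri_spectrogram(signal, n_fft=512, hop=128):
    """Stack the real and imaginary STFT parts as two channels.

    Returns an array of shape (2, n_frames, n_fft // 2 + 1).
    n_fft and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1)  # complex STFT, one row per frame
    return np.stack([spec.real, spec.imag])

def compress(x, c=1000.0):
    """Hypothetical sign-preserving log compression (an assumption)."""
    return np.sign(x) * np.log1p(c * np.abs(x))

def gram_matrix(features):
    """Cross-correlations between channels of a (C, H, W) array."""
    n_channels = features.shape[0]
    flat = features.reshape(n_channels, -1)
    return flat @ flat.T / flat.shape[1]

# Example on one second of white noise at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
ri = compress(ri_spectrogram(signal))
gram = gram_matrix(ri)
print(ri.shape, gram.shape)
```

In a synthesis loop of the kind the abstract describes, statistics like `gram` would be computed for both the original texture and the signal being optimized, and the time signal would be updated to bring the two sets of statistics closer together.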

research
05/09/2019

Sound texture synthesis using convolutional neural networks

The following article introduces a new parametric synthesis algorithm fo...
research
09/25/2008

Audio Classification from Time-Frequency Texture

Time-frequency representations of audio signals often resemble texture i...
research
05/27/2015

Texture Synthesis Using Convolutional Neural Networks

Here we introduce a new model of natural textures based on the feature s...
research
07/14/2020

Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter

Conventional CNNs for texture synthesis consist of a sequence of (de)-co...
research
04/09/2020

Fast frequency discrimination and phoneme recognition using a biomimetic membrane coupled to a neural network

In the human ear, the basilar membrane plays a central role in sound rec...
research
03/30/2020

VaPar Synth – A Variational Parametric Model for Audio Synthesis

With the advent of data-driven statistical modeling and abundant computi...
research
03/04/2022

iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform

In recent text-to-speech synthesis and voice conversion systems, a mel-s...