Synthesizing Audio with Generative Adversarial Networks

02/12/2018
by Chris Donahue, et al.

While Generative Adversarial Networks (GANs) have seen wide success at the problem of synthesizing realistic images, they have seen little application to the problem of unsupervised audio generation. Unlike for images, a barrier to success is that the best discriminative representations for audio tend to be non-invertible, and thus cannot be used to synthesize listenable outputs. In this paper, we introduce WaveGAN, a first attempt at applying GANs to raw audio synthesis in an unsupervised setting. Our experiments on speech demonstrate that WaveGAN can produce intelligible words from a small vocabulary of human speech, as well as synthesize audio from other domains such as bird vocalizations, drums, and piano. Qualitatively, we find that human judges prefer the generated examples from WaveGAN over those from a method that naively applies GANs to image-like audio feature representations.

research
03/13/2019

Voice command generation using Progressive Wavegans

Generative Adversarial Networks (GANs) have become exceedingly popular i...
research
02/18/2019

Generative Adversarial Networks Synthesize Realistic OCT Images of the Retina

We report, to our knowledge, the first end-to-end application of Generat...
research
04/15/2021

EnvGAN: Adversarial Synthesis of Environmental Sounds for Data Augmentation

The research in Environmental Sound Classification (ESC) has been progre...
research
05/09/2023

Enhancing Gappy Speech Audio Signals with Generative Adversarial Networks

Gaps, dropouts and short clips of corrupted audio are a common problem a...
research
03/12/2021

Signal Representations for Synthesizing Audio Textures with Generative Adversarial Networks

Generative Adversarial Networks (GANs) currently achieve the state-of-th...
research
07/20/2020

Generative Hierarchical Features from Synthesizing Images

Generative Adversarial Networks (GANs) have recently advanced image synt...
research
05/04/2021

VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding

Influenced by the field of Computer Vision, Generative Adversarial Netwo...
