DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs

08/03/2021
by Javier Nistal, et al.

Generative Adversarial Networks (GANs) have achieved excellent audio synthesis quality in recent years. However, making them operable with semantically meaningful controls remains an open challenge. An obvious approach is to control the GAN by conditioning it on metadata contained in audio datasets. Unfortunately, audio datasets often lack the desired annotations, especially in the musical domain. One way to circumvent this lack of annotations is to generate them, for example, with an automatic audio-tagging system. The output probabilities of such systems (so-called "soft labels") carry rich information about the characteristics of the corresponding audio and can be used to distill the knowledge from a teacher model into a student model. In this work, we perform knowledge distillation from a large audio-tagging system into an adversarial audio synthesizer that we call DarkGAN. Results show that DarkGAN can synthesize musical audio with acceptable quality and exhibits moderate attribute control even with out-of-distribution input conditioning. We release the code and provide audio examples on the accompanying website.
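
The role of soft labels as conditioning can be illustrated with a minimal sketch. It is not the released DarkGAN code: it assumes a multi-label tagger whose per-tag probabilities come from a sigmoid over logits, it softens them with a temperature as is common in knowledge distillation, and it forms a generator input by concatenating them with latent noise. The function names, the logit values, and the temperature values are all hypothetical.

```python
import math


def soft_labels(logits, temperature=1.0):
    """Return one independent probability per tag (temperature-scaled sigmoid).

    Assumption: the teacher is a multi-label tagger, so each tag gets its own
    sigmoid rather than sharing a softmax. Higher temperatures pull every
    probability toward 0.5, exposing more of the teacher's relative ranking.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # sigmoid(x) written via tanh, which stays stable for large |x|
    return [0.5 * (1.0 + math.tanh(z / (2.0 * temperature))) for z in logits]


def conditioning_vector(noise, labels):
    """Concatenate latent noise and soft labels into one generator input."""
    return list(noise) + list(labels)


# Hypothetical teacher logits for three tags of a single audio clip.
teacher_logits = [4.0, -2.0, 0.0]

sharp = soft_labels(teacher_logits, temperature=1.0)   # close to hard labels
smooth = soft_labels(teacher_logits, temperature=4.0)  # softened labels

# Hypothetical two-dimensional latent noise, followed by the three soft labels.
generator_input = conditioning_vector([0.1, -0.3], smooth)
```

In this sketch, the student generator would be trained on `generator_input` instead of ground-truth annotations, which is how the teacher's predictions stand in for missing metadata.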


research
08/27/2020

DrumGAN: Synthesis of Drum Sounds With Timbral Feature Conditioning Using Generative Adversarial Networks

Synthetic creation of drum sounds (e.g., in drum machines) is commonly p...
research
05/04/2021

VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding

Influenced by the field of Computer Vision, Generative Adversarial Netwo...
research
03/14/2023

Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification

Data-Free Knowledge Distillation (DFKD) has recently attracted growing a...
research
09/03/2020

Intra-Utterance Similarity Preserving Knowledge Distillation for Audio Tagging

Knowledge Distillation (KD) is a popular area of research for reducing t...
research
02/23/2019

GANSynth: Adversarial Neural Audio Synthesis

Efficient audio synthesis is an inherently difficult machine learning ta...
research
04/02/2022

StyleWaveGAN: Style-based synthesis of drum sounds with extensive controls using generative adversarial networks

In this paper we introduce StyleWaveGAN, a style-based drum sound genera...
