GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning

11/10/2022
by Gaku Narita, et al.

We propose GANStrument, a generative adversarial model for instrument sound synthesis. Given a one-shot sound as input, it generates pitched instrument sounds that reflect the timbre of the input, fast enough for interactive use. By exploiting instance conditioning, GANStrument achieves better fidelity and diversity of synthesized sounds and generalizes better to various inputs. In addition, we introduce an adversarial training scheme for a pitch-invariant feature extractor that significantly improves pitch accuracy and timbre consistency. Experimental results show that GANStrument outperforms strong baselines that do not use instance conditioning in terms of generation quality and input editability. Qualitative examples are available online.
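
The abstract does not spell out the adversarial training scheme, so the following is only a minimal sketch of one common way to encourage pitch invariance, not the authors' method. It assumes a hypothetical pitch classifier (the adversary) that reads the extracted timbre feature and outputs logits over pitch classes. The adversary minimizes ordinary cross-entropy against the true pitch, while the feature extractor minimizes a confusion loss that is smallest when the adversary's prediction is uniform. All function names and the example logits are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adversary_loss(pitch_logits, true_pitch):
    """Cross-entropy minimized by the (hypothetical) pitch classifier."""
    return -math.log(softmax(pitch_logits)[true_pitch])


def extractor_confusion_loss(pitch_logits):
    """Cross-entropy against a uniform target.

    Minimized by the feature extractor; it reaches its minimum,
    log(num_classes), when the adversary cannot tell pitches apart.
    """
    probs = softmax(pitch_logits)
    return -sum(math.log(p) for p in probs) / len(probs)


# Assumed adversary outputs over 4 pitch classes (illustrative values only).
leaky_logits = [4.0, 0.0, 0.0, 0.0]      # feature still reveals the pitch
invariant_logits = [0.0, 0.0, 0.0, 0.0]  # feature carries no pitch cue

if __name__ == "__main__":
    print("extractor loss (leaky):    ", extractor_confusion_loss(leaky_logits))
    print("extractor loss (invariant):", extractor_confusion_loss(invariant_logits))
    print("adversary loss (leaky):    ", adversary_loss(leaky_logits, 0))
```

In this sketch the two losses pull in opposite directions: a feature that leaks pitch gives the adversary a low loss and the extractor a high one, and alternating updates would push the extractor toward features from which pitch cannot be recovered.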


