Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

04/12/2019
by Adrien Bitton, et al.

Generative models have thrived in computer vision, enabling unprecedented image processing applications, yet results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including semantic controls that can be adapted to different sound libraries and specific tags. These generative variables should allow expressive modulation of target musical qualities and continuous mixing into new styles. To this end, we train auto-encoders (AEs) on an orchestral database of individual note samples, along with their intrinsic attributes: note class, timbre domain, and extended playing techniques. We condition the decoder for control over the rendered note attributes and use latent adversarial training to learn expressive style parameters that can ultimately be mixed. We evaluate both generative performance and the latent representation; our ablation study demonstrates the effectiveness of the musical conditioning mechanisms. The proposed model generates notes as magnitude spectrograms from any probabilistic latent code sample, with expressive control over orchestral timbres and playing styles, and its training data subsets can be visualized directly in the 3D latent representation. Waveform rendering can be done offline with the Griffin-Lim algorithm (GLA). To allow real-time interaction, we fine-tune the decoder with a pretrained multi-head convolutional neural network (MCNN) and embed the full waveform generation pipeline in a plugin. Moreover, the encoder can process new input samples: after their latent attribute representation is manipulated, the decoder generates sample variations, much as an audio effect would. Our solution remains fast to train and can be applied directly to other sound domains, including a user's own libraries with custom sound tags that can be mapped to specific generative controls. As a result, it fosters creativity and intuitive experimentation with audio styles.
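To make the described pipeline concrete, here is a minimal sketch of the generation path: sample a latent code, concatenate one-hot note-attribute conditioning, decode a magnitude spectrogram, and render audio offline with Griffin-Lim. This is an illustration under assumed dimensions, not the authors' exact architecture: the decoder, its layer sizes, and the attribute counts (N_NOTE_CLASSES, N_TIMBRES, N_PLAYING_STYLES) are hypothetical placeholders; only torch and librosa.griffinlim are real APIs.

```python
# Minimal sketch (hypothetical dimensions): decode a conditioned latent
# code into a magnitude spectrogram and invert it offline with Griffin-Lim.
import torch
import torch.nn as nn
import librosa

LATENT_DIM = 3           # the paper visualizes a 3D latent space
N_NOTE_CLASSES = 12      # placeholder attribute sizes
N_TIMBRES = 12
N_PLAYING_STYLES = 10
COND_DIM = N_NOTE_CLASSES + N_TIMBRES + N_PLAYING_STYLES
N_FREQ_BINS = 513        # e.g. a 1024-point STFT gives 513 magnitude bins
N_FRAMES = 128

class ConditionalDecoder(nn.Module):
    """Maps a latent code concatenated with attribute conditioning
    to a non-negative magnitude spectrogram."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + COND_DIM, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, N_FREQ_BINS * N_FRAMES),
            nn.Softplus(),  # keep magnitudes non-negative
        )

    def forward(self, z, cond):
        x = torch.cat([z, cond], dim=-1)
        return self.net(x).view(-1, N_FREQ_BINS, N_FRAMES)

def one_hot(index, size):
    v = torch.zeros(size)
    v[index] = 1.0
    return v

decoder = ConditionalDecoder()  # in practice, trained with the AE objectives

# Sample a probabilistic latent code and choose target note attributes.
z = torch.randn(1, LATENT_DIM)
cond = torch.cat([one_hot(0, N_NOTE_CLASSES),   # note class
                  one_hot(3, N_TIMBRES),        # timbre domain
                  one_hot(1, N_PLAYING_STYLES)  # playing technique
                  ]).unsqueeze(0)

with torch.no_grad():
    mag = decoder(z, cond).squeeze(0).numpy()

# Offline waveform rendering with the Griffin-Lim algorithm.
waveform = librosa.griffinlim(mag, n_iter=64, hop_length=256)
```

Because the conditioning vector is a plain input, the continuous style mixing described in the abstract would amount to interpolating between one-hot attribute vectors rather than selecting a single class; the real-time path would replace the Griffin-Lim call with the fine-tuned MCNN.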


Related Research

09/29/2018: Modulated Variational auto-Encoders for many-to-many musical timbre transfer
Generative models have been successfully applied to image style transfer...

02/27/2023: Continuous descriptor-based control for deep audio synthesis
Despite significant advances in deep models for music generation, the us...

12/17/2021: MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
Musical expression requires control of both what notes are played, and h...

08/31/2022: Sketching the Expression: Flexible Rendering of Expressive Piano Performance with Self-Supervised Learning
We propose a system for rendering a symbolic piano performance with flex...

08/04/2020: Neural Granular Sound Synthesis
Granular sound synthesis is a popular audio generation technique based o...

01/07/2022: Audio representations for deep learning in sound synthesis: A review
The rise of deep learning algorithms has led many researchers to withdra...

04/05/2017: Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Generative models in vision have seen rapid progress due to algorithmic ...
