One Billion Audio Sounds from GPU-enabled Modular Synthesis

04/27/2021
by Joseph Turian, et al.

We release synth1B1, a multi-modal audio corpus consisting of 1 billion 4-second synthesized sounds, which is 100x larger than any audio dataset in the literature. Each sound is paired with the corresponding latent parameters used to generate it. synth1B1 samples are generated deterministically and on the fly, 16,200x faster than real time (714 MHz), on a single GPU using torchsynth (https://github.com/torchsynth/torchsynth), an open-source modular synthesizer we release. Additionally, we release two new audio datasets: FM synth timbre (https://zenodo.org/record/4677102) and subtractive synth pitch (https://zenodo.org/record/4677097). Using these datasets, we demonstrate new rank-based, synthesizer-motivated evaluation criteria for existing audio representations. Finally, we propose novel approaches to synthesizer hyperparameter optimization and demonstrate how perceptually correlated auditory distances could enable new applications in synthesizer design.
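
The throughput figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not part of the paper: it assumes a 44.1 kHz sample rate, which the abstract does not state but which is consistent with the quoted 714 MHz, and the function names are hypothetical. It also estimates how long one GPU would need to render the whole corpus at the quoted speedup.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# ASSUMPTION: a 44.1 kHz sample rate; the abstract does not state it.
SAMPLE_RATE_HZ = 44_100
SPEEDUP = 16_200            # "16200x faster than real-time" (from the abstract)
NUM_SOUNDS = 1_000_000_000  # 1 billion sounds (from the abstract)
SECONDS_PER_SOUND = 4       # each sound lasts 4 seconds (from the abstract)


def samples_per_second(speedup: int, sample_rate_hz: int) -> int:
    """Audio samples produced per wall-clock second at the given speedup."""
    return speedup * sample_rate_hz


def generation_time_hours(num_sounds: int, seconds_per_sound: int, speedup: int) -> float:
    """Wall-clock hours needed to render the whole corpus at the given speedup."""
    total_audio_seconds = num_sounds * seconds_per_sound
    return total_audio_seconds / speedup / 3600


throughput_hz = samples_per_second(SPEEDUP, SAMPLE_RATE_HZ)
hours = generation_time_hours(NUM_SOUNDS, SECONDS_PER_SOUND, SPEEDUP)

print(f"{throughput_hz / 1e6:.1f} MHz")  # 714.4 MHz
print(f"{hours:.1f} hours")              # 68.6 hours
```

Under that assumption, the quoted speedup corresponds to roughly 714 million samples per second, and rendering all 1 billion sounds would take on the order of three days on a single GPU.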

