Musika! Fast Infinite Waveform Music Generation

08/18/2022
by Marco Pasini, et al.

Fast and user-controllable music generation could enable novel ways of composing or performing music. However, state-of-the-art music generation systems require large amounts of data and computational resources for training, and are slow at inference. This makes them impractical for real-time interactive use. In this work, we introduce Musika, a music generation system that can be trained on hundreds of hours of music using a single consumer GPU, and that allows for much faster than real-time generation of music of arbitrary length on a consumer CPU. We achieve this by first learning a compact invertible representation of spectrogram magnitudes and phases with adversarial autoencoders, then training a Generative Adversarial Network (GAN) on this representation for a particular music domain. A latent coordinate system enables generating arbitrarily long sequences of excerpts in parallel, while a global context vector allows the music to remain stylistically coherent through time. We perform quantitative evaluations to assess the quality of the generated samples and showcase options for user control in piano and techno music generation. We release the source code and pretrained autoencoder weights at github.com/marcoppasini/musika, such that a GAN can be trained on a new music domain with a single GPU in a matter of hours.
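To make the generation scheme concrete, the following is a minimal, hypothetical sketch in plain NumPy of the idea described above: a single global context ("style") vector is sampled once per piece, per-chunk latent coordinates are interpolated between random anchors so neighbouring excerpts vary smoothly, and each chunk can be generated independently (hence in parallel) before being concatenated along time. The dimensions, function names, and the noise-returning stand-in for the trained GAN generator are illustrative only and do not reflect the actual Musika implementation in the linked repository.

    import numpy as np

    # Hypothetical dimensions -- not taken from the paper.
    LATENT_DIM = 64   # size of one latent time-step produced by the autoencoder
    STYLE_DIM = 128   # size of the global context ("style") vector
    CHUNK_LEN = 32    # latent time-steps produced by one generator call

    def generator(style, coord, rng):
        # Stand-in for the trained GAN generator: in the real system this
        # network maps (style vector, local latent coordinate) to a chunk of
        # autoencoder latents; here it returns noise so the sketch is runnable.
        return rng.standard_normal((CHUNK_LEN, LATENT_DIM))

    def lerp(a, b, t):
        # Linear interpolation between two coordinate vectors.
        return (1.0 - t) * a + t * b

    def generate_long_sequence(num_chunks, coord_dim=16, seed=0):
        # A single style vector keeps the whole piece stylistically coherent,
        # while latent coordinates interpolated between random anchors make
        # neighbouring chunks vary smoothly. Each chunk depends only on
        # (style, coord), so chunks could be generated in parallel.
        rng = np.random.default_rng(seed)
        style = rng.standard_normal(STYLE_DIM)              # sampled once per piece
        anchors = rng.standard_normal((num_chunks + 1, coord_dim))

        chunks = []
        for i in range(num_chunks):
            coord = lerp(anchors[i], anchors[i + 1], 0.5)   # coordinate for chunk i
            chunks.append(generator(style, coord, rng))

        # Concatenate along the time axis; the frozen autoencoder decoder would
        # then map this latent sequence back to spectrograms and a waveform.
        return np.concatenate(chunks, axis=0)

    if __name__ == "__main__":
        latents = generate_long_sequence(num_chunks=8)
        print(latents.shape)  # (8 * CHUNK_LEN, LATENT_DIM)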


