Investigating Personalization Methods in Text to Music Generation

09/20/2023
by Manos Plitsis, et al.

In this work, we investigate the personalization of text-to-music diffusion models in a few-shot setting. Motivated by recent advances in the computer vision domain, we are the first to explore the combination of pre-trained text-to-audio diffusers with two established personalization methods. We experiment with the effect of audio-specific data augmentation on the overall system performance and assess different training strategies. For evaluation, we construct a novel dataset with prompts and music clips. We consider both embedding-based and music-specific metrics for quantitative evaluation, as well as a user study for qualitative evaluation. Our analysis shows that similarity metrics are in accordance with user preferences and that current personalization approaches tend to learn rhythmic music constructs more easily than melody. The code, dataset, and example material of this study are open to the research community.

