TimbreCLIP: Connecting Timbre to Text and Images

11/21/2022
by Nicolas Jonason, et al.

We present work in progress on TimbreCLIP, an audio-text cross-modal embedding trained on single-instrument notes. We evaluate the models on a cross-modal retrieval task using synth patches. Finally, we demonstrate the application of TimbreCLIP to two tasks: text-driven audio equalization and timbre-to-image generation.
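
As a rough illustration of what a cross-modal retrieval evaluation can look like in a shared embedding space, the sketch below ranks audio embeddings by cosine similarity to a text embedding and computes recall@k. It is not the authors' implementation: the function names (`retrieve`, `recall_at_k`) and the toy three-dimensional vectors are hypothetical, and in practice the embeddings would come from trained audio and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def retrieve(query, candidates, k=1):
    """Return the indices of the k candidates most similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def recall_at_k(queries, candidates, k):
    """Fraction of queries whose matching candidate (same index) is in the top k."""
    hits = sum(1 for i, q in enumerate(queries) if i in retrieve(q, candidates, k))
    return hits / len(queries)

# Made-up embeddings for illustration only: text_embeddings[i] is assumed
# to describe the sound behind audio_embeddings[i].
text_embeddings = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
audio_embeddings = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.2, 0.1, 0.9]]

top_match = retrieve(text_embeddings[0], audio_embeddings, k=1)
score = recall_at_k(text_embeddings, audio_embeddings, k=1)
```

Swapping the roles of the two lists gives the audio-to-text direction of the same evaluation.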


