Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

07/20/2022
by Daiki Takeuchi, et al.

The amount of audio data available on public websites is growing rapidly, and an efficient mechanism for accessing the desired data is necessary. We propose a content-based audio retrieval method that can retrieve target audio that is similar to, but slightly different from, the query audio by introducing auxiliary textual information that describes the difference between the query and target audio. While conventional content-based audio retrieval is limited to audio that is similar to the query audio, the proposed method can adjust the retrieval range by adding an embedding of the auxiliary text query-modifier to the embedding of the query sample audio in a shared latent space. To evaluate our method, we built a dataset in which each item comprises two different audio clips and text that describes the difference between them. The experimental results show that the proposed method retrieves the paired audio more accurately than the baseline. We also confirmed through visualization that the proposed method obtains a shared latent space in which the audio difference and the corresponding text are represented as similar embedding vectors.
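
The retrieval step described in the abstract, adding the text query-modifier embedding to the query audio embedding and searching in the shared latent space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity for ranking, and the hand-picked 2-D toy vectors are all assumptions made for illustration, and the audio and text encoders that would produce real embeddings are omitted.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity; the choice of similarity measure is an assumption."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_audio_emb, modifier_text_emb, candidates, top_k=1):
    """Rank candidate clips against (audio embedding + text modifier embedding)."""
    # Shift the audio query by the text modifier in the shared latent space.
    shifted = [q + m for q, m in zip(query_audio_emb, modifier_text_emb)]
    ranked = sorted(
        candidates,
        key=lambda name: cosine(shifted, candidates[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical toy embeddings; real ones would come from trained encoders.
candidates = {
    "rain_light": [1.0, 0.0],
    "rain_heavy": [1.0, 1.0],
    "dog_bark": [-1.0, 0.2],
}
query = [1.0, 0.0]     # stands in for the embedding of a light-rain clip
modifier = [0.0, 1.0]  # stands in for the embedding of "make the rain heavier"

print(retrieve(query, [0.0, 0.0], candidates))  # no modifier: plain similarity search
print(retrieve(query, modifier, candidates))    # modifier shifts the retrieval target
```

With a zero modifier the sketch reduces to conventional similarity search and returns the clip closest to the query; with the modifier added, the top result moves to the clip that differs from the query in the described direction.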

research
08/23/2023

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

We proposed Audio Difference Captioning (ADC) as a new extension task of...
research
10/30/2017

Content-based Representations of audio using Siamese neural networks

In this paper, we focus on the problem of content-based retrieval for au...
research
06/16/2023

Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances

This paper explores grading text-based audio retrieval relevances with c...
research
10/21/2022

Decoding a Neural Retriever's Latent Space for Query Suggestion

Neural retrieval models have superseded classic bag-of-words methods suc...
research
08/29/2023

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?

Automated Audio Captioning (AAC) aims to develop systems capable of desc...
research
02/28/2023

Audio Retrieval for Multimodal Design Documents: A New Dataset and Algorithms

We consider and propose a new problem of retrieving audio files relevant...
research
08/08/2023

Advancing Natural-Language Based Audio Retrieval with PaSST and Large Audio-Caption Data Sets

This work presents a text-to-audio-retrieval system based on pre-trained...
