On Negative Sampling for Contrastive Audio-Text Retrieval

11/08/2022
by   Huang Xie, et al.
0

This paper investigates negative sampling for contrastive learning in the context of audio-text retrieval. The strategy for negative sampling refers to selecting negatives (either audio clips or textual descriptions) from a pool of candidates for a positive audio-text pair. We explore sampling strategies via model-estimated within-modality and cross-modality relevance scores for audio and text samples. With a constant training setting on the retrieval system from [1], we study eight sampling strategies, including hard and semi-hard negative sampling. Experimental results show that retrieval performance varies dramatically among different strategies. Particularly, by selecting semi-hard negatives with cross-modality scores, the retrieval system gains improved performance in both text-to-audio and audio-to-text retrieval. Besides, we show that feature collapse occurs while sampling hard negatives with cross-modality scores.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/16/2023

Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances

This paper explores grading text-based audio retrieval relevances with c...
research
06/29/2022

Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss

In this paper, we tackle the new Language-Based Audio Retrieval task pro...
research
10/21/2022

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

Sampling proper negatives from a large document pool is vital to effecti...
research
03/25/2022

Audio-text Retrieval in Context

Audio-text retrieval based on natural language descriptions is a challen...
research
03/10/2023

Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss

In text-audio retrieval (TAR) tasks, due to the heterogeneity of content...
research
03/21/2023

ModEFormer: Modality-Preserving Embedding for Audio-Video Synchronization using Transformers

Lack of audio-video synchronization is a common problem during televisio...
research
03/21/2023

Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation

Cross-View Geo-Localisation is still a challenging task where additional...

Please sign up or login with your details

Forgot password? Click here to reset