Representation Learning for the Automatic Indexing of Sound Effects Libraries

08/18/2022
by Alison B. Ma, et al.

Labeling and maintaining a commercial sound effects library is a time-consuming task, exacerbated by databases that continually grow in size and undergo taxonomy updates. Moreover, sound search and taxonomy creation are complicated by non-uniform metadata, a persistent problem even after the introduction of a new industry standard, the Universal Category System. To address these problems and overcome dataset-dependent limitations that inhibit the successful training of deep learning models, we pursue representation learning to train generalized embeddings that can be used across a wide variety of sound effects libraries and that provide a taxonomy-agnostic representation of sound. We show that a task-specific but dataset-independent representation can successfully address data issues such as class imbalance, inconsistent class labels, and insufficient dataset size, outperforming established representations such as OpenL3. Detailed experimental results show the impact of metric learning approaches and different cross-dataset training methods on representational effectiveness.


