Deep Clustering For General-Purpose Audio Representations

10/17/2021
by Sreyan Ghosh, et al.

We introduce DECAR, a self-supervised pre-training approach for learning general-purpose audio representations. Our system is based on clustering: an offline clustering step produces pseudo-labels that serve as targets for a prediction task. We build on recent advances in self-supervised learning for computer vision and design a lightweight, easy-to-use self-supervised pre-training scheme. We pre-train DECAR embeddings on a balanced subset of the large-scale AudioSet dataset and transfer those representations to 9 downstream classification tasks, including speech, music, animal sounds, and acoustic scenes. Furthermore, we conduct ablation studies that identify key design choices, and we make all our code and pre-trained models publicly available.
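The clustering-then-prediction idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is an assumption here, and the function names (`kmeans_pseudo_labels`, `cross_entropy`), the deterministic initialization, and the toy 2-D "embeddings" are all hypothetical. In a real system the embeddings would come from an audio encoder, and the pseudo-labels would supervise that encoder through a classification loss.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_pseudo_labels(embeddings, centroids):
    """Label each embedding with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(e, centroids[k]))
        for e in embeddings
    ]

def kmeans_pseudo_labels(embeddings, num_clusters, num_iters=10):
    """Offline clustering step (assumed k-means) that yields pseudo-labels."""
    n = len(embeddings)
    # Simplistic deterministic init (an assumption): evenly spaced data points.
    centroids = [list(embeddings[i * n // num_clusters]) for i in range(num_clusters)]
    for _ in range(num_iters):
        labels = assign_pseudo_labels(embeddings, centroids)
        for k in range(num_clusters):
            members = [e for e, lab in zip(embeddings, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[k] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return assign_pseudo_labels(embeddings, centroids), centroids

def cross_entropy(logits, label):
    """Prediction-task loss for one example against its pseudo-label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Toy embeddings forming two well-separated groups.
embeddings = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
pseudo_labels, centroids = kmeans_pseudo_labels(embeddings, num_clusters=2)
print(pseudo_labels)  # [0, 0, 0, 1, 1, 1]

# Loss of an untrained (uniform) classifier on the first example.
loss = cross_entropy([0.0, 0.0], pseudo_labels[0])
```

In a clustering-based pre-training loop, a step like `kmeans_pseudo_labels` would typically be rerun periodically on fresh encoder outputs, with the resulting labels used as classification targets until the next clustering pass; the schedule and the number of clusters are design choices that this sketch does not take from the source.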

