Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision

11/14/2019
by Aren Jansen, et al.

Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and recognition that combines (i) a self-supervised objective based on a general notion of unimodal and cross-modal coincidence, (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate categories into relevant semantic classes. By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold reduction in the number of labels required to reach a desired classification performance.
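To make component (iii) concrete, here is a minimal sketch of cluster-based label solicitation. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the query rule, or the network, so this sketch substitutes plain k-means over fixed embeddings, queries the point nearest each centroid, and propagates that single label to the whole cluster. The names `kmeans`, `cluster_active_labels`, and `oracle` are hypothetical, and the two-blob data is synthetic.

```python
import numpy as np


def kmeans(x, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [x[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.linalg.norm(x[:, None] - np.array(centroids)[None], axis=2).min(axis=1)
        centroids.append(x[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        assign = np.linalg.norm(x[:, None] - centroids[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = x[assign == j].mean(axis=0)
    # Recompute assignments so they match the final centroids.
    assign = np.linalg.norm(x[:, None] - centroids[None], axis=2).argmin(axis=1)
    return centroids, assign


def cluster_active_labels(embeddings, k, oracle):
    """Request one label per cluster and propagate it to all cluster members.

    `oracle` is a hypothetical callable mapping an example index to its label,
    standing in for a human annotator.
    """
    centroids, assign = kmeans(embeddings, k)
    labels = np.full(len(embeddings), -1)
    queried = []
    for j in range(k):
        members = np.flatnonzero(assign == j)
        if members.size == 0:
            continue
        # Query the member closest to the centroid as the cluster representative.
        dist = np.linalg.norm(embeddings[members] - centroids[j], axis=1)
        rep = members[dist.argmin()]
        labels[members] = oracle(rep)
        queried.append(int(rep))
    return labels, queried


# Synthetic demo: two well-separated blobs standing in for sound embeddings.
rng = np.random.default_rng(0)
true = np.repeat([0, 1], 50)
emb = np.concatenate([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
labels, queried = cluster_active_labels(emb, k=2, oracle=lambda i: int(true[i]))
print(len(queried), "labels requested; accuracy:", (labels == true).mean())
```

The point of the sketch is the label budget: when clusters align with semantic classes, one query per cluster can label many examples at once, which is the kind of saving the abstract's label-reduction claim refers to. Real embeddings will not separate this cleanly, so propagated labels should be treated as weak supervision.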

