MetaAudio: A Few-Shot Audio Classification Benchmark

04/05/2022
by Calum Heggan, et al.

Currently available benchmarks for few-shot learning (machine learning with few training examples) are limited in the domains they cover, primarily focusing on image classification. This work aims to alleviate this reliance on image-based benchmarks by offering the first comprehensive, public and fully reproducible audio-based alternative, covering a variety of sound domains and experimental settings. We compare the few-shot classification performance of a variety of techniques on seven audio datasets (spanning environmental sounds to human speech). Extending this, we carry out in-depth analyses of joint training (where all datasets are used during training) and cross-dataset adaptation protocols, establishing the possibility of a generalised audio few-shot classification algorithm. Our experimentation shows that gradient-based meta-learning methods such as MAML and Meta-Curvature consistently outperform both metric and baseline methods. We also demonstrate that the joint training routine helps overall generalisation for the environmental sound databases included, and is a somewhat effective method of tackling the cross-dataset/domain setting.
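For readers unfamiliar with how few-shot classification is usually evaluated, the sketch below illustrates the common N-way K-shot episode setup: a handful of classes is drawn, with a few labelled "support" clips and some held-out "query" clips per class. It is a minimal illustration only, not the benchmark's own code. The function name `sample_episode`, the toy label list, and the 5-way 1-shot setting are all assumptions made for this example and are not taken from the abstract.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=1, n_query=2, seed=None):
    """Draw one N-way K-shot episode from a list of per-clip labels.

    Returns (support, query), each a list of (clip_index, label) pairs.
    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)

    # Group clip indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough clips can fill both support and query sets.
    eligible = sorted(
        c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query
    )
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        # Sample without replacement so support and query never overlap.
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


# Toy data: 6 hypothetical sound classes with 4 clips each.
toy_labels = [f"class_{c}" for c in range(6) for _ in range(4)]
support, query = sample_episode(toy_labels, n_way=5, k_shot=1, n_query=2, seed=0)
```

A model is adapted on `support` and scored on `query`; averaging accuracy over many such episodes gives a few-shot classification score of the kind the abstract refers to.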


