A Study of Few-Shot Audio Classification

12/02/2020
by Piper Wolters, et al.

Advances in deep learning have resulted in state-of-the-art performance for many audio classification tasks but, unlike humans, these systems traditionally require large amounts of data to make accurate predictions. Not every person or organization has access to those resources, and the organizations that do, like our field at large, do not reflect the demographics of our country. Enabling people to use machine learning without significant resource hurdles is important, because machine learning is an increasingly useful tool for solving problems, and can solve a broader set of problems when put in the hands of a broader set of people. Few-shot learning is a type of machine learning designed to enable the model to generalize to new classes with very few examples. In this research, we address two audio classification tasks (speaker identification and activity classification) with the Prototypical Network few-shot learning algorithm, and assess the performance of various encoder architectures. Our encoders include recurrent neural networks, as well as one- and two-dimensional convolutional neural networks. We evaluate our model for speaker identification on the VoxCeleb dataset and ICSI Meeting Corpus, obtaining 5-shot 5-way accuracies of 93.5% [...] evaluate for activity classification from audio using few-shot subsets of the Kinetics 600 dataset and AudioSet, both drawn from YouTube videos, obtaining 51.5% [...]
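For readers unfamiliar with the setup, the following is a minimal sketch of how a Prototypical Network classifies within a single 5-way 5-shot episode: each class prototype is the mean of that class's support embeddings, and a query is assigned to the nearest prototype via a softmax over negative squared Euclidean distances. It is not the authors' implementation. The encoder is omitted entirely, the embeddings are synthetic stand-ins, and every name here (`prototypes`, `class_probabilities`, `classify`, `toy_episode`) is hypothetical.

```python
import math
import random


def prototypes(support):
    """Average the support embeddings of each class into one prototype.

    `support` maps a class label to a list of embedding vectors (lists of floats).
    """
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_probabilities(query, protos):
    """Softmax over negative squared distances from the query to each prototype."""
    logits = {label: -squared_distance(query, p) for label, p in protos.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(z - top) for label, z in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def classify(query, protos):
    """Return the label of the most probable (nearest) prototype."""
    probs = class_probabilities(query, protos)
    return max(probs, key=probs.get)


def toy_episode(n_way=5, k_shot=5, dim=8, seed=0):
    """Build a synthetic episode (assumption: stand-in for real encoder outputs)."""
    rng = random.Random(seed)
    centers = {c: [rng.gauss(0, 5) for _ in range(dim)] for c in range(n_way)}

    def sample(c):
        # One noisy embedding drawn around the class center.
        return [x + rng.gauss(0, 0.1) for x in centers[c]]

    support = {c: [sample(c) for _ in range(k_shot)] for c in range(n_way)}
    queries = [(c, sample(c)) for c in range(n_way) for _ in range(3)]
    return support, queries


support, queries = toy_episode()
protos = prototypes(support)
correct = sum(classify(q, protos) == label for label, q in queries)
accuracy = correct / len(queries)
print(f"episode accuracy: {accuracy:.2f}")
```

In practice the vectors would come from a trained encoder, such as the recurrent or convolutional networks the abstract mentions, and accuracy would be averaged over many randomly sampled episodes rather than computed on one.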

