CNN Architectures for Large-Scale Audio Classification

09/29/2016
by   Shawn Hershey, et al.
0

Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels. We examine fully connected Deep Neural Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We investigate varying the size of both training set and label vocabulary, finding that analogs of the CNNs used in image classification do well on our audio classification task, and larger training and label sets help up to a point. A model using embeddings from these classifiers does much better than raw features on the Audio Set [5] Acoustic Event Detection (AED) classification task.

READ FULL TEXT
research
11/15/2019

Cross-modal supervised learning for better acoustic representations

Obtaining large-scale human-labeled datasets to train acoustic represent...
research
05/11/2023

OneCAD: One Classifier for All image Datasets using multimodal learning

Vision-Transformers (ViTs) and Convolutional neural networks (CNNs) are ...
research
11/01/2017

Reducing Model Complexity for DNN Based Large-Scale Audio Classification

Audio classification is the task of identifying the sound categories tha...
research
05/26/2015

An Empirical Evaluation of Current Convolutional Architectures' Ability to Manage Nuisance Location and Scale Variability

We conduct an empirical study to test the ability of Convolutional Neura...
research
07/10/2021

Variational Information Bottleneck for Effective Low-resource Audio Classification

Large-scale deep neural networks (DNNs) such as convolutional neural net...
research
07/16/2020

Camera Bias in a Fine Grained Classification Task

We show that correlations between the camera used to acquire an image an...
research
12/20/2016

Exploring the Design Space of Deep Convolutional Neural Networks at Large Scale

In recent years, the research community has discovered that deep neural ...

Please sign up or login with your details

Forgot password? Click here to reset