Rethinking CNN Models for Audio Classification

07/22/2020
by Kamalesh Palanisamy, et al.

In this paper, we show that standard deep CNN models pretrained on ImageNet can serve as strong baseline networks for audio classification. Even though audio spectrograms differ significantly from standard ImageNet image samples, the transfer learning assumptions still hold firmly. To understand what enables ImageNet-pretrained models to learn useful audio representations, we systematically study how much of the pretrained weights is useful for learning spectrograms. We show (1) that, for a given standard model, using pretrained weights is better than using randomly initialized weights, and (2) qualitative results of what the CNNs learn from the spectrograms, obtained by visualizing the gradients. We also show that, even when the pretrained weights are used for initialization, performance varies across runs of the same model. This variance is due to the random initialization of the linear classification layer and the random mini-batch ordering in each run. It brings significant diversity, which can be used to build stronger ensemble models with an overall improvement in accuracy. An ensemble of ImageNet-pretrained DenseNet models achieves 92.89% validation accuracy on the ESC-50 dataset and 87.42% validation accuracy on the UrbanSound8K dataset, which is the current state of the art on both of these datasets.
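The ensembling idea in the abstract can be illustrated with a minimal sketch. It assumes that the runs are combined by averaging their per-class softmax probabilities, which is one common choice; the abstract does not state the combination rule. The function names (`softmax`, `ensemble_predict`, `accuracy`) and the toy logits in `RUNS` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn one row of logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(run_logits: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over runs; return the argmax per sample.

    run_logits[r][i][c] is the logit of class c for sample i in run r.
    Probability averaging is an assumption of this sketch.
    """
    n_runs = len(run_logits)
    n_samples = len(run_logits[0])
    predictions = []
    for i in range(n_samples):
        probs = [softmax(run[i]) for run in run_logits]
        n_classes = len(probs[0])
        mean = [sum(p[c] for p in probs) / n_runs for c in range(n_classes)]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy logits standing in for three runs of the same model (hypothetical data).
# Each run misclassifies a different sample, mimicking run-to-run variance.
LABELS = [0, 1, 0]
RUNS = [
    [[2.0, 0.0], [0.0, 2.0], [0.0, 0.5]],  # run A: wrong on sample 2
    [[2.0, 0.0], [0.5, 0.0], [2.0, 0.0]],  # run B: wrong on sample 1
    [[0.0, 0.5], [0.0, 2.0], [2.0, 0.0]],  # run C: wrong on sample 0
]

if __name__ == "__main__":
    for name, run in zip("ABC", RUNS):
        print(name, accuracy(ensemble_predict([run]), LABELS))
    print("ensemble", accuracy(ensemble_predict(RUNS), LABELS))
```

In this toy setup each run makes a different low-confidence error, so the averaged probabilities recover the correct class for every sample. Runs that all fail on the same samples would not benefit in the same way, which is why run-to-run diversity matters for the ensemble.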

