Unsupervised Discriminative Learning of Sounds for Audio Event Classification

by Sascha Hornauer, et al.

Recent progress in network-based audio event classification has shown the benefit of pre-training models on visual data such as ImageNet. While this process allows knowledge transfer across different domains, training a model on large-scale visual datasets is time-consuming. On several audio event classification benchmarks, we show a fast and effective alternative that pre-trains the model without supervision, on audio data only, and yet delivers performance on par with ImageNet pre-training. Furthermore, we show that our discriminative audio learning can be used to transfer knowledge across audio datasets and can optionally be combined with ImageNet pre-training.


