Knowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes

11/04/2017
by Anurag Kumar, et al.

In this work we propose approaches to effectively transfer knowledge from weakly labeled web audio data. We first describe a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data. Our model trains efficiently from audio recordings of variable length; hence, it is well suited for transfer learning. We then propose methods to learn representations using this model that can be effectively used for solving the target task. We study both transductive and inductive transfer learning tasks, showing the effectiveness of our methods for both domain and task adaptation. We show that even off-the-shelf representations from the proposed CNN model generalize well enough to reach human-level accuracy on the ESC-50 sound events dataset. We further use them for the acoustic scene classification task and again show that our proposed approaches are well suited to it. Moreover, we show that our methods help capture semantic meanings and relations.
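
The abstract does not say how recording-level (weak) labels are reconciled with recordings of different lengths. One common way to do this, shown here only as a generic illustration and not as the authors' architecture, is to score each short segment of a recording and then pool the segment scores into a single recording-level score per class. The function name `pool_segment_scores`, the two pooling modes, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def pool_segment_scores(
    segment_scores: Sequence[Sequence[float]], mode: str = "max"
) -> List[float]:
    """Collapse per-segment class scores into one score per class.

    segment_scores has shape (num_segments, num_classes). The number of
    segments may differ between recordings; the output length depends
    only on the number of classes.
    """
    if not segment_scores:
        raise ValueError("need at least one segment")
    num_classes = len(segment_scores[0])
    # Regroup the scores by class so each class can be pooled over time.
    per_class = [[seg[c] for seg in segment_scores] for c in range(num_classes)]
    if mode == "max":
        # A class counts as present if any single segment scores it highly.
        return [max(scores) for scores in per_class]
    if mode == "mean":
        return [sum(scores) / len(scores) for scores in per_class]
    raise ValueError(f"unknown pooling mode: {mode}")


# Hypothetical segment scores for two recordings of different lengths
# (two classes each), standing in for the output of a segment-level model.
short_clip = [[0.1, 0.8], [0.2, 0.3]]
long_clip = [[0.1, 0.1], [0.9, 0.2], [0.3, 0.1], [0.2, 0.4]]

short_pred = pool_segment_scores(short_clip)
long_pred = pool_segment_scores(long_clip)
```

Both recordings yield a prediction vector of the same size, so a single recording-level label can supervise either one. Max pooling assumes an event is present if it is detected anywhere in the recording; mean pooling would instead favor events that persist across segments.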
