Surrey-cvssp system for DCASE2017 challenge task4

09/02/2017

In this technical report, we present a set of methods for Task 4 of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge. This task evaluates systems for the large-scale detection of sound events using weakly labeled training data. The data are YouTube video excerpts focusing on transportation and warnings, chosen for their industry applications. There are two subtasks: audio tagging and sound event detection from weakly labeled data. A convolutional neural network (CNN) and a gated recurrent unit (GRU) based recurrent neural network (RNN) are adopted as our basic framework. We propose a learnable gating activation function for selecting informative local features. An attention-based scheme is used to localize specific events in a weakly supervised mode. A new batch-level balancing strategy is also proposed to tackle the data imbalance problem. Fusion of posteriors from different systems is found to be effective in improving performance. In summary, we get 61 [...] sound event detection subtask on the development set, while the official multilayer perceptron (MLP) based baseline obtained just 13.1 [...] audio tagging and 1.02 for sound event detection.
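To make the gating and attention ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the learnable gate is a sigmoid branch multiplied element-wise with a linear branch, and that attention turns frame-level class probabilities into one clip-level probability per class by a normalized weighted average. The function names, layer sizes, frame count, and random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_linear_unit(x, w_lin, w_gate):
    """Assumed gating form: a linear branch scaled by a learned sigmoid gate.

    x: (frames, in_dim); w_lin, w_gate: (in_dim, out_dim).
    Gate values near 0 suppress a feature, values near 1 pass it through.
    """
    return (x @ w_lin) * sigmoid(x @ w_gate)


def attention_pooling(cla, att, eps=1e-7):
    """Pool frame-level outputs into clip-level probabilities.

    cla: (frames, classes) frame-level class probabilities.
    att: (frames, classes) non-negative attention scores.
    Returns the clip-level probabilities and the normalized weights; the
    weights indicate which frames contributed most to each class.
    """
    weights = att / (att.sum(axis=0, keepdims=True) + eps)
    clip_prob = (cla * weights).sum(axis=0)
    return clip_prob, weights


# Hypothetical sizes: 240 frames, 64 input features, 32 hidden units, 17 classes.
rng = np.random.default_rng(0)
frames, in_dim, hidden, n_classes = 240, 64, 32, 17

x = rng.standard_normal((frames, in_dim))
h = gated_linear_unit(
    x,
    0.1 * rng.standard_normal((in_dim, hidden)),
    0.1 * rng.standard_normal((in_dim, hidden)),
)
cla = sigmoid(h @ (0.1 * rng.standard_normal((hidden, n_classes))))
att = sigmoid(h @ (0.1 * rng.standard_normal((hidden, n_classes))))

clip_prob, weights = attention_pooling(cla, att)
```

Under these assumptions, `clip_prob` can be trained against the clip-level (weak) tags, while `weights` together with `cla` gives a per-frame view that could be thresholded to estimate when an event occurs.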
