Surrey-CVSSP system for DCASE2017 challenge task4

09/02/2017
by   Yong Xu, et al.

In this technical report, we present several methods for task 4 of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge. This task evaluates systems for the large-scale detection of sound events using weakly labeled training data. The data are YouTube video excerpts focusing on transportation and warnings, chosen for their industry applications. There are two subtasks: audio tagging and sound event detection from weakly labeled data. A convolutional neural network (CNN) and a gated recurrent unit (GRU) based recurrent neural network (RNN) are adopted as our basic framework. We propose a learnable gating activation function for selecting informative local features. An attention-based scheme is used to localize specific events in a weakly supervised mode. A new batch-level balancing strategy is also proposed to tackle the data imbalance problem. Fusion of posteriors from different systems is found to be effective in improving performance. In summary, we get 61 [...] sound event detection subtask on the development set, while the official multilayer perceptron (MLP) based baseline obtained just 13.1 [...] audio tagging and 1.02 for sound event detection.
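Two of the ideas named in the abstract, a gating activation and the fusion of posteriors, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the form of the gate or the fusion rule, so the sketch assumes a sigmoid gate multiplied element-wise with a linear feature and an unweighted mean over systems. The function names `gated_activation` and `fuse_posteriors` are hypothetical.

```python
from __future__ import annotations

import math


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_activation(linear_out: list[float], gate_out: list[float]) -> list[float]:
    """Scale each feature by a sigmoid gate in (0, 1).

    In a network, both vectors would be produced by two parallel learnable
    layers applied to the same input (an assumption, not stated in the abstract).
    """
    if len(linear_out) != len(gate_out):
        raise ValueError("feature and gate vectors must have the same length")
    return [h * sigmoid(g) for h, g in zip(linear_out, gate_out)]


def fuse_posteriors(system_posteriors: list[list[float]]) -> list[float]:
    """Fuse per-class posteriors from several systems by an unweighted mean.

    The averaging rule is an assumption; the abstract only says that
    posteriors from different systems are fused.
    """
    if not system_posteriors:
        raise ValueError("need at least one system")
    n_classes = len(system_posteriors[0])
    if any(len(p) != n_classes for p in system_posteriors):
        raise ValueError("all systems must score the same number of classes")
    n_systems = len(system_posteriors)
    return [sum(p[c] for p in system_posteriors) / n_systems for c in range(n_classes)]


# A strongly negative gate suppresses a feature, a strongly positive gate passes it.
gated = gated_activation([2.0, 2.0, 2.0], [-20.0, 0.0, 20.0])

# Two hypothetical systems scoring the same two classes for one clip.
fused = fuse_posteriors([[0.9, 0.2], [0.7, 0.4]])
```

Because the gate lies strictly between 0 and 1, it acts as a soft selector: features whose gate is near 0 contribute almost nothing to later layers, which is one way to read "selecting informative local features".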

