A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

10/22/2018
by Yun Wang, et al.

Sound event detection (SED) entails two subtasks: recognizing what types of sound events are present in an audio stream (audio tagging), and pinpointing their onset and offset times (localization). In the popular multiple instance learning (MIL) framework for SED with weak labeling, the pooling function is an important component. This paper compares five types of pooling functions both theoretically and experimentally, with a special focus on their localization performance. Although the attention pooling function is currently receiving the most attention, we find that the linear softmax pooling function performs best among the five. Using this pooling function, we build a neural network called TALNet. It is the first system to reach state-of-the-art audio tagging performance on Audio Set while also exhibiting strong localization performance on the DCASE 2017 challenge.
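To make the role of the pooling function concrete, the following is a minimal sketch of how frame-level probabilities for one event class might be aggregated into a single recording-level probability. The abstract names only the attention and linear softmax pooling functions; the max, average, and exponential softmax functions shown here are common MIL choices and are assumed, not confirmed by the text above, to be the other three compared. All function names, the `eps` guard, and the example values are hypothetical and are not taken from TALNet.

```python
import math

# Each function maps frame-level probabilities y (one event class, one
# recording) to a single recording-level probability. Illustrative only.

def max_pool(y):
    # The recording-level probability is that of the most confident frame.
    return max(y)

def average_pool(y):
    # Every frame contributes equally.
    return sum(y) / len(y)

def linear_softmax_pool(y, eps=1e-7):
    # Each frame is weighted by its own probability: sum(y_i^2) / sum(y_i).
    # eps (an assumption of this sketch) guards against division by zero.
    return sum(v * v for v in y) / (sum(y) + eps)

def exp_softmax_pool(y):
    # Each frame is weighted by exp(y_i).
    weights = [math.exp(v) for v in y]
    return sum(v * w for v, w in zip(y, weights)) / sum(weights)

def attention_pool(y, w):
    # Weights w would come from a separate learned branch in a real model;
    # here they are passed in directly as non-negative numbers.
    return sum(v * wi for v, wi in zip(y, w)) / sum(w)

frame_probs = [0.1, 0.2, 0.9, 0.8]  # hypothetical frame-level outputs
recording_prob = linear_softmax_pool(frame_probs)
```

With uniform weights, `attention_pool` reduces to `average_pool`; the weighted variants differ from each other only in how much influence high-probability frames receive.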


research
04/03/2018

Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks

Many sequence learning tasks require the localization of certain events ...
research
10/22/2018

Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling

Research on sound event detection (SED) with weak labeling has mostly fo...
research
10/20/2020

Power pooling: An adaptive pooling function for weakly labelled sound event detection

Access to large corpora with strongly labelled sound events is expensive...
research
02/03/2021

A Global-local Attention Framework for Weakly Labelled Audio Tagging

Weakly labelled audio tagging aims to predict the classes of sound event...
research
09/26/2022

Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

Many state-of-the-art systems for audio tagging and sound event detectio...
research
05/24/2019

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection

Sound event detection (SED) is to recognize the presence of sound events...
research
04/26/2018

Adaptive pooling operators for weakly labeled sound event detection

Sound event detection (SED) methods are tasked with labeling segments of...
