Cosine-similarity penalty to discriminate sound classes in weakly-supervised sound event detection

01/10/2019
by Thomas Pellegrini, et al.

Designing new methods and models for settings where only weakly-labeled data are available is of paramount importance for reducing the cost of manual annotation and the considerable human effort associated with it. In this work, we address Sound Event Detection in the case where a weakly annotated dataset is available for training. The weak annotations provide tags of audio events but do not provide temporal boundaries. The objective is twofold: 1) audio tagging, i.e., multi-label classification at recording level, and 2) sound event detection, i.e., localization of the event boundaries within the recordings. This work focuses mainly on the second objective. We explore an approach inspired by Multiple Instance Learning, in which we train a convolutional recurrent neural network to give predictions at frame level, using a custom loss function based on the weak labels and the statistics of the frame-based predictions. Since some sound classes cannot be distinguished with this approach, we improve the method by penalizing similarity between the predictions of the positive classes during training. On the test set used in the DCASE 2018 challenge, consisting of 288 recordings and 10 sound classes, the addition of a penalty resulted in a localization F-score of 33.42% and brought an improvement of more than 25%. The approach also outperformed a false strong labeling baseline and an attention-based model.
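
To make the idea of penalizing similarity between the predictions of the positive classes more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the exact formula, so the function name `cosine_similarity_penalty`, the array shapes, and the choice to average over all pairs of positive classes are hypothetical. The sketch assumes that frame-level predictions for one recording are stored as an array of shape (n_frames, n_classes) and that the weak labels are a binary vector marking the classes tagged as present.

```python
import numpy as np


def cosine_similarity_penalty(frame_probs, weak_labels, eps=1e-8):
    """Mean pairwise cosine similarity between the temporal prediction
    curves of the classes tagged as present in one recording.

    frame_probs: shape (n_frames, n_classes), frame-level predictions
    weak_labels: shape (n_classes,), 1 if the class is tagged, else 0
    (Names, shapes, and the averaging scheme are assumptions.)
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    positive = np.flatnonzero(weak_labels)
    if positive.size < 2:
        # With fewer than two positive classes there is no pair to compare.
        return 0.0

    # One row per positive class: its prediction curve over time.
    curves = frame_probs[:, positive].T
    norms = np.linalg.norm(curves, axis=1, keepdims=True)
    unit = curves / np.maximum(norms, eps)

    # Cosine similarity matrix between all positive-class curves.
    similarity = unit @ unit.T

    # Average over distinct pairs only (strict upper triangle).
    rows, cols = np.triu_indices(positive.size, k=1)
    return float(similarity[rows, cols].mean())


# Two tagged classes with identical curves are maximally penalized,
# while curves that are active in different frames are not penalized.
overlapping = np.array([[0.9, 0.9], [0.1, 0.1]])
separated = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cosine_similarity_penalty(overlapping, [1, 1]))
print(cosine_similarity_penalty(separated, [1, 1]))
```

In a training setup, a term like this would typically be added to the weak-label loss with a weighting coefficient, so that classes tagged in the same recording are pushed toward being active in different frames. The weighting and the way the authors combine the two terms are not specified in the abstract.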

research
10/01/2017

Large-scale weakly supervised audio classification using gated convolutional neural network

In this paper, we present a gated convolutional neural network and a tem...
research
10/05/2021

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

Sound event detection (SED) has gained increasing attention with its wid...
research
02/20/2020

Multi-label Sound Event Retrieval Using a Deep Learning-based Siamese Structure with a Pairwise Presence Matrix

Realistic recordings of soundscapes often have multiple sound events co-...
research
10/22/2018

Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling

Research on sound event detection (SED) with weak labeling has mostly fo...
research
02/12/2020

Active Learning for Sound Event Detection

This paper proposes an active learning system for sound event detection ...
research
02/15/2023

Unsupervised classification to improve the quality of a bird song recording dataset

Open audio databases such as Xeno-Canto are widely used to build dataset...
research
12/02/2019

Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

We tackle the task of environmental event classification by drawing insp...