DiffSED: Sound Event Detection with Denoising Diffusion

08/14/2023
by   Swapnil Bhosale, et al.

Sound Event Detection (SED) aims to predict the temporal boundaries of all the events of interest and their class labels, given an unconstrained audio sample. Taking either the split-and-classify (i.e., frame-level) strategy or the more principled event-level modeling approach, all existing methods consider the SED problem from the discriminative learning perspective. In this work, we reformulate the SED problem by taking a generative learning perspective. Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process, conditioned on a target audio sample. During training, our model learns to reverse the noising process by converting noisy latent queries to their ground-truth versions in the elegant Transformer decoder framework. Doing so enables the model to generate accurate event boundaries even from noisy queries during inference. Extensive experiments on the Urban-SED and EPIC-Sounds datasets demonstrate that our model significantly outperforms existing alternatives, with 40+
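
To make the "noising process" in the abstract concrete, the following is a minimal sketch of a standard DDPM-style forward step applied to event boundaries. It is not the authors' implementation: the cosine schedule, the function names `alpha_bar` and `noise_boundaries`, and the choice to noise (onset, offset) pairs directly rather than latent queries are all assumptions made for illustration.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal rate of a cosine noise schedule (an assumed choice)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def noise_boundaries(events, t, num_steps=1000, rng=None):
    """Forward-diffuse (onset, offset) pairs given as fractions of clip length.

    Each coordinate x_0 becomes sqrt(a) * x_0 + sqrt(1 - a) * eps with
    eps ~ N(0, 1) and a = alpha_bar(t). Hypothetical helper, not from the paper.
    """
    rng = rng or random.Random()
    a = alpha_bar(t, num_steps)
    noisy = []
    for onset, offset in events:
        pair = []
        for x in (onset, offset):
            eps = rng.gauss(0.0, 1.0)  # fresh Gaussian noise per coordinate
            pair.append(math.sqrt(a) * x + math.sqrt(1 - a) * eps)
        noisy.append(tuple(pair))
    return noisy


# Two hypothetical ground-truth events in a clip, as (onset, offset) fractions.
events = [(0.1, 0.3), (0.5, 0.9)]
print(noise_boundaries(events, t=0))    # no noise at step 0
print(noise_boundaries(events, t=900))  # close to pure Gaussian noise
```

In this framing, a denoiser would be trained to map the noisy pairs back to `events` while conditioned on audio features, and inference would start from random proposals and denoise them step by step.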

research
03/27/2023

DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion

We propose a new formulation of temporal action detection (TAD) with den...
research
10/05/2021

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

Sound event detection (SED) has gained increasing attention with its wid...
research
10/18/2022

A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4

In this paper, we describe in detail our system for DCASE 2022 Task4. Th...
research
05/22/2023

DiffusionNER: Boundary Diffusion for Named Entity Recognition

In this paper, we propose DiffusionNER, which formulates the named entit...
research
08/22/2023

Furnishing Sound Event Detection with Language Model Abilities

Recently, the ability of language models (LMs) has attracted increasing ...
research
08/20/2018

A simple model for detection of rare sound events

We propose a simple recurrent model for detecting rare sound events, whe...
research
10/22/2018

Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling

Research on sound event detection (SED) with weak labeling has mostly fo...
