Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events

12/27/2017
by Phuong Pham, et al.

In this paper, we introduce the concept of Eventness for audio event detection, which can, in part, be thought of as an analogue of Objectness from computer vision. The key observation behind the eventness concept is that audio events reveal themselves in spectrograms as 2-dimensional time-frequency patterns with specific textures and geometric structures. These time-frequency patterns can then be viewed analogously to objects occurring in natural images (with the exception that scaling and rotation invariance properties do not apply). With this observation in mind, we pose the problem of detecting monophonic or polyphonic audio events as an equivalent visual object detection problem under partial occlusion and clutter in spectrograms. We adapt a state-of-the-art visual object detection model and evaluate it on the audio event detection task using publicly available datasets. The proposed network achieves results comparable to a state-of-the-art baseline and is more robust on minority events. Given large-scale datasets, we hope that our proposed conceptual model of eventness will help the audio signal processing community improve the performance of audio event detection.
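To make the spectrogram-as-image idea concrete, here is a minimal sketch (not the authors' pipeline) of how a waveform might be turned into a log-magnitude spectrogram, and how a detector's bounding box in spectrogram pixels could be mapped back to an onset, an offset, and a frequency band. The function names `log_spectrogram` and `box_to_event`, the STFT settings (`n_fft=512`, `hop=256`), and the box layout `(x_min, y_min, x_max, y_max)` are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=256):
    """Log-magnitude STFT with shape (n_fft // 2 + 1, n_frames).

    Hypothetical helper: the window, FFT size, and hop are assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # rfft over each frame; transpose so rows are frequency, columns are time.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)

def box_to_event(box, sample_rate, hop, n_fft):
    """Map a box (x_min, y_min, x_max, y_max) in spectrogram pixels to
    (onset_s, offset_s, f_low_hz, f_high_hz).

    Assumes x indexes STFT frames and y indexes frequency bins from 0 Hz.
    """
    x_min, y_min, x_max, y_max = box
    onset = x_min * hop / sample_rate
    offset = x_max * hop / sample_rate
    hz_per_bin = sample_rate / n_fft
    return onset, offset, y_min * hz_per_bin, y_max * hz_per_bin

# Synthetic example: a 1 kHz tone burst between 0.25 s and 0.5 s.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.where((t >= 0.25) & (t < 0.5), np.sin(2 * np.pi * 1000 * t), 0.0)

spec = log_spectrogram(signal)
# A hand-written box standing in for a detector output (not a real prediction).
event = box_to_event((16, 24, 32, 40), sample_rate, hop=256, n_fft=512)
```

Under these assumptions the temporal localization of an event is simply the horizontal extent of its box, which is what lets an off-the-shelf object detector serve as an audio event detector.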


