AVECL-UMONS database for audio-visual event classification and localization

10/02/2020
by   Mathilde Brousmiche, et al.

We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments. The audio-visual dataset covers 11 event classes recorded at several realistic positions in two different rooms. Sequences are of two types, depending on the number of events they contain: the dataset comprises 2662 unilabel sequences and 2724 multilabel sequences, for a total of 5.24 hours. The dataset is publicly accessible online: https://zenodo.org/record/3965492#.X09wsobgrCI.
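As a back-of-the-envelope check on the quoted statistics, the sequence counts and total duration imply fairly short clips. A minimal sketch (using only the numbers stated in the abstract):

```python
# Sanity check on the dataset statistics quoted above:
# 2662 unilabel + 2724 multilabel sequences, 5.24 hours in total.
UNILABEL = 2662
MULTILABEL = 2724
TOTAL_HOURS = 5.24

total_sequences = UNILABEL + MULTILABEL  # 5386 sequences
avg_seconds = TOTAL_HOURS * 3600 / total_sequences
print(f"{total_sequences} sequences, ~{avg_seconds:.1f} s each on average")
```

So each sequence averages roughly 3.5 seconds, consistent with short, isolated office events.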
