SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

by Mark Cartwright, et al.

We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. The dataset is intended for the development and evaluation of machine listening systems for real-world urban noise monitoring. While other datasets of urban recordings are available, this one provides the opportunity to investigate how spatiotemporal metadata can aid the prediction of urban sound tags. SONYC-UST-V2 consists of 18510 audio recordings from the "Sounds of New York City" (SONYC) acoustic sensor network, each accompanied by the timestamp of audio acquisition and the location of the sensor. The dataset contains annotations by volunteers from the Zooniverse citizen science platform, as well as a two-stage verification by our team. In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags. We also report the results of a simple baseline model that exploits spatiotemporal information.



CRNNs for Urban Sound Tagging with spatiotemporal context

This paper describes CRNNs we used to participate in Task 5 of the DCASE...

A Strongly-Labelled Polyphonic Dataset of Urban Sounds with Spatiotemporal Context

This paper introduces SINGA:PURA, a strongly labelled polyphonic urban s...

City-Identification of Flickr Videos Using Semantic Acoustic Features

City-identification of videos aims to determine the likelihood of a vide...

Quantifying the presence of graffiti in urban environments

Graffiti is a common phenomenon in urban scenarios. Differently from urb...

Urban Sound Tagging using Convolutional Neural Networks

In this paper, we propose a framework for environmental sound classifica...

Acoustic Sounds for Wellbeing: A Novel Dataset and Baseline Results

The field of sound healing includes ancient practices coming from a broa...
