HumBugDB: A Large-scale Acoustic Mosquito Dataset

10/14/2021
by Ivan Kiskin, et al.

This paper presents the first large-scale multi-species dataset of acoustic recordings of mosquitoes tracked continuously in free flight. We present 20 hours of audio recordings that we have expertly labelled and tagged precisely in time. Significantly, 18 hours of recordings contain annotations from 36 different species. Mosquitoes are well-known carriers of diseases such as malaria, dengue and yellow fever. The dataset is motivated by the need to support applications that use mosquito acoustics to conduct surveys, which in turn help predict outbreaks and inform intervention policy. Detecting mosquitoes from the sound of their wingbeats is challenging because recordings from realistic scenarios are difficult to collect. To address this, as part of the HumBug project, we conducted global experiments to record mosquitoes ranging from those bred in culture cages to mosquitoes captured in the wild. Consequently, the audio recordings vary in signal-to-noise ratio and contain a broad range of indoor and outdoor background environments from Tanzania, Thailand, Kenya, the USA and the UK. In this paper we describe in detail how we collected, labelled and curated the data. The data is provided from a PostgreSQL database, which contains important metadata such as the capture method, age, feeding status and gender of the mosquitoes. Additionally, we provide code to extract features and train Bayesian convolutional neural networks for two key tasks: the identification of mosquitoes from their corresponding background environments, and the classification of detected mosquitoes into species. Our extensive dataset is both challenging for machine learning researchers focusing on acoustic identification and critical for entomologists, geo-spatial modellers and other domain experts who seek to understand mosquito behaviour, model mosquito distribution, and manage the threat mosquitoes pose to humans.
