Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers

08/14/2023
by Lukas Rauch, et al.

We propose a shift towards end-to-end learning in bird sound monitoring by combining self-supervised learning (SSL) and deep active learning (DAL). Leveraging transformer models, we aim to bypass the traditional conversion to spectrograms and process raw audio directly. ActiveBird2Vec is set to generate high-quality bird sound representations through SSL, potentially accelerating the assessment of environmental changes and decision-making processes for wind farms. Additionally, we seek to exploit the wide variety of bird vocalizations through DAL, reducing the reliance on datasets extensively labeled by human experts. We plan to curate a comprehensive set of tasks through Huggingface Datasets, enhancing the future comparability and reproducibility of bioacoustic research. We will conduct a comparative analysis of various transformer models to evaluate their proficiency in bird sound recognition tasks. We aim to accelerate the progress of avian bioacoustic research and contribute to more effective conservation strategies.
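To illustrate how DAL can reduce the labeling effort, the following is a minimal sketch of one query step using entropy-based uncertainty sampling. The abstract does not name a query strategy, so the choice of entropy is an assumption made here for illustration only. The function names `predictive_entropy` and `select_queries` and the toy probabilities are hypothetical and stand in for the class predictions a model would produce for unlabeled recordings.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_queries(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest entropy.

    Assumption: uncertainty sampling; the source does not specify a strategy.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical model outputs for four unlabeled clips over three species.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, highest entropy
]

# Clips chosen to be sent to a human expert for labeling.
queries = select_queries(pool_probs, budget=2)
print(queries)
```

In such a loop, the selected clips would be labeled by an expert, added to the training set, and the model retrained before the next query round.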

