Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

11/01/2019
by Vincent Lostanlen, et al.

This paper proposes to perform unsupervised detection of bioacoustic events by pooling the magnitudes of spectrogram frames after per-channel energy normalization (PCEN). Although PCEN was originally developed for speech recognition, it also enhances animal vocalizations, even in the presence of atmospheric absorption and intermittent noise. We prove that PCEN generalizes logarithm-based spectral flux, with the addition of a tunable time scale for background noise estimation. In comparison with the pointwise logarithm, PCEN reduces the false alarm rate by 50x in the near field and 5x in the far field, on both avian and marine bioacoustic datasets. These improvements come at moderate computational cost and require no human intervention, which makes PCEN a promising tool for bioacoustics.

research
09/24/2021

Parameterized Channel Normalization for Far-field Deep Speaker Verification

We address far-field speaker verification with deep neural network (DNN)...
research
07/19/2016

Trainable Frontend For Robust and Far-Field Keyword Spotting

Robust and far-field speech recognition is critical to enable true hands...
research
08/30/2021

Multi-Channel Transformer Transducer for Speech Recognition

Multi-channel inputs offer several advantages over single-channel, to im...
research
03/31/2021

TS-RIR: Translated synthetic room impulse responses for speech augmentation

We present a method for improving the quality of synthetic room impulse ...
research
09/30/2022

Blind Signal Dereverberation for Machine Speech Recognition

We present a method to remove unknown convolutive noise introduced to sp...
research
02/06/2021

Sound Event Detection in Urban Audio With Single and Multi-Rate PCEN

Recent literature has demonstrated that the use of per-channel energy no...
