ReZero: Region-customizable Sound Extraction

08/31/2023
by   Rongzhi Gu, et al.

We introduce region-customizable sound extraction (ReZero), a general and flexible framework for the multi-channel region-wise sound extraction (R-SE) task. The R-SE task aims to extract all active target sounds (e.g., human speech) within a specific, user-defined spatial region, in contrast to existing tasks, which typically assume either blind separation or a fixed, predefined spatial region. The spatial region can be defined as an angular window, a sphere, a cone, or another geometric pattern. As a solution to the R-SE task, the proposed ReZero framework includes (1) definitions of different types of spatial regions, (2) methods for region feature extraction and aggregation, and (3) a multi-channel extension of the band-split RNN (BSRNN) model tailored to the R-SE task. We design experiments covering different microphone array geometries and different types of spatial regions, along with comprehensive ablation studies on different system configurations. Experimental results on both simulated and real-recorded data demonstrate the effectiveness of ReZero. Demos are available at https://innerselfm.github.io/rezero/.
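To make the region types named in the abstract concrete, here is a minimal Python sketch of membership tests for an angular window, a sphere, and a cone. The abstract does not give the paper's parameterization, so everything here is an assumption for illustration: the function names, the choice of degrees and Cartesian coordinates, and the placement of the microphone array at the origin. It only shows what "a source lies inside a user-defined region" could mean geometrically, not how ReZero extracts or aggregates region features.

```python
import math

def in_angular_window(source_xy, center_deg, width_deg):
    """Hypothetical test: is the source azimuth inside an angular window?

    The array is assumed to sit at the origin; the window spans
    width_deg degrees centered on center_deg.
    """
    azimuth = math.degrees(math.atan2(source_xy[1], source_xy[0]))
    # Wrap the angular difference into [-180, 180) so windows that
    # straddle the +/-180 degree boundary are handled correctly.
    diff = (azimuth - center_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= width_deg / 2.0

def in_sphere(source_xyz, center_xyz, radius):
    """Hypothetical test: is the source within `radius` of the sphere center?"""
    return math.dist(source_xyz, center_xyz) <= radius

def in_cone(source_xyz, axis_xyz, half_angle_deg):
    """Hypothetical test: is the source inside a cone with its apex at the origin?

    The cone opens along axis_xyz with the given half-angle.
    """
    norm_s = math.hypot(*source_xyz)
    norm_a = math.hypot(*axis_xyz)
    if norm_s == 0.0 or norm_a == 0.0:
        return False  # direction is undefined for a zero-length vector
    cos_angle = sum(s * a for s, a in zip(source_xyz, axis_xyz)) / (norm_s * norm_a)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= half_angle_deg
```

Under these assumptions, a source at (1, 1) falls inside a 30-degree window centered at 45 degrees, while a source at (0, 1) falls outside a 60-degree window centered at 0 degrees.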
