SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning

04/08/2022
by Marc Delcroix, et al.

In many situations, we would like to hear desired sound events (SEs) while ignoring interference. Target sound extraction (TSE) tackles this problem by estimating the sound of target SE classes in a mixture while suppressing all other sounds. This can be achieved with a neural network that extracts the target SEs when conditioned on clues representing the target SE classes. Two types of clues have been proposed: target SE class labels and enrollment sound samples similar to the target sound. Systems based on SE class labels can directly optimize embedding vectors representing the SE classes, resulting in high extraction performance. However, these systems are difficult to extend to new SE classes not encountered during training. Enrollment-based approaches extract SEs by finding sounds in the mixture that share characteristics with the enrollment sample. These approaches do not rely explicitly on SE class definitions and can thus handle new SE classes. In this paper, we introduce a TSE framework, SoundBeam, that combines the advantages of both approaches. We also perform an extensive evaluation of the different TSE schemes using synthesized and real mixtures, which shows the potential of SoundBeam.
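
The two clue types described above can be illustrated with a toy sketch. This is not the SoundBeam architecture: the abstract does not specify how the network is conditioned, so the element-wise scaling used here, the frame averaging that stands in for a learned enrollment encoder, and all names (`label_clue`, `enrollment_clue`, `condition`, `class_table`) are assumptions made purely for illustration.

```python
from typing import Dict, List

Vector = List[float]


def label_clue(table: Dict[str, Vector], label: str) -> Vector:
    """Class-label clue: look up the embedding of a class seen in training.

    A label missing from the table raises KeyError, which mirrors the
    difficulty of handling new classes with label-based systems.
    """
    return table[label]


def enrollment_clue(frames: List[Vector]) -> Vector:
    """Enrollment clue: derive an embedding from an example sound.

    Averaging over frames is a hypothetical stand-in for a learned encoder.
    """
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def condition(mixture: List[Vector], clue: Vector) -> List[Vector]:
    """Scale every mixture frame element-wise by the clue embedding.

    Element-wise scaling is an assumed conditioning mechanism.
    """
    return [[x * c for x, c in zip(frame, clue)] for frame in mixture]


# Toy data: 2 frames of 3-dimensional features (hypothetical values).
class_table = {"dog_bark": [1.0, 0.0, 0.5], "siren": [0.0, 1.0, 0.5]}
mixture = [[0.2, 0.4, 0.6], [1.0, 2.0, 3.0]]
enrollment = [[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

# Both clue types produce an embedding of the same size, so the same
# conditioning step can consume either one.
from_label = condition(mixture, label_clue(class_table, "dog_bark"))
from_enrollment = condition(mixture, enrollment_clue(enrollment))
```

Because both clues reduce to a vector of the same dimension, a single extraction network could in principle accept either, which is one plausible reading of how the advantages of the two approaches might be combined.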

research
06/14/2021

Few-shot learning of new sound classes for target sound extraction

Target sound extraction consists of extracting the sound of a target aco...
research
08/31/2023

ReZero: Region-customizable Sound Extraction

We introduce region-customizable sound extraction (ReZero), a general an...
research
04/02/2022

Improving Target Sound Extraction with Timestamp Information

Target sound extraction (TSE) aims to extract the sound part of a target...
research
02/01/2022

New Insights on Target Speaker Extraction

In recent years, researchers have become increasingly interested in spea...
research
03/08/2022

Locate This, Not That: Class-Conditioned Sound Event DOA Estimation

Existing systems for sound event localization and detection (SELD) typic...
research
06/10/2020

Listen to What You Want: Neural Network-based Universal Sound Selector

Being able to control the acoustic events (AEs) to which we want to list...
research
03/22/2005

Semi-automatic vectorization of linear networks on rasterized cartographic maps

A system for semi-automatic vectorization of linear networks (roads, riv...
