Recursive Visual Sound Separation Using Minus-Plus Net

08/30/2019
by Xudong Xu, et al.

Sounds provide rich semantics, complementary to visual data, for many tasks. In practice, however, sounds from multiple sources are often mixed together. In this paper, we propose a novel framework, referred to as MinusPlus Network (MP-Net), for the task of visual sound separation. MP-Net separates sounds recursively in order of average energy, removing each separated sound from the mixture after its prediction, until the mixture becomes empty or contains only noise. In this way, MP-Net can be applied to sound mixtures with arbitrary numbers and types of sounds. Moreover, as MP-Net keeps removing high-energy sounds from the mixture, low-energy sounds can emerge and become clearer, which makes their separation more accurate. Compared to previous methods, MP-Net obtains state-of-the-art results on two large-scale datasets, across mixtures with different types and numbers of sounds.
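
The recursive removal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `predict_dominant` is a hypothetical stand-in for the learned, visually guided model, `toy_predict_dominant` is a toy oracle that treats each frequency row as one source, and the `energy_floor` stopping threshold and `max_steps` cap are assumptions chosen for the example.

```python
import numpy as np


def recursive_separate(mixture, predict_dominant, energy_floor=1e-3, max_steps=10):
    """Peel sources off a magnitude spectrogram, highest energy first.

    predict_dominant is a hypothetical callable standing in for the trained
    model: given the current residual, it returns an estimate of the
    highest-energy source still present.
    """
    residual = mixture.astype(float).copy()
    separated = []
    for _ in range(max_steps):
        # Stop once the residual is (nearly) empty; the threshold is an assumption.
        if np.mean(residual ** 2) < energy_floor:
            break
        estimate = predict_dominant(residual)
        separated.append(estimate)
        # Remove the separated sound from the mixture before the next pass.
        residual = np.clip(residual - estimate, 0.0, None)
    return separated, residual


def toy_predict_dominant(residual):
    """Toy oracle: each frequency row is one source; return the loudest row."""
    band_energy = np.mean(residual ** 2, axis=1)
    loudest = int(np.argmax(band_energy))
    estimate = np.zeros_like(residual)
    estimate[loudest] = residual[loudest]
    return estimate


# Toy mixture: 3 frequency rows x 4 time frames, one constant-amplitude source per row.
mixture = np.array([[0.5] * 4, [2.0] * 4, [1.0] * 4])
separated, residual = recursive_separate(mixture, toy_predict_dominant)
energies = [float(np.mean(s ** 2)) for s in separated]
print(len(separated), energies)
```

In this toy setting the loop recovers the sources from loudest to quietest, and the quietest row is only picked once the louder ones have been subtracted, which mirrors the intuition that low-energy sounds become easier to isolate as the mixture is emptied.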
