A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation

09/05/2023
by   Karn N. Watcharasupat, et al.

Cinematic audio source separation is a relatively new subtask of audio source separation that aims to extract the dialogue, music, and effects stems from their mixture. In this work, we developed a model that generalizes the Bandsplit RNN to any complete or overcomplete partition of the frequency axis. Psychoacoustically motivated frequency scales were used to inform the band definitions, which are now defined with redundancy for more reliable feature extraction. A loss function motivated by the signal-to-noise ratio and the sparsity-promoting property of the 1-norm was proposed. We additionally exploit the information-sharing property of a common-encoder setup to reduce computational complexity during both training and inference, improve separation performance for hard-to-generalize classes of sounds, and allow flexibility at inference time through easily detachable decoders. Our best model sets the state of the art on the Divide and Remaster dataset, with performance above the ideal ratio mask for the dialogue stem.
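
The abstract does not give the loss in closed form, so the following is only a minimal sketch of one way an SNR-motivated loss could use the 1-norm. It assumes a log error-to-signal ratio in decibels with both norms replaced by sums of absolute values. The function name `l1_snr_loss`, the `eps` stabilizer, and the exact form of the ratio are illustrative assumptions, not the definition from the full text.

```python
import math


def l1_snr_loss(estimate, target, eps=1e-8):
    """Hypothetical SNR-style loss built on the 1-norm (an assumption).

    Returns 10 * log10(||estimate - target||_1 / ||target||_1), so lower
    values mean a better estimate. `eps` guards against log(0) and
    division by zero for silent targets.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")

    # 1-norm of the residual: sum of absolute sample-wise errors.
    error_l1 = sum(abs(e - t) for e, t in zip(estimate, target))
    # 1-norm of the reference signal.
    target_l1 = sum(abs(t) for t in target)

    return 10.0 * math.log10((error_l1 + eps) / (target_l1 + eps))


if __name__ == "__main__":
    reference = [1.0, -1.0, 0.5, 0.0]
    close_estimate = [0.9 * x for x in reference]
    silent_estimate = [0.0 for _ in reference]
    print(l1_snr_loss(close_estimate, reference))
    print(l1_snr_loss(silent_estimate, reference))
```

In this sketch, an estimate that is closer to the reference in absolute-error terms gets a more negative value, which is the direction a minimizer would push. An absolute-error numerator penalizes large residuals less heavily than a squared-error one, which is one way to read the sparsity-promoting motivation mentioned above.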


research
07/14/2021

Multi-Task Audio Source Separation

The audio source separation tasks, such as speech enhancement, speech se...
research
05/10/2021

Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method

Audio source separation is often used as preprocessing of various applic...
research
04/17/2018

The 2018 Signal Separation Evaluation Campaign

This paper reports the organization and results for the 2018 community-b...
research
06/05/2022

Sampling Frequency Independent Dialogue Separation

In some DNNs for audio source separation, the relevant model parameters ...
research
10/23/2020

GSEP: A robust vocal and accompaniment separation system using gated CBHG module and loudness normalization

In the field of audio signal processing research, source separation has ...
research
06/29/2017

Multi-scale Multi-band DenseNets for Audio Source Separation

This paper deals with the problem of audio source separation. To handle ...
research
04/08/2019

Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets

Modern audio source separation techniques rely on optimizing sequence mo...
