Sampling Frequency Independent Dialogue Separation

06/05/2022
by   Jouni Paulus, et al.

In some DNNs for audio source separation, the relevant model parameters are independent of the sampling frequency of the audio used for training. Considering the application of dialogue separation, this is shown for two DNN architectures: a U-Net and a fully-convolutional model. The models are trained with audio sampled at 8 kHz, and the learned parameters are then transferred to models that process audio at 48 kHz. The separated audio sources are compared with those produced by the same model architectures trained with 48 kHz versions of the same training data. A listening test and computational measures show no significant perceptual difference between the models trained at 8 kHz and those trained at 48 kHz. This transferability of the learned parameters allows for faster and computationally less costly training. It also enables the use of training datasets available at a lower sampling frequency than the one needed by the application at hand, or of data collections with multiple sampling frequencies.
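One general property that makes this kind of parameter transfer possible is that a convolution kernel has a fixed size regardless of the size of its input. The following is a minimal sketch of that property only, not the authors' models or training setup. The abstract does not say how the 8 kHz and 48 kHz input representations relate, so the sketch assumes that the 48 kHz input is a spectrogram with six times as many frequency bins. The kernel values, the helper names (conv2d_same, make_spectrogram), and the toy data are all hypothetical.

```python
def conv2d_same(x, kernel):
    """2-D cross-correlation with zero padding; the output has the shape of x."""
    rows, cols = len(x), len(x[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - pad_r, c + j - pad_c
                    # Positions outside the input count as zero (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[i][j] * x[rr][cc]
            out[r][c] = acc
    return out


def make_spectrogram(num_bins, num_frames):
    """Deterministic toy data standing in for a magnitude spectrogram."""
    return [[((b * 7 + t * 3) % 11) / 10.0 for t in range(num_frames)]
            for b in range(num_bins)]


# Hypothetical 3x3 kernel standing in for parameters learned at 8 kHz.
kernel = [[0.0, 0.25, 0.0],
          [0.25, 0.0, 0.25],
          [0.0, 0.25, 0.0]]

# Assumption: the higher-rate input has 6x as many frequency bins (rows).
spec_low = make_spectrogram(num_bins=5, num_frames=6)    # stands in for 8 kHz
spec_high = make_spectrogram(num_bins=30, num_frames=6)  # stands in for 48 kHz

# The same kernel is applied unchanged to both input sizes.
out_low = conv2d_same(spec_low, kernel)
out_high = conv2d_same(spec_high, kernel)

print(len(out_low), len(out_high))
```

In this toy setup the low-rate spectrogram coincides with the lowest five bins of the high-rate one, so the two outputs agree on every bin that does not touch the upper edge of the low-rate input. Whether a real separation model behaves equivalently across sampling frequencies is the empirical question the abstract addresses with its listening test and computational measures.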

