Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate

07/21/2021
by   Matteo Torcoli, et al.

Remixing separated audio sources trades off interferer attenuation against the amount of audible deteriorations. This paper proposes a non-intrusive audio quality estimation method for controlling this trade-off in a signal-adaptive manner. The recently proposed 2f-model is adopted as the underlying quality measure, since it has been shown to correlate strongly with basic audio quality in source separation. An alternative operation mode of the measure is proposed, more appropriate when considering material with long inactive periods of the target source. The 2f-model requires the reference target source as an input, but this is not available in many applications. Deep neural networks (DNNs) are trained to estimate the 2f-model intrusively using the reference target (iDNN2f), non-intrusively using the input mix as reference (nDNN2f), and reference-free using only the separated output signal (rDNN2f). It is shown that iDNN2f achieves very strong correlation with the original measure on the test data (Pearson r=0.99), while performance decreases for nDNN2f (r>=0.91) and rDNN2f (r>=0.82). The non-intrusive estimate nDNN2f is mapped to select item-dependent remixing gains with the aim of maximizing the interferer attenuation under a constraint on the minimum quality of the remixed output (e.g., audible but not annoying deteriorations). A listening test shows that this is successfully achieved even with very different selected gains (up to 23 dB difference).
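The constrained gain selection described above can be illustrated with a minimal sketch. It is not the paper's mapping from nDNN2f to gains. It assumes a hypothetical setup in which a quality estimate is already available for each of several candidate interferer attenuations, and it picks the strongest attenuation whose estimated quality still meets a chosen minimum. The function name, the candidate grid, and the 0 to 100 quality scale are assumptions made for illustration.

```python
from typing import Sequence


def select_attenuation_db(
    attenuations_db: Sequence[float],
    quality_estimates: Sequence[float],
    min_quality: float,
) -> float:
    """Pick the largest interferer attenuation that keeps quality acceptable.

    attenuations_db   -- hypothetical candidate attenuations (dB, larger = stronger)
    quality_estimates -- estimated quality of the remix for each candidate
                         (assumed scale: higher is better)
    min_quality       -- minimum acceptable quality of the remixed output
    """
    if len(attenuations_db) == 0 or len(attenuations_db) != len(quality_estimates):
        raise ValueError("need one quality estimate per candidate attenuation")

    # Keep only the candidates that satisfy the quality constraint.
    admissible = [
        a for a, q in zip(attenuations_db, quality_estimates) if q >= min_quality
    ]

    # Maximize attenuation under the constraint; if no candidate is
    # admissible, fall back to the mildest attenuation (an assumption).
    return max(admissible) if admissible else min(attenuations_db)


# Illustrative numbers only, not results from the paper.
chosen = select_attenuation_db([0, 6, 12, 18], [90, 75, 62, 40], min_quality=60)
print(chosen)
```

Because the quality estimate depends on the item, two items evaluated against the same minimum quality can end up with very different attenuations, which is the signal-adaptive behavior the abstract refers to.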


research
11/01/2018

Referenceless Performance Evaluation of Audio Source Separation using Deep Neural Networks

Current performance evaluation for audio source separation depends on co...
research
10/23/2019

Model selection for deep audio source separation via clustering analysis

Audio source separation is the process of separating a mixture (e.g. a p...
research
06/05/2022

Sampling Frequency Independent Dialogue Separation

In some DNNs for audio source separation, the relevant model parameters ...
research
07/22/2021

Controlling the Perceived Sound Quality for Dialogue Enhancement with Deep Learning

Speech enhancement attenuates interfering sounds in speech signals but m...
research
12/22/2014

Audio Source Separation Using a Deep Autoencoder

This paper proposes a novel framework for unsupervised audio source sepa...
research
06/01/2020

Similarity-and-Independence-Aware Beamformer: Method for Target Source Extraction using Magnitude Spectrogram as Reference

This study presents a novel method called the similarity-and-independenc...
research
05/30/2023

Predicting Preferred Dialogue-to-Background Loudness Difference in Dialogue-Separated Audio

Dialogue Enhancement (DE) enables the rebalancing of dialogue and backgr...
