Stereo Speech Enhancement Using Custom Mid-Side Signals and Monaural Processing

11/25/2022
by Aaron Master, et al.

Speech enhancement (SE) systems typically operate on monaural input and are used in applications such as voice communications and capture cleanup for user-generated content. Recent advances and changes in the devices used for these applications are likely to increase the amount of two-channel content in the same applications. However, SE systems are typically designed for monaural input; stereo results produced with trivial methods such as channel-independent or mid-side processing may be unsatisfactory and can include substantial speech distortions. To address this, we propose a system that creates a novel representation of stereo signals called Custom Mid-Side Signals (CMSS). CMSS allow the benefits of mid-side signals for center-panned speech to be extended to a much larger class of input signals. This in turn allows any existing monaural SE system to operate as an efficient stereo system by processing the custom mid signal. We describe how the parameters needed for CMSS can be efficiently estimated by a component of the spatio-level filtering source separation system. Subjective listening tests using state-of-the-art deep learning-based SE systems on stereo content with various speech mixing styles show that CMSS processing leads to improved speech quality at approximately half the cost of channel-independent processing.
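For orientation, the conventional mid-side baseline that the abstract contrasts with CMSS can be sketched as follows. This is a minimal illustration of ordinary mid-side processing only, not the CMSS construction, whose parameters the abstract does not specify. The function names, the `enhance_mono` callback (a stand-in for any monaural SE system), and the `side_gain` parameter are all hypothetical and chosen for this example.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]


def ms_encode(left: Sequence[float], right: Sequence[float]) -> Tuple[Signal, Signal]:
    """Conventional mid-side encoding: mid = (L + R) / 2, side = (L - R) / 2."""
    if len(left) != len(right):
        raise ValueError("channels must have equal length")
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid: Sequence[float], side: Sequence[float]) -> Tuple[Signal, Signal]:
    """Inverse of ms_encode: L = mid + side, R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def enhance_stereo_via_mid(
    left: Sequence[float],
    right: Sequence[float],
    enhance_mono: Callable[[Signal], Sequence[float]],
    side_gain: float = 1.0,
) -> Tuple[Signal, Signal]:
    """Run a monaural enhancer once, on the mid signal only.

    `enhance_mono` is a placeholder for any monaural SE system.
    `side_gain` is an assumption of this sketch: the abstract does not
    say how the side signal should be treated.
    """
    mid, side = ms_encode(left, right)
    mid_enhanced = list(enhance_mono(mid))
    side_out = [side_gain * s for s in side]
    return ms_decode(mid_enhanced, side_out)


# Center-panned speech (identical in both channels) lands entirely in mid.
speech = [0.5, -0.25, 0.125]
mid, side = ms_encode(speech, speech)
```

Because the enhancer runs once on the mid signal rather than once per channel, this arrangement is consistent with the roughly halved cost the abstract reports relative to channel-independent processing. When speech is not center-panned, part of it leaks into the side signal, which is the limitation the abstract says CMSS is designed to address.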


