Trainable Adaptive Window Switching for Speech Enhancement

11/05/2018
by   Yuma Koizumi, et al.

This study proposes a trainable adaptive window switching (AWS) method and applies it to a deep neural network (DNN) for speech enhancement in the modified discrete cosine transform (MDCT) domain. Time-frequency (T-F) mask processing in the short-time Fourier transform (STFT) domain is a typical speech enhancement method. To recover the target signal more precisely, DNN-based short-time frequency transforms have recently been investigated as replacements for the STFT. However, any such fixed-resolution short-time frequency transform is subject to the T-F resolution trade-off imposed by the uncertainty principle, so the length of the windowing function should be optimized along with the transform itself. To overcome this problem, we incorporate AWS into the speech enhancement procedure: a DNN selects the windowing function of each time frame depending on the input signal. We confirmed that the proposed method achieved a higher signal-to-distortion ratio than conventional speech enhancement methods that operate in fixed-resolution frequency domains.
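To make the fixed-resolution trade-off concrete, the following is a minimal NumPy sketch of a sine-windowed MDCT with a T-F mask applied to the coefficients. It is not the authors' implementation and contains no trainable window switching or DNN. The function names (`sine_window`, `mdct`, `imdct`, `resolution`), the sine window, the 16 kHz sampling rate, and the all-ones mask are assumptions chosen for illustration. The parameter `half` is the hop size, so each frame spans `2 * half` samples and yields `half` coefficients.

```python
import numpy as np


def sine_window(frame_len):
    """Sine window; satisfies the Princen-Bradley condition w[n]^2 + w[n+N]^2 = 1."""
    n = np.arange(frame_len)
    return np.sin(np.pi * (n + 0.5) / frame_len)


def mdct_basis(half):
    """Cosine basis of shape (half, 2 * half) for a frame of 2 * half samples."""
    n = np.arange(2 * half)
    k = np.arange(half)
    return np.cos(np.pi / half * np.outer(k + 0.5, n + 0.5 + half / 2))


def mdct(x, half):
    """Windowed MDCT with 50% overlap; returns an array of shape (n_frames, half)."""
    w = sine_window(2 * half)
    n_frames = len(x) // half - 1
    frames = np.stack([x[i * half:i * half + 2 * half] for i in range(n_frames)])
    return (frames * w) @ mdct_basis(half).T


def imdct(X, half):
    """Inverse MDCT with synthesis windowing and overlap-add."""
    w = sine_window(2 * half)
    frames = (2.0 / half) * (X @ mdct_basis(half)) * w
    out = np.zeros((X.shape[0] + 1) * half)
    for i, frame in enumerate(frames):
        out[i * half:i * half + 2 * half] += frame
    return out


def resolution(half, fs):
    """Return (hop in seconds, coefficient spacing in Hz) for a given hop size."""
    return half / fs, fs / (2 * half)


fs = 16000                      # assumed sampling rate
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)   # stand-in for a noisy speech signal

for half in (64, 512):          # short window vs. long window
    X = mdct(x, half)
    mask = np.ones_like(X)      # placeholder mask; a DNN would estimate this
    x_hat = imdct(mask * X, half)
    hop_s, bin_hz = resolution(half, fs)
    # Interior samples are reconstructed; the first and last `half` samples
    # are covered by only one frame and remain aliased.
    err = np.max(np.abs(x_hat[half:-half] - x[half:-half]))
    print(f"half={half}: hop={hop_s * 1e3:.1f} ms, spacing={bin_hz:.3f} Hz, max err={err:.1e}")
```

In this sketch the product of the hop duration and the coefficient spacing is the same for every `half`, so finer frequency resolution always costs coarser time resolution. That fixed trade-off is what motivates choosing the window per frame rather than once for the whole signal.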


research
11/25/2019

Invertible DNN-based nonlinear time-frequency transform for speech enhancement

We propose an end-to-end speech enhancement method with trainable time-f...
research
02/20/2020

Efficient Trainable Front-Ends for Neural Speech Enhancement

Many neural speech enhancement and source separation systems operate in ...
research
05/23/2020

Exploring the Best Loss Function for DNN-Based Low-latency Speech Enhancement with Temporal Convolutional Networks

Recently, deep neural networks (DNNs) have been successfully used for sp...
research
09/02/2015

Enhancement and Recognition of Reverberant and Noisy Speech by Extending Its Coherence

Most speech enhancement algorithms make use of the short-time Fourier tr...
research
02/14/2020

Consistency-aware multi-channel speech enhancement using deep neural networks

This paper proposes a deep neural network (DNN)-based multi-channel spee...
research
05/21/2019

DNN-Based Multi-Frame MVDR Filtering for Single-Microphone Speech Enhancement

Multi-frame approaches for single-microphone speech enhancement, e.g., t...
research
04/15/2022

Improving Frame-Online Neural Speech Enhancement with Overlapped-Frame Prediction

Frame-online speech enhancement systems in the short-time Fourier transf...
