Exploring Tradeoffs in Models for Low-latency Speech Enhancement

11/16/2018
by Kevin Wilson, et al.

We explore a variety of neural network configurations for one- and two-channel spectrogram-mask-based speech enhancement. Our best model improves on previous state-of-the-art performance on the CHiME2 speech enhancement task by 0.4 decibels in signal-to-distortion ratio (SDR). We examine trade-offs such as non-causal look-ahead, computation, and parameter count versus enhancement performance, and find that zero-look-ahead models can achieve, on average, within 0.03 dB SDR of our best bidirectional model. Further, we find that 200 milliseconds of look-ahead is sufficient to achieve performance equivalent to our best bidirectional model.
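The two ingredients of the abstract, applying a mask per time-frequency bin of a spectrogram and scoring the result in SDR, can be sketched as follows. This is a minimal NumPy illustration, not the paper's code: the STFT parameters, the oracle (ideal ratio) mask standing in for a network's estimate, and the toy tone-plus-noise signal are all assumptions.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB: 10*log10(|s|^2 / |s - s_hat|^2)."""
    err = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(err**2))

# Toy example: a clean 440 Hz tone corrupted by white noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.3 * rng.standard_normal(len(t))

# Windowed STFT of the noisy and clean signals (assumed parameters).
n_fft, hop = 512, 256
win = np.hanning(n_fft)
starts = range(0, len(noisy) - n_fft + 1, hop)
S_noisy = np.fft.rfft(np.stack([noisy[i:i + n_fft] * win for i in starts]), axis=-1)
S_clean = np.fft.rfft(np.stack([clean[i:i + n_fft] * win for i in starts]), axis=-1)

# A spectrogram mask is applied per time-frequency bin. Here we use an
# oracle ratio mask for illustration; the paper's networks estimate a mask
# of this kind from the noisy input alone.
mask = np.clip(np.abs(S_clean) / np.maximum(np.abs(S_noisy), 1e-8), 0.0, 1.0)
S_enh = mask * S_noisy  # enhanced (masked) spectrogram, noisy phase kept

# Weighted overlap-add resynthesis of the enhanced waveform.
enh = np.zeros_like(noisy)
norm = np.zeros_like(noisy)
for k, frame in enumerate(np.fft.irfft(S_enh, n=n_fft, axis=-1)):
    enh[k * hop:k * hop + n_fft] += frame * win
    norm[k * hop:k * hop + n_fft] += win**2
valid = norm > 0.25  # compare only where frames fully overlap
enh[valid] /= norm[valid]

print(f"noisy SDR:    {sdr_db(clean[valid], noisy[valid]):.1f} dB")
print(f"enhanced SDR: {sdr_db(clean[valid], enh[valid]):.1f} dB")
```

Even this oracle mask, which upper-bounds what a mask-estimating network can do with the noisy phase, recovers the clean tone well enough to raise SDR by several dB over the noisy input.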


Related research

09/07/2023
Causal Signal-Based DCCRN with Overlapped-Frame Prediction for Online Speech Enhancement
The aim of speech enhancement is to improve speech signal quality and in...

05/23/2020
Exploring the Best Loss Function for DNN-Based Low-latency Speech Enhancement with Temporal Convolutional Networks
Recently, deep neural networks (DNNs) have been successfully used for sp...

11/03/2022
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Streaming models are an essential component of real-time speech enhancem...

02/26/2023
DFSNet: A Steerable Neural Beamformer Invariant to Microphone Array Configuration for Real-Time, Low-Latency Speech Enhancement
Invariance to microphone array configuration is a rare attribute in neur...

12/11/2018
R3-DLA (Reduce, Reuse, Recycle): A More Efficient Approach to Decoupled Look-Ahead Architectures
Modern societies have developed insatiable demands for more computation ...

11/06/2018
Kernel Machines Beat Deep Neural Networks on Mask-based Single-channel Speech Enhancement
We apply a fast kernel method for mask-based single-channel speech enhan...

11/03/2021
Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators
We explore network sparsification strategies with the aim of compressing...
