Complexity Scaling for Speech Denoising

09/14/2023
by   Hangting Chen, et al.

Computational complexity is critical when deploying deep learning-based speech denoising models for on-device applications. Most prior research has focused on optimizing model architectures to meet specific computational cost constraints, often producing distinct neural network architectures for different complexity budgets. This study investigates complexity scaling for speech denoising, aiming to consolidate models of various complexities into a unified architecture. We present a Multi-Path Transform-based (MPT) architecture that handles both low- and high-complexity scenarios. A series of MPT networks achieves high performance across a wide range of computational complexities on the DNS challenge dataset. Moreover, inspired by scaling experiments in natural language processing, we explore the empirical relationship between model performance and computational cost on the denoising task. As the complexity, measured in multiply-accumulate operations (MACs), is scaled from 50M/s to 15G/s on MPT networks, we observe that PESQ-WB and SI-SNR increase linearly with the logarithm of MACs, a finding that might contribute to the understanding and application of complexity scaling in speech denoising tasks.
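
The log-linear relationship described in the abstract can be illustrated with a small fitting sketch. The code below is not the authors' method or data: the function names (`fit_log_linear`, `predict`), the coefficients `TRUE_A` and `TRUE_B`, and the score values are all hypothetical placeholders, generated synthetically so that the fit has something to recover. Only the MACs range (50M/s to 15G/s) and the assumed form, score ≈ a + b · log10(MACs/s), come from the text.

```python
import math


def fit_log_linear(macs_per_s, scores):
    """Ordinary least-squares fit of score ~ a + b * log10(MACs/s).

    Returns the intercept a and slope b. This is a generic fit, assumed
    here for illustration; the paper's own fitting procedure is not
    described in the abstract.
    """
    xs = [math.log10(m) for m in macs_per_s]
    n = len(xs)
    if n < 2 or len(scores) != n:
        raise ValueError("need at least two (MACs, score) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all MACs values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, macs_per_s):
    """Score predicted by the fitted line at a given MACs/s budget."""
    return a + b * math.log10(macs_per_s)


# SYNTHETIC placeholder values, NOT results from the paper.
# The scores lie exactly on a line in log10(MACs) by construction.
TRUE_A, TRUE_B = -1.0, 0.4
budgets = [50e6, 200e6, 1e9, 4e9, 15e9]  # MACs/s, spanning the range in the text
synthetic_scores = [TRUE_A + TRUE_B * math.log10(m) for m in budgets]

a, b = fit_log_linear(budgets, synthetic_scores)

# Under a log-linear law, every doubling of MACs adds the same increment.
gain_per_doubling = b * math.log10(2)
```

Under this assumed form, the slope `b` is the metric gain per tenfold increase in MACs, and `gain_per_doubling` is the constant gain per doubling, which is what makes such a fit useful for estimating the return on a larger compute budget.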

research
11/09/2022

MFDNet: Towards Real-time Image Denoising On Mobile Devices

Deep convolutional neural networks have achieved great progress in image...
research
05/30/2023

MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models

Self-supervised learning (SSL) is a popular research topic in speech pro...
research
01/29/2018

On Psychoacoustically Weighted Cost Functions Towards Resource-Efficient Deep Neural Networks for Speech Denoising

We present a psychoacoustically enhanced cost function to balance networ...
research
05/31/2019

Increasing Compactness Of Deep Learning Based Speech Enhancement Models With Parameter Pruning And Quantization Techniques

Most recent studies on deep learning based speech enhancement (SE) focus...
research
06/30/2021

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Single-channel speech enhancement (SE) is an important task in speech pr...
research
11/20/2019

Fast and Flexible Image Blind Denoising via Competition of Experts

Fast and flexible processing are two essential requirements for a number...
research
08/11/2020

Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems

LPCNet is an efficient vocoder that combines linear prediction and deep ...
