Efficient Low-Latency Speech Enhancement with Mobile Audio Streaming Networks

08/17/2020
by Michał Romaniuk, et al.

We propose Mobile Audio Streaming Networks (MASnet) for efficient low-latency speech enhancement. MASnet is particularly suited to mobile devices and other applications where computational capacity is limited. It processes linear-scale spectrograms, transforming successive noisy frames into complex-valued ratio masks, which are then applied to the respective noisy frames. MASnet can operate in a low-latency incremental inference mode whose complexity matches that of layer-by-layer batch mode. Compared to a similar fully-convolutional architecture, MASnet incorporates depthwise and pointwise convolutions for a large reduction in fused multiply-accumulate operations per second (FMA/s), at the cost of some reduction in SNR.
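The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The mask values, channel counts, and kernel size are hypothetical and chosen only for illustration; in MASnet the mask would be predicted by the network. The first function applies a complex-valued ratio mask to a noisy spectrogram frame by element-wise complex multiplication. The other two count multiply-accumulates per output position for a standard 1-D convolution and for a depthwise convolution followed by a pointwise (1x1) convolution.

```python
import numpy as np


def apply_complex_mask(noisy_frame, mask):
    """Apply a complex ratio mask to one noisy spectrogram frame.

    Both arguments are complex arrays with one entry per frequency bin.
    The enhanced frame is their element-wise complex product.
    """
    return mask * noisy_frame


def standard_conv_macs(c_in, c_out, kernel_size):
    """Multiply-accumulates per output position for a standard 1-D convolution."""
    return c_in * c_out * kernel_size


def separable_conv_macs(c_in, c_out, kernel_size):
    """Multiply-accumulates per output position for a depthwise convolution
    (one filter per input channel) followed by a pointwise 1x1 convolution."""
    depthwise = c_in * kernel_size
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical two-bin frame and hand-picked mask (illustrative values only).
noisy = np.array([1 + 1j, 2 + 0j])
mask = np.array([0.5 + 0j, 0 - 1j])
enhanced = apply_complex_mask(noisy, mask)

# Hypothetical layer size: 64 input channels, 64 output channels, kernel of 9.
standard = standard_conv_macs(64, 64, 9)
separable = separable_conv_macs(64, 64, 9)
```

With these illustrative sizes the separable layer needs 4,672 multiply-accumulates per output position against 36,864 for the standard layer, which shows the kind of saving the abstract refers to. The actual figures for MASnet depend on its layer configuration, which is given in the full text.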

research
11/03/2022

Iterative autoregression: a novel trick to improve your low-latency speech enhancement model

Streaming models are an essential component of real-time speech enhancem...
research
05/23/2020

Exploring the Best Loss Function for DNN-Based Low-latency Speech Enhancement with Temporal Convolutional Networks

Recently, deep neural networks (DNNs) have been successfully used for sp...
research
09/29/2019

FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing

Beamforming has been extensively investigated for multi-channel audio pr...
research
07/07/2021

SoundStream: An End-to-End Neural Audio Codec

We present SoundStream, a novel neural audio codec that can efficiently ...
research
11/08/2022

Cross-Attention is all you need: Real-Time Streaming Transformers for Personalised Speech Enhancement

Personalised speech enhancement (PSE), which extracts only the speech of...
research
10/21/2020

Real-time Speech Frequency Bandwidth Extension

In this paper we propose a lightweight model for frequency bandwidth ext...
research
11/14/2020

Communication-Cost Aware Microphone Selection For Neural Speech Enhancement with Ad-hoc Microphone Arrays

When performing multi-channel speech enhancement with a wireless acousti...
