Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

02/03/2020
by Jun Qi, et al.

We propose a tensor-to-vector regression approach to multi-channel speech enhancement that addresses the explosion of input size and the expansion of hidden-layer size. The key idea is to cast the conventional deep neural network (DNN) based vector-to-vector regression formulation under a tensor-train network (TTN) framework. TTN is a recently proposed solution for compactly representing deep models with fully connected hidden layers. It thus maintains the DNN's expressive power while requiring far fewer trainable parameters. Furthermore, TTN can handle a multi-dimensional tensor input by design, which exactly matches the desired setting in multi-channel speech enhancement. We first provide a theoretical extension from DNN to TTN based regression. Next, we show that TTN can attain speech enhancement quality comparable to that of a DNN with far fewer parameters; e.g., a reduction from 27 million to only 5 million parameters is observed in a single-channel scenario. TTN also improves PESQ over DNN from 2.86 to 2.96 by slightly increasing the number of trainable parameters. Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06. Our implementation is available online at https://github.com/uwjunqi/Tensor-Train-Neural-Network.
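To make the parameter savings concrete, the following is a minimal sketch of how the weight count of a fully connected layer compares with the count for the same layer stored as tensor-train cores. The function names, mode sizes, and ranks are hypothetical and chosen only for illustration; they are not taken from the paper or its repository. The sketch assumes the usual tensor-train layout in which core k has shape (r_{k-1}, m_k, n_k, r_k) with boundary ranks equal to 1.

```python
from math import prod


def dense_param_count(in_modes, out_modes):
    """Weights in an ordinary fully connected layer (bias ignored)."""
    return prod(in_modes) * prod(out_modes)


def tt_param_count(in_modes, out_modes, ranks):
    """Weights when the same matrix is stored as tensor-train cores.

    Core k is assumed to have shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]).
    """
    if len(in_modes) != len(out_modes):
        raise ValueError("in_modes and out_modes must have the same length")
    if len(ranks) != len(in_modes) + 1:
        raise ValueError("ranks must have one more entry than the modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary ranks must both be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical layer: 1024 inputs and 1024 outputs, each factored as 4*8*8*4.
in_modes = (4, 8, 8, 4)
out_modes = (4, 8, 8, 4)
ranks = (1, 4, 4, 4, 1)  # illustrative tensor-train ranks

dense = dense_param_count(in_modes, out_modes)
tt = tt_param_count(in_modes, out_modes, ranks)
print(f"dense: {dense} weights, tensor-train: {tt} weights")
```

With these illustrative settings the dense layer holds 1,048,576 weights and the tensor-train version holds 2,176. Larger ranks narrow that gap in exchange for more capacity, which is the kind of trade-off the abstract describes when it mentions slightly increasing the number of trainable parameters.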

research
07/25/2020

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

This paper investigates different trade-offs between the number of model...
research
12/01/2022

Deep neural network techniques for monaural speech enhancement: state of the art analysis

Deep neural networks (DNN) techniques have become pervasive in domains s...
research
12/25/2018

Tensor-Train Long Short-Term Memory for Monaural Speech Enhancement

In recent years, Long Short-Term Memory (LSTM) has become a popular choi...
research
03/11/2022

Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing

This work focuses on designing low complexity hybrid tensor networks by ...
research
02/24/2022

Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge

This paper describes our submission to the L3DAS22 Challenge Task 1, whi...
research
02/14/2020

Stable Training of DNN for Speech Enhancement based on Perceptually-Motivated Black-Box Cost Function

Improving subjective sound quality of enhanced signals is one of the mos...
research
06/01/2023

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

We propose a multi-dimensional structured state space (S4) approach to s...
