Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

07/25/2020
by Jun Qi, et al.

This paper investigates the trade-off between the number of model parameters and enhanced speech quality across several deep tensor-to-vector regression models for speech enhancement. We find that a hybrid architecture, CNN-TT, maintains good speech quality with a reduced number of parameters. CNN-TT consists of several convolutional layers at the bottom, which extract features and improve speech quality, and a tensor-train (TT) output layer on top, which reduces the number of model parameters. We first derive a new upper bound on the generalization power of convolutional neural network (CNN) based vector-to-vector regression models. We then provide experimental evidence on the Edinburgh noisy speech corpus showing that, in single-channel speech enhancement, CNN outperforms DNN at the expense of a small increase in model size. Moreover, CNN-TT slightly outperforms its CNN counterpart while using only 32% of the CNN model parameters, and performance improves further when the number of CNN-TT parameters is increased to 44% of the CNN model size. Finally, our multi-channel speech enhancement experiments on a simulated noisy WSJ0 corpus demonstrate that the proposed hybrid CNN-TT architecture outperforms both DNN and CNN models, yielding better enhanced speech quality with fewer parameters.
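
To illustrate why a tensor-train output layer can shrink a model, the sketch below compares the weight count of a dense layer with that of a TT-factorized layer of the same input and output size. It assumes the standard TT-matrix format, in which core k has shape (r_{k-1}, m_k, n_k, r_k) with boundary ranks equal to 1. The mode sizes and ranks are hypothetical values chosen for illustration and are not taken from the paper; biases are ignored. The 32% and 44% figures in the abstract refer to whole-model sizes, not to this toy layer.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Weights in a dense layer mapping prod(in_modes) -> prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Weights in a TT-matrix layer with cores of shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1])."""
    if len(in_modes) != len(out_modes):
        raise ValueError("in_modes and out_modes must have the same length")
    if len(ranks) != len(in_modes) + 1:
        raise ValueError("ranks must have one more entry than the modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must both be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical factorization (not from the paper):
# 1024 inputs = 4 * 8 * 8 * 4, 256 outputs = 4 * 4 * 4 * 4.
in_modes = (4, 8, 8, 4)
out_modes = (4, 4, 4, 4)
ranks = (1, 4, 4, 4, 1)

dense = dense_params(in_modes, out_modes)
tt = tt_params(in_modes, out_modes, ranks)

print(f"dense layer weights: {dense}")
print(f"TT layer weights:    {tt}")
print(f"TT / dense ratio:    {tt / dense:.4%}")
```

Larger TT ranks increase the parameter count and the expressiveness of the layer, so the ranks act as the knob for the size-versus-quality trade-off the abstract describes.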


research
02/03/2020

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

We propose a tensor-to-vector regression approach to multi-channel speec...
research
06/09/2020

A fully recurrent feature extraction for single channel speech enhancement

Convolutional neural network (CNN) modules are widely being used to buil...
research
01/11/2022

Exploiting Hybrid Models of Tensor-Train Networks for Spoken Command Recognition

This work aims to design a low complexity spoken command recognition (SC...
research
03/11/2022

Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing

This work focuses on designing low complexity hybrid tensor networks by ...
research
07/28/2023

PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

Convolutional neural networks (CNN) and Transformer have wildly succeede...
research
07/03/2019

Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients

Attempts to develop speech enhancement algorithms with improved speech i...
research
05/06/2021

Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU

Single channel speech enhancement is a challenging task in speech commun...
