Inter-channel Conv-TasNet for multichannel speech enhancement

11/08/2021
by   Dongheon Lee, et al.
0

Speech enhancement in multichannel settings has been realized by utilizing the spatial information embedded in multiple microphone signals. Moreover, deep neural networks (DNNs) have been recently advanced in this field; however, studies on the efficient multichannel network structure fully exploiting spatial information and inter-channel relationships is still in its early stages. In this study, we propose an end-to-end time-domain speech enhancement network that can facilitate the use of inter-channel relationships at individual layers of a DNN. The proposed technique is based on a fully convolutional time-domain audio separation network (Conv-TasNet), originally developed for speech separation tasks. We extend Conv-TasNet into several forms that can handle multichannel input signals and learn inter-channel relationships. To this end, we modify the encoder-mask-decoder structures of the network to be compatible with 3-D tensors defined over spatial channels, features, and time dimensions. In particular, we conduct extensive parameter analyses on the convolution structure and propose independent assignment of the depthwise and 1×1 convolution layers to the feature and spatial dimensions, respectively. We demonstrate that the enriched inter-channel information from the proposed network plays a significant role in suppressing noisy signals impinging from various directions. The proposed inter-channel Conv-TasNet outperforms the state-of-the-art multichannel variants of neural networks, even with one-tenth of their parameter size. The performance of the proposed model is evaluated using the CHiME-3 dataset, which exhibits a remarkable improvement in SDR, PESQ, and STOI.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 7

11/04/2020

DESNet: A Multi-channel Network for Simultaneous Speech Dereverberation, Enhancement and Separation

In this paper, we propose a multi-channel network for simultaneous speec...
02/13/2021

Multi-Channel Speech Enhancement using Graph Neural Networks

Multi-channel speech enhancement aims to extract clean speech from a noi...
06/24/2021

SRIB-LEAP submission to Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing

This paper presents the details of the SRIB-LEAP submission to the Confe...
07/23/2021

Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model

We propose a multi-channel speech enhancement approach with a novel two-...
10/25/2021

Multichannel Speech Enhancement without Beamforming

Deep neural networks are often coupled with traditional spatial filters,...
04/28/2020

Neural Speech Separation Using Spatially Distributed Microphones

This paper proposes a neural network based speech separation method usin...
03/09/2020

Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.