End-to-End Multi-Channel Transformer for Speech Recognition

02/08/2021
by   Feng-Ju Chang, et al.

Transformers are powerful neural architectures that allow integrating different modalities using attention mechanisms. In this paper, we leverage neural transformer architectures for multi-channel speech recognition systems, where the spectral and spatial information collected from different microphones is integrated using attention layers. Our multi-channel transformer network consists of three main parts: channel-wise self-attention (CSA) layers, cross-channel attention (CCA) layers, and multi-channel encoder-decoder attention (EDA) layers. The CSA layers encode contextual relationships within each channel and the CCA layers encode them between channels, in both cases across time. The channel-attended outputs from CSA and CCA are then fed into the EDA layers to help decode the next token given the preceding ones. Experiments on a far-field in-house dataset show that our method outperforms the baseline single-channel transformer, as well as super-directive and neural beamformers cascaded with transformers.
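To make the three attention stages concrete, the following is a minimal NumPy sketch of how CSA, CCA, and EDA could be wired together. It is an illustration under stated assumptions, not the authors' implementation: it uses single-head scaled dot-product attention with no learned projections, residual connections, or layer normalization, and the function names, the choice to concatenate the other channels' frames as keys and values in CCA, and the averaging over channels in EDA are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention.
    # q: (Tq, d), k and v: (Tk, d) -> output: (Tq, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def channel_wise_self_attention(x):
    # x: (C, T, d). Each channel attends over its own time frames only.
    return np.stack([attention(xc, xc, xc) for xc in x])

def cross_channel_attention(h):
    # h: (C, T, d), with C >= 2. Queries come from channel c; keys and
    # values are the frames of all other channels (an assumed design).
    num_channels = h.shape[0]
    out = []
    for c in range(num_channels):
        others = np.concatenate(
            [h[j] for j in range(num_channels) if j != c], axis=0
        )
        out.append(attention(h[c], others, others))
    return np.stack(out)

def encoder_decoder_attention(dec, enc):
    # dec: (U, d) decoder states; enc: (C, T, d) encoder outputs.
    # Attend to each channel separately, then average the resulting
    # contexts over channels (an assumed way of combining them).
    return np.mean([attention(dec, e, e) for e in enc], axis=0)

# Toy shapes: 2 microphones, 5 frames, 3 decoder positions, 8 features.
rng = np.random.default_rng(0)
C, T, U, d = 2, 5, 3, 8
x = rng.standard_normal((C, T, d))
h = cross_channel_attention(channel_wise_self_attention(x))
ctx = encoder_decoder_attention(rng.standard_normal((U, d)), h)
print(h.shape, ctx.shape)  # (2, 5, 8) (3, 8)
```

In a trained model each stage would use learned query, key, and value projections and multiple heads; the sketch keeps only the data flow between channels, time frames, and decoder positions.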

research
02/10/2020

End-to-End Multi-speaker Speech Recognition with Transformer

Recently, fully recurrent neural network (RNN) based end-to-end models h...
research
03/11/2023

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

Training stability is of great importance to Transformers. In this work,...
research
08/30/2021

Multi-Channel Transformer Transducer for Speech Recognition

Multi-channel inputs offer several advantages over single-channel, to im...
research
12/15/2021

EEG-Transformer: Self-attention from Transformer Architecture for Decoding EEG of Imagined Speech

Transformers are groundbreaking architectures that have changed a flow o...
research
02/19/2022

Multi-Channel FFT Architectures Designed via Folding and Interleaving

Computing the FFT of a single channel is well understood in the literatu...
research
02/21/2023

DasFormer: Deep Alternating Spectrogram Transformer for Multi/Single-Channel Speech Separation

For the task of speech separation, previous study usually treats multi-c...
research
06/23/2023

The Double Helix inside the NLP Transformer

We introduce a framework for analyzing various types of information in a...
