Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition

06/17/2021
by   Xiong Wang, et al.

End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, Transformer and Conformer have achieved state-of-the-art recognition accuracy, in which self-attention plays a vital role in capturing global information. However, the time and memory complexity of self-attention grows quadratically with the sentence length. In this paper, a prob-sparse self-attention mechanism is introduced into Conformer to sparsify the computation of self-attention in order to accelerate inference and reduce memory consumption. Specifically, we adopt a Kullback-Leibler divergence based sparsity measurement for each query to decide whether the attention function is computed on that query. By using the prob-sparse attention mechanism, we achieve impressively 8% to 45% inference speed-up and 15% to 45% memory usage reduction of the self-attention module of the Conformer Transducer while maintaining the same level of error rate.
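As a rough illustration of the idea (not the authors' exact implementation), the sketch below applies an Informer-style prob-sparse step to a single attention head: each query gets a KL-divergence-derived sparsity score (log-sum-exp minus mean of its attention logits), only the top-scoring queries receive the full attention function, and the remaining positions fall back to the mean of the values. The function name `prob_sparse_attention` and the `top_frac` ratio are illustrative assumptions; a practical implementation would also estimate the score from a sampled subset of keys to keep the cost sub-quadratic.

```python
import torch

def prob_sparse_attention(q, k, v, top_frac=0.25):
    """Illustrative prob-sparse self-attention (single head, assumed API).

    q, k, v: (batch, seq_len, d) tensors.
    Only queries with the largest KL-derived sparsity scores get full
    attention; the other output positions are filled with the mean of v.
    """
    b, n, d = q.shape
    scores = torch.einsum("bnd,bmd->bnm", q, k) / d ** 0.5      # (b, n, n)

    # Sparsity measurement per query, derived from the KL divergence between
    # its attention distribution and the uniform distribution:
    #   M(q_i, K) = logsumexp_j(score_ij) - mean_j(score_ij)
    # (the paper's method; here computed on all keys for clarity, whereas a
    # real implementation samples keys to estimate it cheaply).
    m = torch.logsumexp(scores, dim=-1) - scores.mean(dim=-1)   # (b, n)
    u = max(1, int(top_frac * n))
    top_idx = m.topk(u, dim=-1).indices                         # (b, u)

    # "Lazy" queries: use the mean of the values as their output.
    out = v.mean(dim=1, keepdim=True).expand(b, n, d).clone()

    # "Active" queries: compute the full attention function.
    sel = top_idx.unsqueeze(-1)                                 # (b, u, 1)
    top_scores = torch.gather(scores, 1, sel.expand(b, u, n))   # (b, u, n)
    attn = torch.softmax(top_scores, dim=-1)
    out.scatter_(1, sel.expand(b, u, d), attn @ v)              # (b, n, d)
    return out

# Example: a 200-frame sequence with model dimension 256.
x = torch.randn(2, 200, 256)
print(prob_sparse_attention(x, x, x).shape)  # torch.Size([2, 200, 256])
```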


