Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition

03/23/2023
by Haoyu Tang, et al.

Transformer-based models have recently achieved significant results in end-to-end (E2E) automatic speech recognition (ASR), and they make it possible to deploy E2E ASR systems on smart devices. However, these models still require a large number of parameters. To overcome this drawback of universal Transformer models for ASR on edge devices, we propose reusing blocks within the Transformer to build a small-footprint ASR system that accommodates resource limitations without compromising recognition accuracy. Specifically, we design a novel block-reusing strategy for speech Transformer (BRST) to improve parameter efficiency, and we propose an adapter module (ADM) that yields a compact and adaptable model with only a few additional trainable parameters accompanying each reused block. We evaluate the proposed method on the public AISHELL-1 corpus, and the results show that it achieves a character error rate (CER) of 9.3%. We also provide a deeper analysis of the effect of the ADM in the general block-reusing method.
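To make the parameter argument concrete, the following is a rough back-of-the-envelope sketch, not the authors' implementation. It compares the parameter count of a stack of distinct Transformer encoder blocks with that of one block reused at every depth plus a small per-depth adapter. The abstract does not specify the ADM's internal structure, so the bottleneck adapter (down-projection followed by up-projection) is an assumption, as are all sizes (`d_model=256`, `d_ff=2048`, `depth=12`, `bottleneck=32`) and the function names.

```python
# Hypothetical sizes and a hypothetical bottleneck adapter; the abstract
# does not give the ADM design or the model dimensions.

def block_params(d_model: int, d_ff: int) -> int:
    """Parameters of one standard Transformer encoder block (with biases)."""
    attention = 4 * (d_model * d_model + d_model)    # Q, K, V, output projections
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    layer_norms = 2 * (2 * d_model)                  # two LayerNorms: gain and bias
    return attention + feed_forward + layer_norms

def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameters of an assumed bottleneck adapter: down- then up-projection."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up

def stacked_params(depth: int, d_model: int, d_ff: int) -> int:
    """Conventional stack: every depth owns its own block."""
    return depth * block_params(d_model, d_ff)

def reused_params(depth: int, d_model: int, d_ff: int, bottleneck: int) -> int:
    """One shared block applied at every depth, plus one adapter per depth."""
    return block_params(d_model, d_ff) + depth * adapter_params(d_model, bottleneck)

if __name__ == "__main__":
    d_model, d_ff, depth, bottleneck = 256, 2048, 12, 32
    print("distinct blocks:", stacked_params(depth, d_model, d_ff))
    print("shared block + adapters:", reused_params(depth, d_model, d_ff, bottleneck))
```

Under these assumed sizes, the adapters add only a small amount on top of the single shared block, which is the trade-off the abstract describes: most weights are shared across depths, and each reuse keeps a few trainable parameters of its own.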


