Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure

06/14/2023
by Weidong Ji, et al.

End-to-end deep learning speech recognition models often generalize poorly. To address this, this study proposes "Conformer-R", a Conformer-based speech recognition model that incorporates the R-Drop structure. The Conformer, which has shown promising results in speech recognition, models both local and global speech information, while the R-Drop structure reduces overfitting. Together these strengthen the model's ability to generalize and improve overall recognition efficiency. The model was first pre-trained on the AISHELL-1 and WenetSpeech datasets for general-domain adaptation and then fine-tuned on computer-related audio data. Comparison tests against classic models such as LAS and WeNet, run on the same test set, indicate that Conformer-R improves generalization.
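
The abstract does not say how the R-Drop term is computed or weighted in Conformer-R, so the following is only a minimal sketch of the generally published R-Drop objective: the same input goes through the network twice with independent dropout masks, and a symmetric KL term between the two output distributions is added to the usual negative log-likelihood. The toy one-layer classifier `toy_forward`, its weights, and the weight `alpha` are hypothetical stand-ins for illustration, not the authors' Conformer or their settings.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def toy_forward(features, weights, drop_prob, rng):
    """Hypothetical one-layer classifier with inverted dropout on its inputs."""
    keep = 1.0 - drop_prob
    dropped = [f / keep if rng.random() < keep else 0.0 for f in features]
    return [sum(w * x for w, x in zip(row, dropped)) for row in weights]


def r_drop_loss(logits_a, logits_b, target, alpha=1.0):
    """NLL of both passes plus alpha times the symmetric KL between them."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    nll = -math.log(p_a[target]) - math.log(p_b[target])
    sym_kl = 0.5 * (kl_divergence(p_a, p_b) + kl_divergence(p_b, p_a))
    return nll + alpha * sym_kl


# Two stochastic passes over the same input (assumed toy data).
rng = random.Random(0)
features = [0.5, -1.2, 0.8, 0.3]
weights = [[0.2, -0.1, 0.4, 0.0],
           [-0.3, 0.5, 0.1, 0.2],
           [0.1, 0.1, -0.2, 0.3]]
logits_a = toy_forward(features, weights, 0.1, rng)
logits_b = toy_forward(features, weights, 0.1, rng)
loss = r_drop_loss(logits_a, logits_b, target=2, alpha=1.0)
```

When the two passes agree exactly, the KL term vanishes and the loss reduces to twice the ordinary negative log-likelihood; the more the two dropout sub-models disagree, the larger the penalty, which is the mechanism by which R-Drop is meant to curb overfitting.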
